We will go through a systematic procedure for troubleshooting switched networks. It will help us identify port connectivity problems, VLAN and trunking problems, and VTP or spanning tree issues.
Troubleshooting a switching network is not an exact science; rather, it is an art form. You tend to need an intuition for which part of the network might be having a problem. Here are some general suggestions to make troubleshooting more effective.
First, you must have a good understanding of the switch operations that you have configured. To troubleshoot your switching network, you have to know what services you have enabled on your network switches. Second, we need an accurate physical and logical diagram of our switching network, so that we can clearly see how things are interconnected physically as well as logically and trace the network path. Third, when troubleshooting, we should have a plan rather than jumping all over the place.
Personally, I follow the OSI seven-layer model. I first go to layer 1 and make sure that the physical layer is working properly before I proceed to layer 2 to ensure that the data link is working properly. If the local link is working properly, I then check that this link can communicate with other networks that are not directly connected. Finally, we do not assume anything. We do not assume that any of the basic components are working correctly without testing them first, because some other administrator may have changed the configuration without updating the network documentation. Also, do not take users' feedback at face value, because users sometimes give misleading information.
Many years ago, when I was troubleshooting a network, a user told me that the network was not working, so I assumed the link was down. It turned out that what he meant was that the link was very slow. A switching network that is very slow and a switching network that is not connected require different types of checking.
Troubleshooting Port Connectivity
One of the things that you can check when troubleshooting port connectivity is that the correct cable type is used. This is a minor problem these days, because most cables within an organization are typically Cat 5e. It was only during the transitional period years ago that Cat 3 and Cat 5 cables were used at the same time within an organization, so that if you accidentally plugged a Cat 3 cable into a 100 Mbps link, the cable would not be able to support 100 Mbps connectivity. You should also check that the cable length does not exceed 100 meters. Again, this situation is pretty rare these days, because many organizations buy tested, prefabricated cables, whereas in the old days we tended to fabricate our own.
Another thing that we can check for port connectivity issues is that the port VLAN membership is correct. If two devices are assigned to different VLANs, they will never be able to talk to each other. We also have to check whether the user port has been shut down, either deliberately by a security service such as port security after a security breach, or accidentally by an administrator. So even if the VLAN is correct, we have to go to the port and use the show interface command to ensure that the port is up and running.
Lastly, we have to make sure that the duplex settings on both ends of the link match. With a duplex mismatch you do not lose the connection; rather, you get a very slow connection. The full-duplex side sends data at the same time as it is receiving data. The half-duplex side practices CSMA/CD: whenever it sees incoming data while it is transmitting, it treats that as a collision, backs off, and tries again later. The full-duplex side never backs off, so on a busy link many frames in both directions are damaged or delayed. It is only during the quiet moments, when one side is not speaking, that the other side's frames get through cleanly.
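As a rough illustration of the duplex check, the sketch below compares the settings recorded for the two ends of a link. The `PortSettings` type and the `find_mismatches` function are hypothetical names introduced here for illustration; in practice the values would be read from the show interface output on each switch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PortSettings:
    """Settings for one end of a link (hypothetical structure, for illustration)."""
    name: str
    duplex: str       # "full" or "half"
    speed_mbps: int


def find_mismatches(a: PortSettings, b: PortSettings) -> List[str]:
    """Return a readable list of settings that differ between the two ends."""
    problems = []
    if a.duplex != b.duplex:
        problems.append(f"duplex mismatch: {a.name}={a.duplex}, {b.name}={b.duplex}")
    if a.speed_mbps != b.speed_mbps:
        problems.append(f"speed mismatch: {a.name}={a.speed_mbps}, {b.name}={b.speed_mbps}")
    return problems


if __name__ == "__main__":
    sw1 = PortSettings("SW1 Fa0/1", "full", 100)
    sw2 = PortSettings("SW2 Fa0/1", "half", 100)
    for problem in find_mismatches(sw1, sw2):
        print(problem)
```

An empty list means the two ends agree, so a slow link would have to be explained by something else.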
Troubleshooting VLANs and Trunks
One of the problems that we might encounter is a native VLAN mismatch. The native VLAN is the VLAN whose frames are sent untagged on an 802.1Q trunk. Both ends of a trunk must agree on the same native VLAN, so a common VLAN must be given this native VLAN role throughout the switching network. If there is a mismatch, untagged traffic from VLAN 1 on one switch may wind up being forwarded into native VLAN 2 on another switch, so that frames end up in the wrong VLAN.
Another possible trunking problem is a trunk mode mismatch. Cisco Catalyst switches use the Dynamic Trunking Protocol (DTP) to automatically negotiate a trunk link. But DTP sometimes has problems negotiating trunking between older and newer Cisco switches. So one recommendation is to statically configure a trunk link whenever possible. Or, as a compromise, you might configure trunking statically on one end and let the other end negotiate to match the hard-coded trunking properties.
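To make the two trunk checks concrete, here is a minimal sketch that flags a native VLAN mismatch and a mode combination that will not form a trunk. The mode names (access, trunk, desirable, auto) are the usual Cisco switchport modes rather than something named in this lesson, and the negotiation rule is a simplification that assumes DTP has not been disabled on either end. `TrunkEnd`, `both_ends_trunk`, and `check_trunk` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

MODES = {"access", "trunk", "desirable", "auto"}
ACTIVE = {"trunk", "desirable"}            # ends that actively ask for a trunk
WILLING = {"trunk", "desirable", "auto"}   # ends that can become a trunk


@dataclass(frozen=True)
class TrunkEnd:
    """One end of an inter-switch link (hypothetical structure)."""
    switch: str
    mode: str
    native_vlan: int = 1


def both_ends_trunk(mode_a: str, mode_b: str) -> bool:
    """Simplified rule: both ends must be willing and at least one must ask."""
    for mode in (mode_a, mode_b):
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
    willing = mode_a in WILLING and mode_b in WILLING
    return willing and (mode_a in ACTIVE or mode_b in ACTIVE)


def check_trunk(a: TrunkEnd, b: TrunkEnd) -> List[str]:
    """Return the trunk problems found between the two ends of a link."""
    problems = []
    if not both_ends_trunk(a.mode, b.mode):
        problems.append(f"no trunk: {a.switch} is {a.mode}, {b.switch} is {b.mode}")
    if a.native_vlan != b.native_vlan:
        problems.append(
            f"native VLAN mismatch: {a.switch}={a.native_vlan}, {b.switch}={b.native_vlan}"
        )
    return problems
```

Two ends left in auto mode both wait for the other side to ask, which is one reason a statically configured trunk is the safer choice.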
Frames are forwarded within a VLAN by looking at the destination MAC address. But to go from one VLAN to another, traffic has to pass through a router. At the same time, the router does not route on these layer 2 properties. So, to allow the router to route from one VLAN to another, we associate each layer 2 VLAN with a unique layer 3 IP subnet, so that the router can route from one IP subnet to another, which indirectly routes from one VLAN to another. So if two VLANs are given the same IP subnet, we will have trouble routing between these VLANs.
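The one-subnet-per-VLAN rule is easy to check mechanically. The sketch below uses Python's standard ipaddress module to find VLANs whose subnets overlap; the function name `overlapping_vlan_subnets` and the sample addressing plan are made up for illustration.

```python
import ipaddress
from itertools import combinations
from typing import Dict, List, Tuple


def overlapping_vlan_subnets(vlan_subnets: Dict[int, str]) -> List[Tuple[int, int]]:
    """Return pairs of VLAN IDs whose IP subnets overlap.

    vlan_subnets maps a VLAN ID to its subnet in CIDR notation.
    """
    nets = {vlan: ipaddress.ip_network(cidr) for vlan, cidr in vlan_subnets.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


if __name__ == "__main__":
    # Hypothetical addressing plan: VLAN 30 reuses the subnet of VLAN 10.
    plan = {10: "192.168.10.0/24", 20: "192.168.20.0/24", 30: "192.168.10.0/24"}
    print(overlapping_vlan_subnets(plan))
```

Any pair reported here is a pair of VLANs that the router cannot tell apart at layer 3.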
Troubleshooting VTP
One thing to note when troubleshooting VTP is that, after we have configured the VLAN and VTP information, show run will not display it, because on a switch in VTP server or client mode this information is stored in a separate file called vlan.dat on the Cisco switch. To view this configuration, we use the commands show vtp status and show vlan.
Now, consider VTP issues when we insert a new switch into the network. Suppose we accidentally insert a used switch that happens to have the same VTP domain name as our live network. By default the switch is in server mode, and it may hold a lot of VLAN information with a much higher revision number. What happens is that all the client switches in the live environment start learning from this incorrect VTP server. Each client switch discards all the old VLAN information it had learned and adopts the VLAN information from the new VTP server. But this new VTP server may be the wrong server, teaching wrong VLAN information to the clients. As such, all the existing users associated with the existing VLANs no longer have any VLANs to associate with, and their ports turn amber.
Another similar situation is when you reboot a client switch and it loses all its VLAN database information, so that when it first boots up, it does not know of any VLAN except VLAN 1. All the users that have been associated with a VLAN other than VLAN 1 will be inactive, in an amber state, because their ports no longer have a VLAN to be associated with.
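The used-switch scenario can be modelled in a few lines. This is a deliberately simplified sketch, not the real protocol: it ignores passwords, VTP modes, and versions, and only captures the rule that a switch in the same domain adopts a VLAN database with a higher revision number. `VtpSwitch` and `receive_advertisement` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class VtpSwitch:
    """Toy model of a switch's VTP state (hypothetical structure)."""
    name: str
    domain: str
    revision: int
    vlans: Set[int] = field(default_factory=lambda: {1})


def receive_advertisement(receiver: VtpSwitch, sender: VtpSwitch) -> bool:
    """Overwrite the receiver's VLANs if the sender's database is newer.

    Returns True when the receiver's database was replaced.
    """
    if receiver.domain != sender.domain:
        return False              # different domain: advertisement ignored
    if sender.revision <= receiver.revision:
        return False              # not newer: keep what we have
    receiver.vlans = set(sender.vlans)
    receiver.revision = sender.revision
    return True


if __name__ == "__main__":
    client = VtpSwitch("live-client", "CORP", revision=5, vlans={1, 10, 20})
    used = VtpSwitch("used-switch", "CORP", revision=42, vlans={1, 99})
    receive_advertisement(client, used)
    print(client.vlans)   # VLANs 10 and 20 are gone
```

This is why it is common practice to reset the revision number of a used switch before connecting it to a live network.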
There are several reasons why VTP fails to exchange the VLAN information. Here are some possibilities that you can check out:
- Make sure that all the ports that interconnect switches are trunk ports, because VTP advertisements are only sent over trunk links.
- Make sure that the server switches have all the required VLANs configured, so that they can disseminate the VLAN information.
- Make sure that the switching environment has at least one VTP server to propagate VLAN information to the clients.
- When configuring your VTP domain name and password, take note that both are case-sensitive.
- When you run VTP on the switching network, make sure that all switches run the same VTP version.
- Verify the domain name and VTP version on the transparent switch.
- If you are running VTP version 1 on a transparent switch, it will only relay VTP information between an external client and an external server if they use the same domain name as the transparent switch.
- If you are running VTP version 2 on the transparent switch, it will relay any VTP advertisements that it hears, regardless of domain name. So the external VTP domain name does not have to match that of the transparent switch when running VTP version 2.
- Be aware that VTP versions 1 and 2 do not propagate extended-range VLAN information (VLANs 1006 to 4094).
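Several items on this checklist come down to comparing a handful of settings between two switches. A minimal sketch, assuming the settings have already been collected into plain dictionaries (for example from show vtp status); the function name and the dictionary keys are hypothetical.

```python
from typing import Dict, List


def vtp_sync_problems(
    server: Dict[str, object],
    client: Dict[str, object],
    link_is_trunk: bool,
) -> List[str]:
    """Return reasons why VTP may fail to exchange VLAN information."""
    problems = []
    if not link_is_trunk:
        problems.append("link between switches is not a trunk")
    # Plain equality is used on purpose: domain names and passwords
    # are case-sensitive, so "Corp" and "corp" do not match.
    for key in ("domain", "password", "version"):
        if server[key] != client[key]:
            problems.append(f"{key} differs: {server[key]!r} vs {client[key]!r}")
    return problems


if __name__ == "__main__":
    srv = {"domain": "Corp", "password": "s3cret", "version": 2}
    cli = {"domain": "corp", "password": "s3cret", "version": 2}
    for problem in vtp_sync_problems(srv, cli, link_is_trunk=True):
        print(problem)
```

The example catches the kind of mistake that is hard to spot by eye: a domain name that differs only in capitalization.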
Troubleshooting Spanning Tree
When troubleshooting spanning tree, one thing that we should have is a good network diagram that shows which switch has been elected as the root bridge, as well as which links are being blocked by the spanning tree protocol to create a loop-free environment.
You pray that you don't encounter a bridging loop, but if you do, the first thing you will notice is that broadcast utilization is very, very high, because there is an endless broadcast storm. The second thing you will notice is that the network gets slower and slower, and eventually you might lose connectivity because of so much looped traffic going around the network. Lastly, a personal observation: if you look at the physical switches, you will see that all the port LEDs are blinking very quickly in unison.
To stop a bridging loop quickly, one thing we can do is shut down all the ports on the core switches, and then bring them up one by one to see which links cause the bridging loop. We can also use a debugging command, such as debug spanning-tree events, to get more insight into what is happening in the switch that is encountering the bridging loop. To avoid spanning tree protocol problems, we should not let STP pick just any switch to become the root; rather, we should manually influence which switch takes on the role of root bridge. Lastly, we should make sure that our switches are running rapid spanning tree for faster convergence.
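To see why the root should be chosen deliberately, it helps to look at how the election works: the switch with the lowest bridge ID wins, where the bridge ID is the priority followed by the MAC address. The sketch below is a simplified model of that comparison; the switch names, MAC addresses, and the function `elect_root` are invented for illustration.

```python
from typing import Dict, Tuple


def elect_root(bridges: Dict[str, Tuple[int, str]]) -> str:
    """Return the name of the switch with the lowest bridge ID.

    bridges maps a switch name to (priority, MAC address string).
    Lowest priority wins; a tie is broken by the lowest MAC address.
    """
    def bridge_id(item):
        _name, (priority, mac) = item
        digits = mac.replace(":", "").replace("-", "").replace(".", "")
        return (priority, int(digits, 16))

    return min(bridges.items(), key=bridge_id)[0]


if __name__ == "__main__":
    # With default priorities, the switch with the lowest MAC address wins,
    # which may well be an old access switch rather than the core.
    bridges = {
        "core": (32768, "00:1a:00:00:00:30"),
        "old-access": (32768, "00:01:00:00:00:10"),
    }
    print(elect_root(bridges))

    # Lowering the priority on the core switch makes it the root.
    bridges["core"] = (4096, "00:1a:00:00:00:30")
    print(elect_root(bridges))
```

Left at default priorities, the result depends only on MAC addresses, which is the accident we want to avoid by setting the priority on the intended root.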