Cisco - Trunks Going Down

Spanning tree will not cross a layer-3 boundary, so it's likely that you have a root bridge at each location.

Spanning tree is pretty good at preventing forwarding loops, but if you don't properly select your root bridge then you can get some sub-optimal traffic patterns.

ASKER

So if fa01 is my root port at this site does that mean that my root bridge is either the top left switch or the bottom left switch? So should I change the trunk going from switch 1 (top right) to the bottom right switch to be coming from the bottom left switch instead or doesn't that matter?

SOLUTION

ASKER

Is that for VLAN 1 only? If so, the only switch that doesn't have root ports for VLAN 1 is the top left switch.

ASKER

Hopefully this will help. I think there is a loop that gets caused in there somewhere because it takes the network down until I shut and no shut the trunks or restart the switches.

All trunks are configured like this:
switchport trunk encapsulation dot1q
switchport mode trunk

Based on that diagram, you can't have any forwarding loops.

Your SW2 is the root bridge (as asavener already said, on the root bridge all ports are designated ports). Did you check your interfaces and logs to gather more information?
#sh interface Gi0/1
Check if there is something interesting there.

Is your interface getting error-disabled?
If so, you can set a timer for automatic err-disable recovery, but you still need to investigate the cause.
Next time, before you shut/no shut the interface, check its status to see whether it was error-disabled:
#sh interface gi0/1 status

Port    Name               Status       Vlan       Duplex  Speed Type
Gi0/1                      err-disabled 100          full   1000 1000BaseSX
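
For the auto-recovery timer mentioned above, a minimal sketch on a Catalyst IOS switch might look like the following. The `Switch` prompt is just the default hostname, the 300-second interval is only an illustrative value, and `cause all` is a deliberately broad assumption that you may want to narrow once you know the real cause:

```
! Global configuration: let err-disabled ports re-enable themselves
Switch(config)# errdisable recovery cause all
Switch(config)# errdisable recovery interval 300
!
! Exec mode: list err-disabled ports with the reason, then the recovery timers
Switch# show interfaces status err-disabled
Switch# show errdisable recovery
```

Auto-recovery only hides the symptom, so it is still worth noting the reason shown for the port before it comes back.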

How often are the ports becoming unresponsive?

ASKER

It happens once or twice a week. The last time it happened was yesterday, and the time before that was exactly a week ago. Before that it was on Thursdays or Fridays, so there's no consistent pattern to when it happens.

Predrag, I will make sure I check the port status next time it happens. Unfortunately I don't have much time to spend troubleshooting because it takes down our call center.

What steps do you take for recovery?

ASKER

I either make a change to the trunk and write it or I just do a shut/no shut on the trunks and it comes back up.

SOLUTION

pgolding00

ASKER

The link to the ISP is trunked to a switch that is owned by us at the ISP and is directly connected to our core switch at our data center location.
The logs at the time of the incident show line protocol is down for the interfaces. I'll take a print screen next time so I can be certain how many and which ones are going down.

All of our devices are Cisco devices so that makes it easier.

Excellent info though and I will try that next time it happens. (Not that I want it to happen again) :-)

ASKER

This happened again this morning...so that makes 3 Tuesday mornings in a row.
There weren't any blocked ports or anything when I checked show span sum.

However, this is new in the logs from this morning.

%SPANTREE-5-ROOTCHANGE: Root Changed for vlan 30: New Root Port is FastEthernet0/45. New Root Mac Address is 001a.2ff1.6d80
%SPANTREE-5-ROOTCHANGE: Root Changed for vlan 5: New Root Port is FastEthernet0/45. New Root Mac Address is 001a.2ff1.6d80
%LINK-3-UPDOWN: Interface GigabitEthernet0/1, changed state to up
%LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/1, changed state to up
%SPANTREE-5-ROOTCHANGE: Root Changed for vlan 81: New Root Port is GigabitEthernet0/1. New Root Mac Address is 0006.d71b.e617
%SPANTREE-5-ROOTCHANGE: Root Changed for vlan 70: New Root Port is GigabitEthernet0/1. New Root Mac Address is 0006.d71b.e614
%SPANTREE-5-TOPOTRAP: Topology Change Trap for vlan 1

So I'm guessing that after these topology changes it's taking a while to converge?
Why is the root changing every week, and what do I need to look for?

SOLUTION

ASKER

Another switch was put in over the weekend at that location. That's the reason for the topology change, so false alarm on that one.

Back to the original issue. I've been checking through VTP and STP because I'm still leaning that way for the cause of the issue. All of the switches there are in VTP Client Mode, would it be best to just put them in Transparent Mode?

That's a good one.
:)
But anyway, if you have not manually configured a root bridge yet, you should do it.
It would be better if the switches were in transparent mode (it is more secure), but having all switches in client mode is also fine as long as there is no switch in server mode (which is the default, by the way).
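
As a rough sketch of setting the root bridge manually: the VLAN list below is simply the VLANs that show up in the logs earlier in this thread (1, 5, 30, 70, 81) and should be replaced with your own, the `Switch` prompt is the default hostname, and the choice of which switch gets which role is an assumption on my part.

```
! On the switch that should be root (assumed here: the core switch)
Switch(config)# spanning-tree vlan 1,5,30,70,81 root primary
!
! Optionally, on a second switch that should take over if the root fails
Switch(config)# spanning-tree vlan 1,5,30,70,81 root secondary
!
! From any switch: root ID, root cost and root port per VLAN
Switch# show spanning-tree root
```

If the root address shown for a VLAN is not the switch you configured, that VLAN is not seeing the intended root's BPDUs on that switch.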

ASKER

I only have one switch in server mode and that's the core switch but I do see that the Configuration Revision Number is 48 on all switches, whether it's a client or a server. Should I reset the clients back to 0?

No, that's OK. The revision number should be the same on all synchronized devices in the VTP domain that are not in transparent mode.
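
To compare VTP mode and revision number across the switches, something like the following could be run on each one. This is only a sketch: the `Switch` prompt is the default hostname, and `vtp mode transparent` is shown only in case you decide to move a switch out of client mode.

```
! Shows the VTP operating mode, domain name and configuration revision
Switch# show vtp status
!
! Only if you decide a switch should stop applying VLAN updates learned via VTP
Switch(config)# vtp mode transparent
```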

ASKER

Another question.... I've been just trying to find anything different on these switches to narrow things down. I noticed that vlan 1 is shut down on 4 of the 5 switches. The one that it's not shut down on is the switch that has the uplink to the ISP. I've heard this can cause loops if it's carrying traffic on vlan 1 on some switches but not others. Is this true?

Keep in mind that loops are possible only if you have two different paths to the same location in your network, and from the picture you gave I don't see a way for that to happen.

I suspect that your port went into the error-disabled state, maybe because the interface is flapping or for some other reason. One way to clear the err-disabled state is to shut/no shut the port that is error-disabled.

On the other hand, I have seen loops when VLAN 1 (the native VLAN) was removed from switches, but I don't expect that to be the problem here.

ASKER

We've had some networking consultants working on this issue with us and are currently having our ISP look into the issue. It happened again today, which makes the 4th Tuesday morning in a row.
We did notice that SW1 in the diagram above is not receiving any BPDUs on VLAN 1, which is what prompted the consultants to have me check with our ISP for any issues.

If it's not receiving BPDUs on VLAN 1, that means it's not receiving any STP updates at all, right?

Does anyone have any other things I can check while I'm waiting to hear back from the ISP?

Port 1 (GigabitEthernet0/1) of VLAN0001 is designated forwarding
   Port path cost 4, Port priority 128, Port Identifier 128.1.
   Designated root has priority 32769, address 001a.2ff1.6d80
   Designated bridge has priority 32769, address 001c.b0d6.2a00
   Designated port id is 128.1, designated path cost 38
   Timers: message age 0, forward delay 0, hold 0
   Number of transitions to forwarding state: 2
   Link type is point-to-point by default
   BPDU: sent 88987, received 0
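
The output above has the shape of `show spanning-tree vlan 1 detail`. A few other read-only checks that might help narrow down whether VLAN 1 is actually crossing the uplink are sketched below. The interface name is taken from the output above, the `Switch` prompt is the default hostname, and none of these commands change the configuration:

```
! Root ID, bridge ID and per-port roles/states for VLAN 1
Switch# show spanning-tree vlan 1
!
! Per trunk: native VLAN, allowed VLANs, and VLANs in STP forwarding state
Switch# show interfaces trunk
!
! Trunking mode, encapsulation and native VLAN on the uplink itself
Switch# show interfaces GigabitEthernet0/1 switchport
!
! Ports that STP has put into an inconsistent state
Switch# show spanning-tree inconsistentports
```

Comparing the allowed and native VLANs on both ends of the uplink would show whether VLAN 1 is being carried on one side but not the other.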

ASKER

The switch at our ISP is just another switch in our network, nothing special, so we should be getting BPDUs from it. I should've explained that better: when the diagram says it's going to the ISP, it's really just going to their building, not to one of their switches.
All of our sites are connected with direct fiber; the links just meet at the switch we have set up at our ISP and then go out to our sites from there for our LAN.

We have a root bridge set, which is our core switch at our datacenter. The problem is that the site that keeps going down is not receiving that information because it isn't getting the traffic it needs on VLAN 1, so it still has its own root bridge elected when it shouldn't.

And there's been a switch added since the first diagram, in between SW2 and SW3, that is the switch that is being listed as the root bridge for VLAN 1.

They are daisy-chained that way because we lack the SFP connectors to uplink them all to Gi ports. This is a new site, so we don't have everything we need yet; we just needed to get it up and running. That shouldn't matter for this issue anyway, as the links will just auto-negotiate at 100.

ASKER CERTIFIED SOLUTION

pgolding00
