Solved

Spanning tree issues

Posted on 2015-02-05
Last Modified: 2016-11-23
Hello,

We are having issues with one of our customers. The network is designed as follows:

Two Cisco 3750 Layer 3 switches, configured as a stack (primary and member), which provides redundancy in case of failure.

The Cisco switches route traffic between VLANs and send traffic to the firewall only for Internet-bound requests.

3 network segments (VLANs) are connected to these switches:

10
20
30

Network 10 has one cable connected to primary Cisco switch and a second cable to member Cisco switch
Network 20 has one cable connected to primary Cisco switch and a second cable to member Cisco switch
Network 30 has one cable connected to primary Cisco switch and a second cable to member Cisco switch

Each VLAN has two stacked Dell PowerConnect 5548 switches.

Spanning tree protocol is enabled on the Cisco switches (mode Rapid-pvst)
Spanning tree protocol is enabled on all Dell switches (mode rapid STP)

This configuration has been active for 14 months without issue.

However, during the past week there have been two instances where the links for networks 10 and 20 on the primary and member Cisco switches were put into blocking mode by STP. In addition, networks 10 and 20 lost communication with all networks, both private (internal) and public (Internet).
Network 30 remained functional: it could not reach network 10 or 20, but it could still reach the Internet.

We originally thought that this problem was due to a device which bridged network 10 and network 20. However, we were unable to locate such a device. This happened during production hours.

We temporarily fixed the problem by disabling the member Cisco switch.

Six hours later (when the business was closed) we reactivated the member Cisco switch, hoping to reproduce the problem. This time spanning tree did not block any ports, and no network communication was lost. Logs on both the Cisco and Dell switches do not show any irregularities.

Questions:

1. I thought spanning tree was supposed to prevent the situation described above (the shutdown of two VLANs). Is this true?
2. What steps can one take to locate the point in the network that caused STP to block ports?

Any advice on how to troubleshoot this problem would be appreciated. If there is a software program or tool that can help, please recommend one.

Thanks in advance.

Mark
Question by:mbudman
26 Comments
 
LVL 3

Expert Comment

by:Stephen Berk
ID: 40592851
Could you add a diagram or the output of "show port-channel summary"? I'm confused on the setup.
 
LVL 50

Expert Comment

by:Don Johnston
ID: 40592932
As Stephen said, we really need to see a topology diagram.  Configs and output of show span command would be good too.

But I think the problem is going to be the incompatibility between the Cisco proprietary STP implementation and the standards-based implementation that Dell uses.

Whenever I have switches in a Cisco/other environment, I always use 802.1s spanning tree.
 
LVL 1

Author Comment

by:mbudman
ID: 40592945
I will post a diagram shortly, but if there is a compatibility issue between Cisco proprietary STP and Dell, why did this configuration work fine for over a year?
 
LVL 3

Expert Comment

by:Stephen Berk
ID: 40592952
The diagram is good, but the config and outputs are critical to verifying this is set up as you think it is. If the setup is something like 3 cross-stack EtherChannels, you may find that some of the links never came up as expected at installation and you had some network hiccup that exposed this. Diagram = what you think is there; config = what you intend to happen; show ____ output or debug output = what is actually happening.
 
LVL 1

Author Comment

by:mbudman
ID: 40594175
I am a little confused. Which commands do I need to enter on the Cisco Layer 3 switch, and which ones on the Dell PowerConnect Layer 2 switch?

The Cisco switch does not have an option for "show port-channel"; neither does the Dell PowerConnect.

Thanks.
 
LVL 26

Expert Comment

by:Predrag Jovic
ID: 40594202
On a Cisco Catalyst 3750, that should be:
#show etherchannel <channel-number> summary
 
LVL 50

Expert Comment

by:Don Johnston
ID: 40594237
"show ether sum" will give a list of the channel groups and their status.
 
LVL 28

Expert Comment

by:mikebernhardt
ID: 40594653
It could certainly be an etherchannel issue. But the other thing I would make sure of is to set the root bridge for each VLAN to the equipment which you would like to have as the root bridge. This will avoid the possibility that some transient switch which someone plugs into the network becomes the root and causes spanning tree to recalculate in a way that causes unexpected blocking.
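
To illustrate why pinning the root matters, here is a minimal Python sketch of root bridge election. It assumes the classic STP rule (lowest priority wins, lowest MAC address breaks ties) and ignores the per-VLAN extended system ID. The switch names, MAC addresses, and the `elect_root` helper are hypothetical and not taken from this network.

```python
def bridge_key(priority, mac):
    """Return a sortable bridge ID: priority first, then the MAC as an integer."""
    digits = mac.replace(":", "").replace(".", "").replace("-", "")
    return (priority, int(digits, 16))


def elect_root(bridges):
    """bridges maps a switch name to (priority, mac); the lowest bridge ID wins."""
    return min(bridges, key=lambda name: bridge_key(*bridges[name]))


# Hypothetical values for illustration only: every switch at the default priority.
default_priorities = {
    "cisco-stack": (32768, "00:1b:54:aa:00:01"),
    "dell-vlan10": (32768, "00:1e:c9:bb:00:02"),
    "stray-switch": (32768, "00:0a:11:cc:00:03"),
}
print(elect_root(default_priorities))  # stray-switch (lowest MAC wins the tie)

# Same network after lowering the priority on the intended root.
pinned = dict(default_priorities)
pinned["cisco-stack"] = (4096, "00:1b:54:aa:00:01")
print(elect_root(pinned))  # cisco-stack
```

With every switch left at the default priority, the election falls through to the MAC address, so a newly attached switch can take over as root. Lowering the priority on the intended root removes that dependence on MAC addresses.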
 
LVL 1

Author Comment

by:mbudman
ID: 40594660
Here is a summary of the network (diagram)
network.jpg
 
LVL 1

Author Comment

by:mbudman
ID: 40594664
The firewalls are in active passive mode (hot standby for secondary)
 
LVL 50

Expert Comment

by:Don Johnston
ID: 40594669
You've got redundant paths.  Which means spanning-tree is absolutely necessary to prevent looping.

Can you post the requested output?
 
LVL 1

Author Comment

by:mbudman
ID: 40594679
I know Spanning tree is required. This configuration has been working without issue for the last 14 months or so. The problem only appeared this week and I am trying to determine why.
 
LVL 1

Author Comment

by:mbudman
ID: 40594689
Here is the requested output:

Flags:  D - down        P - bundled in port-channel
        I - stand-alone s - suspended
        H - Hot-standby (LACP only)
        R - Layer3      S - Layer2
        U - in use      f - failed to allocate aggregator

        M - not in use, minimum links not met
        u - unsuitable for bundling
        w - waiting to be aggregated
        d - default port


Number of channel-groups in use: 2
Number of aggregators:           2

Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         LACP      Gi1/0/21(P) Gi2/0/21(P)
2      Po2(SU)         LACP      Gi1/0/22(P) Gi2/0/22(P)
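
For anyone who wants to check this kind of output automatically, here is a small Python sketch that parses the table and flags any channel that is not in use or that has a member port that is not bundled. The regular expressions are written against the sample above only, so the exact layout is an assumption; other software versions may format the table differently. The function names are hypothetical.

```python
import re

SAMPLE = """\
Group  Port-channel  Protocol    Ports
------+-------------+-----------+-----------------------------------------------
1      Po1(SU)         LACP      Gi1/0/21(P) Gi2/0/21(P)
2      Po2(SU)         LACP      Gi1/0/22(P) Gi2/0/22(P)
"""

# One table row: group number, port-channel name with flags, protocol, member ports.
ROW = re.compile(r"^(\d+)\s+(Po\d+)\((\w+)\)\s+(\S+)\s+(.*)$")
# One member port with its flag, e.g. "Gi1/0/21(P)".
PORT = re.compile(r"(\S+?)\((\w+)\)")


def parse_etherchannel_summary(text):
    """Return one dict per channel group found in the summary table."""
    groups = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header, separator, or legend line
        group, name, flags, protocol, ports = match.groups()
        groups.append({
            "group": int(group),
            "name": name,
            "flags": flags,
            "protocol": protocol,
            "members": PORT.findall(ports),
        })
    return groups


def unhealthy(groups):
    """Names of channels not flagged U (in use) or with a member not flagged P (bundled)."""
    return [
        g["name"]
        for g in groups
        if "U" not in g["flags"] or any(flag != "P" for _, flag in g["members"])
    ]


print(unhealthy(parse_etherchannel_summary(SAMPLE)))  # []
```

An empty list means every channel listed is up with all members bundled. It says nothing about a channel that is missing from the output altogether, so the number of groups still has to be compared against the design by hand.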
 
LVL 50

Expert Comment

by:Don Johnston
ID: 40594731
"The problem only appeared this week and I am trying to determine why."
Right. But to know why, we need to have more information. Specifically, the output of the show commands for spanning tree. If you have logging enabled, that may shed some light as well.

Also, your topology diagram doesn't show what ports connect to what switches.

The most common reason I see for this type of thing is a cable gets connected where it shouldn't be which creates a loop that spanning-tree is unable to detect/resolve.
 
LVL 1

Author Comment

by:mbudman
ID: 40594748
Sorry, but what other commands should I get output for? So far I have only posted the results from Cisco. Should I repeat on the Dell switches?

The only place where employees can connect devices is on the Dell switches (there are a lot of people with laptops, Apple and/or HP).

No one has access to the Cisco switches.
 
LVL 50

Expert Comment

by:Don Johnston
ID: 40594788
"show span" will provide the spanning tree information.  But without knowing which ports are connected to which switch, it won't help much.
 
LVL 3

Expert Comment

by:Stephen Berk
ID: 40602551
Did you see the ports go into blocking mode in STP, or did you just lose routing for VLANs 10 and 20? How did you verify that STP was blocking or that the VLANs couldn't route? Did you check failover status on the ASA? Only 2 EtherChannels are in the command output, but you appear to have three in the drawing. Where's the third EtherChannel?
 
LVL 1

Author Comment

by:mbudman
ID: 40608980
I reviewed the diagrams of what happened to our client's network. There are a few things worth noting:

1) The spanning tree protocol caused failure of the network, forcing the Cisco switches to shut down. It would appear that the Cisco switches reacted faster to the loop than the Dell switches. If the Dell switches had reacted before the Cisco switches, then the ports on the Cisco switches for VLAN 10 and VLAN 20 would not have been shut down.

2) When the network is stable, there already exists a loop causing spanning tree to intervene. When the network shut down the VLAN 10 and VLAN 20 on the Cisco switches, there existed two loops in the network (as there always exists one).  Can Spanning tree handle two loops?
 
LVL 50

Expert Comment

by:Don Johnston
ID: 40609162
"The spanning tree protocol caused failure of the network, forcing the Cisco switches to shut down."
Can you elaborate on this? I've never had a network failure cause a switch to "shut down". Do you mean that the switch stopped forwarding data traffic on all ports? Or did the switch stop responding to management input?

"When the network shut down the VLAN 10 and VLAN 20 on the Cisco switches, there existed two loops in the network (as there always exists one)."
Some explanation would be helpful. What do you mean by "the network shut down the VLAN 10 and VLAN 20"?

"Can Spanning tree handle two loops?"
Yes. There is no theoretical limit to the number of loops that spanning tree can handle.  I've had networks with as many as 20 redundant paths that spanning-tree resolved.
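
As a rough illustration of that point, the Python sketch below builds a loop-free tree over a topology with two extra links and reports which links would end up blocked. It is a simplification: it uses a plain breadth-first search from the root instead of real STP path costs and port IDs, and it assumes a single spanning tree instance shared by all switches. The switch names and links are hypothetical.

```python
from collections import deque


def spanning_tree(links, root):
    """Return (forwarding, blocked) link sets for a loop-free tree rooted at root."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    seen = {root}
    forwarding = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbors[node]):
            if peer not in seen:
                # First time this switch is reached: keep the link that reached it.
                seen.add(peer)
                forwarding.add(frozenset((node, peer)))
                queue.append(peer)

    # Every link that is not part of the tree is redundant and gets blocked.
    blocked = {frozenset(link) for link in links} - forwarding
    return forwarding, blocked


# Hypothetical topology: three uplinks plus two stray cables, so two loops.
links = [
    ("cisco", "dell10"),
    ("cisco", "dell20"),
    ("cisco", "dell30"),
    ("dell10", "dell20"),
    ("dell20", "dell30"),
]
forwarding, blocked = spanning_tree(links, root="cisco")
print(len(forwarding), len(blocked))  # 3 2
```

In a connected network with N switches, the tree keeps N - 1 links forwarding and blocks the rest, however many redundant links there are.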

There are quite a few unknowns here. Without some output from the switches, there is really no way to understand what's going on in the network.
 
LVL 3

Expert Comment

by:Stephen Berk
ID: 40609166
Yes, spanning tree can handle many loops. When a topology change occurs, it shuts down ports and goes through a learning process to create a loop-free path through the switch fabric. It happens so fast that unless you exceed 7 daisy-chained switches, it's unlikely that you could see any difference between the switches. Based on your comments, it sounds like you haven't confirmed what actually happened and are only reviewing network diagrams. I highly recommend getting logs and doing a thorough review of your customer's network devices to see how they are really configured and what is happening at Layers 2 and 3. Best of luck.
 
LVL 1

Author Comment

by:mbudman
ID: 40619858
I am running some tests tonight and will have more data to post here for assistance.

Thanks.
 
LVL 1

Author Comment

by:mbudman
ID: 40628530
I ran some tests and was able to reproduce the problem that was initially encountered: ports 1 and 2 on the Cisco switch turn off (port 1 has VLAN 10 connected, port 2 has VLAN 20) as a result of a loop and STP trying to stop the loop.

I connected a cable between port 46 on the Dell switch for VLAN 10 and port 46 on the Dell switch for VLAN 20.

This caused STP to turn off (block) port 1 and port 2 on the Cisco switch.

This happened because the Cisco switch thought it was the root bridge, and each Dell switch individually thought it was the root bridge as well.

I contacted Dell support and they explained that there is an incompatibility between CISCO and Dell with regards to STP. The solution was to change port configuration for port 46 (Dell)  to general mode and VLAN tagging. This would have to match on the Cisco (Vlan tagging).

I would like to extend this to the entire network in case someone makes a mistake and connects the wrong cables, causing a loop.

Any suggestions as to how to do so?
 
LVL 50

Accepted Solution

by:
Don Johnston earned 500 total points
ID: 40628625
"I contacted Dell support and they explained that there is an incompatibility between CISCO and Dell with regards to STP. The solution was to change port configuration for port 46 (Dell)  to general mode and VLAN tagging. This would have to match on the Cisco (Vlan tagging)."

I can't see how that's going to solve the problem. The issue is that Dell and Cisco handle spanning tree differently on links which carry multiple VLANs. So as long as you have trunks (VLANs being tagged) on links from Dell to Cisco and you're using the default STP, you're going to have this issue.

IMHO, a better solution would be to use 802.1s spanning tree (which is supported by both vendors).
 
LVL 1

Author Comment

by:mbudman
ID: 40628709
Here is the list of spanning tree features Cisco supports (taken directly from the spec sheet):

●  IEEE 802.1w Rapid Spanning Tree Protocol (RSTP) provides rapid spanning-tree convergence independent of spanning-tree timers and also offers the benefit of distributed processing.
●  Stacked units behave as a single spanning-tree node.
●  Per-VLAN Rapid Spanning Tree (PVRST+) allows rapid spanning-tree reconvergence on a per-VLAN spanning-tree basis, without requiring the implementation of spanning-tree instances.

Where do I find 802.1s? What is the technical name on Cisco? Is it MSTP?
 
LVL 50

Expert Comment

by:Don Johnston
ID: 40628733
Yes. MSTP. Sometimes they also use MST to describe it.
 
LVL 1

Author Closing Comment

by:mbudman
ID: 40702652
Thank you for your assistance.