Solved

Spanning tree issues

Posted on 2013-10-23
Last Modified: 2016-09-13
Hi All,

My company recently bought another company whose IT is a bit of a mess.

They've had loads of network issues, mainly broadcasts taking a while to respond (around 30 seconds). To resolve this I replaced all their switches with six of these (link).

All six switches are set up the same:

One core and five edge (no special settings, just how they're connected - using port 1).
default vLAN 0 is for the desktops and servers - 192.168.0.0/24
vLAN 1 for voice - 192.168.1.0/24
Port 1 - 23 can be used for both vLANs
Port 24 is set up as a trunk port and assigned for the phone.  The phone system is connected to the core; the other port 24s are empty.

I manually connected all the cables, so I'm 100% sure each edge switch is only connected to the core once and is not connected to any other switch.


After I replaced the switches the spanning tree issues were minimal (I could see them in Wireshark) and the network issues were resolved.

14 hours later the network issues returned and spanning tree became more prominent.

The Wireshark logs show the spanning tree issues are all coming from a single MAC to a different single MAC.

I did a port scan and looked at the ARP cache, but couldn't find the MAC against an IP.

The wireshark data highlights the packet as coming from an HP device and the MAC address matches a couple of the switches, but with 07 on the last two digits.

I'm guessing the issue is coming from a specific port.  Does anyone know how to get the MAC addresses assigned to each port?

Also, it looks like spanning tree kicks in after 30 seconds.  However I notice they support Rapid Spanning Tree, which kicks in after 2 seconds.  While this doesn't remove the issue, it will give me some breathing space.

I know very little about this sort of thing, so all help welcome on how to detect the port MAC addresses and enabling Rapid Spanning Tree.


Many thanks
Question by:detox1978
14 Comments
 
LVL 26

Assisted Solution

by:Soulja
Soulja earned 50 total points
Interesting, so you have a hub and spoke topology, but are experiencing spanning tree loops and topology changes?
 
LVL 18

Assisted Solution

by:Akinsd
Akinsd earned 225 total points
Enable loopguard or bpdu guard on your access switches.

Looks like someone is plugging a switch in their cubicle

You can also start by hardcoding the switch ports to access mode rather than leaving them at the default of auto

eg
int range fa0/2 - 22   (assuming that's all your access ports; make sure to exclude your uplinks)
switchport mode access
 
LVL 50

Assisted Solution

by:Don Johnston
Don Johnston earned 225 total points
"default vLAN 0 is for the desktops and servers - 192.168.0.0/24
vLAN 1 for voice - 192.168.1.0/24"
Is this a typo?  There is no "VLAN 0"

In your post, you mention "spanning tree issues". What do you mean by this?

The "show interface" command will display the MAC address for a port.

"Also, it looks like spanning tree kicks in after 30 seconds.  However I notice they support Rapid Spanning Tree, which kicks in after 2 seconds."
If you don't have any redundant links, this isn't going to help much. These times deal with convergence when an existing link fails and the redundant link takes over.  
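For anyone wondering where the 30 seconds and the 2 seconds come from, here is a minimal sketch of the timer arithmetic. It assumes the classic 802.1D default timers (hello 2 s, forward delay 15 s, max age 20 s); the thread never shows what these switches are actually configured with, so treat the constants as assumptions. The function names are made up for illustration.

```python
# Classic 802.1D default timers, in seconds.
# Assumed values: check the actual switch configuration.
HELLO_TIME = 2       # how often the root sends BPDUs
FORWARD_DELAY = 15   # time spent in each of listening and learning
MAX_AGE = 20         # how long a stored BPDU is trusted


def link_up_delay(forward_delay=FORWARD_DELAY):
    """Seconds a newly connected port waits before forwarding:
    one forward delay in listening plus one in learning."""
    return 2 * forward_delay


def indirect_failure_delay(max_age=MAX_AGE, forward_delay=FORWARD_DELAY):
    """Worst case after an indirect link failure: the stored BPDU
    ages out first, then the port goes through listening and learning."""
    return max_age + 2 * forward_delay


if __name__ == "__main__":
    print(link_up_delay())           # 30
    print(indirect_failure_delay())  # 50
    print(HELLO_TIME)                # 2
```

So the 30 seconds matches two forward delays on a port coming up, and the 2 seconds is the hello interval rather than a convergence time.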

It sounds like you may have a rogue switch coming online that is taking over as the root and causing the entire network to re-converge.  In the higher-end switches, a feature known as bpdu-protection is supported which disables a port if a BPDU enters a configured port.  This would be configured on ports that connect to end-stations.  I can't tell if that feature is supported on your model of switch though. But you can try it.  "spanning-tree <port-list> bpdu-protection"
 
LVL 2

Author Comment

by:detox1978
Thanks for the suggestions.

<Soulja>
Yes, we have spanning tree issues on our network after I repatched everything, which suggests either a faulty switch or a device (probably a hub/switch) connected to the LAN.
</Soulja>

<Akinsd>
I enabled RSTP and BPDU and it stopped the edge switches from working, so I had to remove it
</Akinsd>

<donjohnston>
yes it was a typo.  vLAN 1 and vLAN 2.

The spanning tree wireshark screen shot is attached.

The switches support BPDU, but when I enabled it (in conjunction with RSTP) the switches stopped working.

As you can see from the wireshark screenshot, everything appears to be coming from a single MAC address.  However I did a port scan and looked in my arp cache, but nothing matched the MAC address
</donjohnston>

Wireshark
 
LVL 2

Author Comment

by:detox1978
Here's a screenshot from inside the packet.  They are all identical.

Inside the Spanning Tree packet
 
LVL 50

Accepted Solution

by:Don Johnston
Don Johnston earned 225 total points
I don't know that I would immediately jump to a STP issue.

The BPDUs all appear to be originating from the root. They are Version 2 BPDUs (802.1w Rapid Spanning Tree).  They are being forwarded by the switch with a Bridge ID of 0x80.00-D0.7E.28.26.66.89.  The BPDUs are transmitted every 2 seconds. The cost to the root (from the capturing host) is 20.  I don't see any TC (Topology Change) BPDUs.
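To make the Bridge ID easier to read, here is a small sketch that splits it into its parts. It assumes the usual 8-byte layout (a 2-byte priority field followed by a 6-byte MAC, with the low 12 bits of the priority field used as the system ID extension). The helper name `parse_bridge_id` is made up for illustration.

```python
import re


def parse_bridge_id(text):
    """Split a bridge ID such as '0x80.00-D0.7E.28.26.66.89' into
    (priority, system_id_extension, mac)."""
    if text.lower().startswith("0x"):
        text = text[2:]
    # Drop separators like '.', '-' and ':' and keep only the hex digits.
    hex_digits = re.sub(r"[^0-9A-Fa-f]", "", text)
    if len(hex_digits) != 16:
        raise ValueError("a bridge ID is 8 bytes (2 priority + 6 MAC)")
    raw = bytes.fromhex(hex_digits)
    priority_field = int.from_bytes(raw[:2], "big")
    priority = priority_field & 0xF000       # top 4 bits, steps of 4096
    system_id_ext = priority_field & 0x0FFF  # low 12 bits
    mac = ":".join(f"{b:02x}" for b in raw[2:])
    return priority, system_id_ext, mac


if __name__ == "__main__":
    print(parse_bridge_id("0x80.00-D0.7E.28.26.66.89"))
    # (32768, 0, 'd0:7e:28:26:66:89')
```

A priority of 32768 is the standard default, which would be consistent with nobody having configured a bridge priority on that switch.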

I don't see anything here that indicates a problem with spanning-tree.

You don't "enable" BPDUs.  BPDUs are part of the spanning-tree protocol. If you have spanning-tree running, you have BPDUs.

The originator of these BPDUs is the root bridge. I can't say for certain where the root is, but given the cost and that it's not local to the segment the packet was captured on, I would say the switch that this switch is connected to is the root.

You won't find the MAC address in any ARP cache because that MAC address is not used to source any traffic. If you do a "show interface" on the port that your wireshark PC is connected to, you will probably see that address.

One thing that is interesting is that every one is showing a bad FCS (Frame Check Sequence). But that could be a decode problem on the PC.
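A rough way to sanity-check a "bad FCS" report by hand is sketched below. It relies on the Ethernet FCS being the standard CRC-32 (the same one `zlib.crc32` computes) appended least significant byte first. The frame bytes in the demo are placeholders, not taken from the capture in this thread.

```python
import struct
import zlib


def ethernet_fcs(frame_without_fcs):
    """Return the 4-byte FCS as it appears on the wire."""
    return struct.pack("<I", zlib.crc32(frame_without_fcs) & 0xFFFFFFFF)


def fcs_is_good(frame_with_fcs):
    """Treat the last 4 bytes as the FCS and check them against the rest."""
    if len(frame_with_fcs) < 5:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return ethernet_fcs(body) == fcs


if __name__ == "__main__":
    # Placeholder frame: STP multicast destination, made-up source and payload.
    dst = bytes.fromhex("0180c2000000")
    src = bytes.fromhex("d07e28266689")
    body = dst + src + bytes(46)
    good = body + ethernet_fcs(body)
    print(fcs_is_good(good))       # True
    # Flip one bit: a CRC always detects a single-bit error.
    corrupted = bytes([good[0] ^ 0x01]) + good[1:]
    print(fcs_is_good(corrupted))  # False
```

If the capture hardware strips the FCS before handing frames over, a decoder that still treats the last four bytes as an FCS will report nearly every frame as bad, which is one way the "decode problem" mentioned above can arise.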
 
LVL 2

Author Comment

by:detox1978
Just a quick update.  I enabled Spanning tree and BPDU and 3 of the switches became uncontactable.

I'm not on that site again until Monday, so will report back.

I have access to wireshark on that network.  So if you would like me to do any checks let me know.

D
 
LVL 18

Assisted Solution

by:Akinsd
Akinsd earned 225 total points
The uplink ports (most likely configured with portfast also) on those switches probably went into errdisable state. Bounce (shut and unshut) the port to recover it. You may want to configure errdisable recovery, if not permanently, at least during this troubleshooting phase, to save you trips back and forth when the ports shut down.

Remember to hardcode all ports that are not uplinks to access ports and turn off negotiation.

You may want to enable bpdu guard on specific ports rather than turning it on globally. When turned on globally, the feature will block every access port or portfast-enabled port that receives a BPDU.

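Consider disabling portfast on your uplinks if it was configured there. Portfast does not turn spanning tree off, but it skips the listening and learning states and puts the port straight into forwarding, so it should only be used on uplinks if you are certain that there is no chance of loop formation.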
 
LVL 2

Author Comment

by:detox1978
Is there anything I can do to work out what's causing the issue?

I removed all the cables and switches and put new ones in, so in the server room all the switches plug into a single core switch.
 
LVL 50

Expert Comment

by:Don Johnston
Troubleshooting like this can be rather difficult. :-(

Please post the topology (showing all switches and connections between the switches) and the current configurations of the switches.
 
LVL 2

Author Closing Comment

by:detox1978
On site today.  Looks like it's a TCP/IP keepalive issue with the application.

Many thanks for your time.
 

Expert Comment

by:night crow
Can you please elaborate on how it's related to a TCP/IP keepalive issue?

I am seeing the same problem.

I have a sniffer (cPacket) which is capturing all packets from a server and the sniffer is reporting CRC errors. When I look at the capture, I see the same "FCS Bad: True" errors that you displayed in your screenshot. From the bad packets in Wireshark it seems like they are related to spanning tree.

How did you track them down? And have you got any further info that you can share?

Thanks
 
LVL 2

Author Comment

by:detox1978
Hi nc,

The local antivirus (Kaspersky) managed the OS firewall. Because the app (Lotus Notes) didn't send keepalives, there was a 30 second delay until the firewall would re-establish the connection.

We got rid of the antivirus as it was due for renewal.
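For readers unsure what "sending keepalives" means at the socket level, here is a minimal Python sketch. It is not how the problem was fixed here (the antivirus was removed) and it says nothing about how Lotus Notes is configured. The helper name and the timing values are placeholders, and the per-socket tuning options are only set where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle=20, interval=5, probes=3):
    """Turn on TCP keepalive probes for an existing socket.

    idle, interval and probes are placeholder values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options below are platform-specific, so guard each one.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # A non-zero value means keepalive is switched on for this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With probes going out during idle periods, a stateful firewall in the path keeps seeing traffic on the connection and is less likely to drop its state for it.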


D
 

Expert Comment

by:night crow
Hmmm, I see.

As I mentioned in my previous post:

I have a sniffer (cPacket) which is capturing all packets from a server and the sniffer is reporting CRC errors. When I look at the capture, I see the same "FCS Bad: True" errors that you displayed in your screenshot. From the bad packets in Wireshark it seems like they are related to spanning tree.

However, I am almost 100% certain that this has nothing to do with an application.

Any other ideas what could be causing this?

(As a side note, I replaced the fibres to eliminate any physical issue)