
Spanning tree issues

Hi All,

My company recently bought another company whose IT is a bit of a mess.

They've had loads of network issues, mainly broadcasts taking a while (around 30 seconds) to get a response. To resolve this I replaced all their switches with six of these (link).

All six switches are set up the same: one core and five edge (no special settings, just how they're connected, using port 1).

default vLAN 0 is for the desktops and servers - 192.168.0.0/24
vLAN 1 for voice - 192.168.1.0/24

Ports 1-23 can be used for both vLANs.

Port 24 is set up as a trunk port and assigned to the phone. The phone system is connected to the core; port 24 on each of the other switches is empty.

I manually connected all the cables, so I'm 100% sure each edge switch is only connected to the core once and is not connected to any other switch.

After I replaced the switches the spanning tree issues were minimal (I could see them in Wireshark) and the network issues were resolved.

14 hours later the network issues returned and spanning tree became more prominent.

The Wireshark logs show the spanning tree issues are all coming from a single MAC to a different single MAC.

I did a port scan and looked at the ARP cache, but couldn't find the MAC against an IP.

Wireshark identifies the packets as coming from an HP device, and the MAC address matches a couple of the switches, except that the last two digits are 07.
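
A note on the "07" suffix: on many managed switches each port sends its BPDUs from its own MAC address, often the switch's base MAC plus a small per-port offset. Whether these particular switches do that is an assumption, not something confirmed in this thread. A minimal Python sketch of the comparison, using made-up addresses (`switches` and `bpdu_source` are hypothetical):

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' (or '-' separated) to an integer."""
    digits = mac.replace("-", ":").replace(":", "")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return int(digits, 16)


def port_offset(base_mac: str, seen_mac: str, max_ports: int = 64):
    """Return seen_mac - base_mac if it lies in 1..max_ports, else None."""
    offset = mac_to_int(seen_mac) - mac_to_int(base_mac)
    return offset if 0 < offset <= max_ports else None


# Hypothetical addresses, made up for illustration only.
switches = {
    "core": "00:11:22:33:44:00",
    "edge1": "00:11:22:aa:bb:00",
}
bpdu_source = "00:11:22:33:44:07"

for name, base in switches.items():
    # An integer result suggests the frame came from that switch;
    # under the assumption above, the offset may map to a port.
    print(name, port_offset(base, bpdu_source))
```

If the offset does turn out to map to a port on one switch, that would also explain why the address never shows up in a port scan or the ARP cache: it would belong to a switch port rather than to a host with an IP address.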

I'm guessing the issue is coming from a specific port. Does anyone know how to get the MAC addresses assigned to each port?

Also, it looks like spanning tree kicks in after 30 seconds. However, I notice the switches support Rapid Spanning Tree, which kicks in after 2 seconds. While this doesn't remove the issue, it will give me some breathing space.

I know very little about this sort of thing, so all help is welcome on how to detect the port MAC addresses and enable Rapid Spanning Tree.
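
For reference, the 30-second figure lines up with the default timers in IEEE 802.1D: a port spends one forward-delay interval listening and another learning before it forwards traffic. The 2-second figure matches the default hello time. The sketch below only does that arithmetic with the standard defaults; the timers actually configured on these switches are not given in the thread, so treat the numbers as assumptions.

```python
# IEEE 802.1D default timers, in seconds. The values configured on the
# switches in this thread are unknown; these are the standard defaults.
HELLO_TIME = 2
FORWARD_DELAY = 15
MAX_AGE = 20


def stp_port_up_delay(forward_delay: int = FORWARD_DELAY) -> int:
    """Classic STP: listening + learning before a port forwards."""
    return 2 * forward_delay


def stp_worst_case_reconvergence(max_age: int = MAX_AGE,
                                 forward_delay: int = FORWARD_DELAY) -> int:
    """Classic STP: wait for max age to expire, then listen and learn."""
    return max_age + 2 * forward_delay


def rstp_neighbour_loss_detection(hello_time: int = HELLO_TIME) -> int:
    """RSTP treats a neighbour as lost after three missed hellos."""
    return 3 * hello_time


print(stp_port_up_delay())              # 30
print(stp_worst_case_reconvergence())   # 50
print(rstp_neighbour_loss_detection())  # 6
```

With RSTP, ports configured as edge ports can start forwarding immediately, which is usually where the practical speed-up for desktops comes from.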

Many thanks

SOLUTION

Soulja

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

SOLUTION

David Akinsanya


SOLUTION

Don Johnston


detox1978

ASKER

Thanks for the suggestions.

<Soulja>
Yes, we have spanning tree issues on our network after I repatched everything, which suggests either a faulty switch or a device (probably a hub/switch) connected to the LAN.
</Soulja>

<Akinsd>
I enabled RSTP and BPDU and it stopped the edge switches from working, so I had to remove it.
</Akinsd>

<donjohnston>
Yes, it was a typo: vLAN 1 and vLAN 2.

The spanning tree Wireshark screenshot is attached.

The switches support BPDU, but when I enabled it (in conjunction with RSTP) the switches stopped working.

As you can see from the Wireshark screenshot, everything appears to be coming from a single MAC address. However, I did a port scan and looked in my ARP cache, but nothing matched the MAC address.
</donjohnston>

detox1978

ASKER

Here's a screenshot from inside the packet. They are all identical.

ASKER CERTIFIED SOLUTION

Don Johnston


detox1978

ASKER

Just a quick update. I enabled spanning tree and BPDU and three of the switches became unreachable.

I'm not on that site again until Monday, so will report back.

I have access to Wireshark on that network, so if you would like me to do any checks let me know.

D

SOLUTION

David Akinsanya


detox1978

ASKER

Is there anything I can do to work out what's causing the issue?

I removed all the cables and switches and put new ones in, so in the server room all the switches plug into a single core switch.

Don Johnston

Troubleshooting like this can be rather difficult. :-(

Please post the topology (showing all switches and connections between the switches) and the current configurations of the switches.

detox1978

ASKER

On site today. Looks like it's a TCP/IP keepalive issue with the application.

Many thanks for your time.

night crow

Can you please elaborate on how it's related to a TCP/IP keepalive issue?

I am seeing the same problem.

I have a sniffer (cPacket) which is capturing all packets from a server, and the sniffer is reporting CRC errors. When I look at the capture, I see the same "FCS Bad: True" errors that you displayed in your screenshot. From the bad packets in Wireshark it seems like they are related to spanning tree.

How did you track them down? Have you got any further info that you can share?

Thanks

detox1978

ASKER

Hi nc,

The local antivirus (Kaspersky) managed the OS firewall. Because the app (Lotus Notes) didn't send keepalives, there was a 30-second delay until the firewall would re-establish the connection.

We got rid of the antivirus as it was due for renewal.
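
For anyone wondering what "sending keepalives" means at the socket level, here is a minimal Python sketch that enables TCP keepalive probes on a socket. It is an illustration only, not a change that can be made to Lotus Notes itself. The helper name `enable_keepalive` and the timing values are assumptions; the firewall's actual idle timeout is not stated in this thread.

```python
import socket


def enable_keepalive(sock: socket.socket, idle: int = 20,
                     interval: int = 5, probes: int = 3) -> None:
    """Ask the OS to send TCP keepalive probes on an otherwise idle socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform specific, so set only
    # the ones this platform exposes and tolerate a refusal.
    tuning = (("TCP_KEEPIDLE", idle),       # idle seconds before first probe
              ("TCP_KEEPINTVL", interval),  # seconds between probes
              ("TCP_KEEPCNT", probes))      # failed probes before giving up
    for name, value in tuning:
        if hasattr(socket, name):
            try:
                sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)
            except OSError:
                pass  # option exists but is not accepted on this OS


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(bool(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)))
    s.close()
```

The idea is that the idle time is kept shorter than the firewall's idle timeout, so the firewall keeps seeing traffic on the connection and does not drop its state.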

D

night crow

Hmmm, I see.

As I mentioned in my previous post:

I have a sniffer (cPacket) which is capturing all packets from a server, and the sniffer is reporting CRC errors. When I look at the capture, I see the same "FCS Bad: True" errors that you displayed in your screenshot. From the bad packets in Wireshark it seems like they are related to spanning tree.
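
One thing that can be checked offline is whether the last four bytes of a flagged frame really are a valid FCS. The Ethernet FCS is a CRC-32 over the frame, sent least significant byte first, and Python's `zlib.crc32` uses the same polynomial. The sketch below is a minimal check under that assumption; the frame bytes are made up for illustration. A "bad FCS" flag can also appear when a capture does not contain the FCS at all and the trailing bytes (padding or a capture-device trailer, for instance) are interpreted as one; that is only one possible cause and is not confirmed by anything in this thread.

```python
import struct
import zlib


def ethernet_fcs(frame_without_fcs: bytes) -> bytes:
    """Return the 4-byte FCS as it appears on the wire (little-endian CRC-32)."""
    return struct.pack("<I", zlib.crc32(frame_without_fcs) & 0xFFFFFFFF)


def fcs_is_good(frame_with_fcs: bytes) -> bool:
    """Check whether the last four bytes are a valid FCS for the rest."""
    if len(frame_with_fcs) < 5:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return ethernet_fcs(body) == fcs


# Made-up frame for illustration: STP multicast destination, a
# hypothetical source MAC, an 802.3 length field, LLC header, zero padding.
body = (bytes.fromhex("0180c2000000")    # destination: STP bridge group address
        + bytes.fromhex("001122334407")  # source: hypothetical port MAC
        + bytes.fromhex("0026")          # 802.3 length
        + bytes.fromhex("424203")        # LLC header used by STP
        + bytes(43))                     # placeholder payload and padding

good = body + ethernet_fcs(body)
corrupt = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit

print(fcs_is_good(good))     # True
print(fcs_is_good(corrupt))  # False
```

If frames exported from the capture pass this check but Wireshark still flags them, the dissector's assumption about whether an FCS is present is worth looking at before suspecting the cabling.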

However, I am almost 100% certain that this has nothing to do with an application.

Any other ideas what could be causing this?

(As a side note, I replaced the fibres to eliminate any physical issue.)