Intermittent connectivity

Recently we've been having problems with intermittent connectivity.
The network at my new employer uses a large flat topology, with about nine switches daisy-chained together and a single /24 subnet that is very near capacity. We're using mostly HP hardware, including several end-of-life chassis/module switches. The majority of our servers reside on one switch, A; the users are on the remaining switches. I'll call the most prevalent problem child switch B. Switch B is midway up the daisy chain, and switch A is at one end. I can ping, ssh, rdp, etc. into any server from any other server connected to switch A, but some servers I cannot reach from switch B.

I tried running nmap's ping sweep to get a feel for what is going on, since the switch logs are useless. The results are inconsistent. Two scans run simultaneously from different ports on switch B return widely varying results, sometimes with as many as 20 hosts seen from one port but missing from the other. Neither scan from B matches a scan run from a host on switch A.
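One way to put numbers on that inconsistency is to save each sweep in nmap's grepable format (for example `nmap -sn -oG <file> <subnet>`) and compare the sets of hosts that answered. The following is only a minimal sketch: the function names `hosts_up` and `compare_sweeps` are made up for illustration, and it assumes the usual `Host: <ip> (<name>)	Status: Up` lines in the grepable output.

```python
import ipaddress
import re

# Matches grepable-output lines such as "Host: 10.0.0.5 ()\tStatus: Up".
HOST_UP = re.compile(r"^Host:\s+(\S+)\s+\(.*?\)\s+Status:\s+Up", re.MULTILINE)


def hosts_up(grepable_text):
    """Return the set of addresses reported as up in one sweep."""
    return set(HOST_UP.findall(grepable_text))


def compare_sweeps(sweep_a, sweep_b):
    """Split the hosts of two sweeps into only-A, only-B and seen-by-both."""
    up_a, up_b = hosts_up(sweep_a), hosts_up(sweep_b)

    def by_address(hosts):
        # Sort numerically so 10.0.0.10 comes after 10.0.0.9.
        return sorted(hosts, key=ipaddress.ip_address)

    return {
        "only_a": by_address(up_a - up_b),
        "only_b": by_address(up_b - up_a),
        "both": by_address(up_a & up_b),
    }
```

Feeding it the text of two result files (one per switch port) gives the hosts each port is missing. Repeating the comparison over several runs would show whether the same hosts drop out every time or the set keeps changing.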

I remember seeing similar behavior around 5 years ago, but I don't definitively remember the cause or the temporary solution we used. Long term we purchased a router, which I will do here as well. I think the problem turned out to be that the MAC or connection table was getting full and new connections were simply being dropped. Does that sound about right for the cause of this behavior? Is there anything I can do before getting my router installed a few weeks from now?
timbrigham Asked:

jfer0x01 Commented:
Hello,

the problem is that you have 9 switches daisy chained!

perhaps, it's time to invest in a larger switch, instead of many small ones, to consolidate your cabling centrally

if not, you said it yourself, replace switch b

most likely, you have a user whose traffic pattern has changed, which is causing more packets to be dropped as they pass through the switches and now results in sporadic service

try running a network monitoring tool, such as Wireshark or NetMon (MS tool), to analyze the packets that are being dropped and tie them to a source machine

Jfer

regnighc Commented:
Definitely the 9 switches are not helping the situation; that will cause propagation delays and will start causing errors.

I would agree with Jfer

timbrigham (Author) Commented:
I agree as well, hence installing a router. :)
I was hoping there was something I could do in the interim to resolve the problem before the router gets here.  

Considering the size of our organization, three of our switches - including B - are large HP units, 96 ports each. Going any larger really isn't an option.
None of my network taps are placed conveniently to monitor switch B. I've used port mirroring on routers in the past, but I'm a little leery of doing so on a switch that is already having problems. What kind of performance impact could I expect from setting up a port mirror?

Steve Jennings (IT Manager) Commented:
Agree with all . . . some poor switch is seeing a boatload of MAC addresses associated with one port and is likely puking when trying to allocate cut-through buffers for them.

Good luck,
SteveJ
timbrigham (Author) Commented:
I have the problem isolated.
Apparently at some point my coworkers intentionally connected switch A to a couple of other switches in addition to B in an effort to increase speed. The network diagram didn't reflect the update, so I took it on good faith that the cabling was correct. Since spanning tree was also disabled on our switches, we have a major layer 2 loop that needs to be broken. I'll work it into this weekend's maintenance window. That should clear things up until I get the router installed.
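For anyone auditing their own cabling: with spanning tree off, any inter-switch cable that joins two switches which already have a path between them closes a loop. Below is a rough sketch of that check using union-find. The function name `redundant_links` and the switch labels are hypothetical, and it assumes every entry is an ordinary single link (not part of an aggregated trunk) taken from a physical walk-through rather than from the diagram.

```python
def redundant_links(links):
    """Return the cables that close a loop, given (switch, switch) pairs.

    Each link is processed in order; a link whose two ends are already
    connected through earlier links is reported as redundant.
    """
    parent = {}

    def find(node):
        # Follow parent pointers to the group's root, shortening the path.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    extra = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            extra.append((a, b))  # both ends already reachable: this is a loop
        else:
            parent[root_a] = root_b  # merge the two groups of switches
    return extra
```

Listing the daisy chain first and the later additions afterwards makes the added cables show up as the redundant ones, which are the links to pull (or to leave in place once spanning tree is enabled).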

Thanks all - without your direction I wouldn't have found this.
Points awarded shortly.

jfer0x01 Commented:
Good to know you found the source

Jfer