Posted on 2009-07-07
Recently we've been having some problems with intermittent connectivity.
The network at my new employer uses a large flat topology, with about nine switches daisy-chained together and a single /24 subnet that is very near capacity. We're using mostly HP hardware, including several end-of-life chassis/module switches. The majority of our servers reside on one switch, which I'll call switch A; the users are on the remaining switches. I'll call the worst problem child switch B. Switch B is midway along the daisy chain, and switch A is at one end. I can ping, ssh, rdp, etc. into any server from any other server connected to switch A, but there are some servers I cannot reach from switch B.
I tried running nmap's ping sweep to get a feel for what is going on, since the switch logs are useless. The results are inconsistent. Two scans run simultaneously from different ports on switch B return widely varying results, sometimes with as many as 20 hosts that show up from one port but not from the other. Neither port on B matches a scan run from a host on switch A.
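To make the comparison between scans less of an eyeball exercise, something like the following could be used to diff them. This is only a rough sketch: it assumes each sweep was saved in nmap's grepable output format (-oG), where live hosts appear on lines like "Host: 192.168.1.10 ()	Status: Up". The function names and the labels are made up for illustration.

```python
import re

# Matches grepable-output lines such as "Host: 192.168.1.10 ()\tStatus: Up".
# Lines reporting "Status: Down" and comment lines are ignored.
HOST_UP = re.compile(r"^Host:\s+(\S+)\s.*Status:\s+Up")


def hosts_up(grepable_text):
    """Return the set of addresses reported as up in one scan's output."""
    found = set()
    for line in grepable_text.splitlines():
        match = HOST_UP.match(line)
        if match:
            found.add(match.group(1))
    return found


def compare_scans(scans):
    """Given {label: grepable_text}, map each label to the hosts it missed.

    A host counts as missed if at least one other scan saw it
    but the scan with this label did not.
    """
    seen = {label: hosts_up(text) for label, text in scans.items()}
    everything = set().union(*seen.values())
    return {label: sorted(everything - hosts) for label, hosts in seen.items()}
```

Feeding it the text of the three scans (two ports on switch B plus one host on switch A) under labels such as "B-port1", "B-port2" and "A" would give, per vantage point, the list of addresses that answered somewhere else but not there. Repeating that over a few runs should show whether the same hosts go missing each time or whether it is random.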
I remember seeing similar behavior around five years ago, but I don't remember for certain what the cause was or what temporary fix we used. Long term we purchased a router, which I will do here as well. I think the problem turned out to be that the MAC or connection table was filling up, so new connections were simply being dropped. Does that sound about right as the cause of this behavior? Is there anything I can do in the meantime, before the router is installed a few weeks from now?