Network problem fixed but not sure why
Posted on 2009-05-13
I recently installed a Sonicwall TZ 210. My network consists of cable modem --> TZ 210 --> Dell Powerconnect 2748 managed switch --> everything else.
I have 2 servers running: one is SBS 2003, which is the DC, DNS server, and print server and also runs Exchange; the other is a Windows 2003 server that only runs SQL. I also have 2 networked printers and approx. 25 workstations running XP Pro.
All of these are on a single domain using class C private IP addresses on a /24 subnet. The servers and the printers have static IPs.
The problem I was experiencing was that the workstations running applications that use the SQL server were throwing errors about a brief loss of connectivity. I did some packet sniffing with Wireshark using port mirroring on my switch and didn't see anything really unusual. I then started running ping -t to the TZ 210, my switch, both servers, my printers, and several random workstations and watched them constantly. Intermittently, the pings to everything would time out once or twice and then normal traffic would resume, except for one printer, which would usually keep timing out for 30 seconds to a minute. It was not one specific printer; both did it at different times.
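For anyone repeating this kind of monitoring, watching several ping -t windows by eye makes it hard to tell whether the drops really line up across hosts. Below is a minimal Python sketch that groups failed pings into outage events, assuming the ping results have already been collected as (timestamp, host, ok) records. The host names, timestamps, and the 5-second merge window are made up for illustration and are not taken from my network.

```python
from datetime import datetime, timedelta


def find_outages(samples, window=timedelta(seconds=5)):
    """Group failed pings into outage events.

    samples: iterable of (timestamp, host, ok) tuples.
    A failure within `window` of the previous failure is merged
    into the same event. Returns (start, end, hosts) tuples.
    """
    failures = sorted((t, host) for t, host, ok in samples if not ok)
    events = []
    for t, host in failures:
        if events and t - events[-1]["end"] <= window:
            # Close enough to the previous failure: same outage.
            events[-1]["end"] = t
            events[-1]["hosts"].add(host)
        else:
            events.append({"start": t, "end": t, "hosts": {host}})
    return [(e["start"], e["end"], sorted(e["hosts"])) for e in events]


# Hypothetical sample data, not real measurements.
t0 = datetime(2009, 5, 13, 10, 0, 0)
samples = [
    (t0, "firewall", True),
    (t0 + timedelta(seconds=1), "firewall", False),
    (t0 + timedelta(seconds=1), "sql", False),
    (t0 + timedelta(seconds=2), "printer1", False),
    (t0 + timedelta(seconds=3), "firewall", True),
    (t0 + timedelta(hours=2), "sql", False),
]

events = find_outages(samples)
for start, end, hosts in events:
    print(start, end, ", ".join(hosts))
```

An event that lists many hosts at once points at something shared (switch, firewall, or the monitoring machine itself), while an event with a single host points at that device or its cable.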
The Dell switch is less than two months old so I began to suspect a jabbering nic or a bad cable. Systematically, I disconnected everything attached to my switch one thing at a time and continued to watch those ping -t's from multiple locations to isolate the cause. Considering this dropping of network packets was only happening once every 2-3 hours this was a VERY time consuming and frustrating task. I checked the data rate on the nics, etc and all configurations and found nothing.
To make a long story short, none of this fixed the problem. I then moved the SQL server from my switch to one of the extra ethernet ports on the TZ 210, and I noticed that the workstations running the applications that use SQL were no longer throwing those errors. I continued to monitor the pings as described above, but nothing else had changed: packets to the other devices were still being dropped periodically. I then took the SBS 2003 server and moved it from the switch to its own ethernet port on the TZ 210, and my network problems disappeared.
I'm totally confused here. While I'm grateful the problem is fixed, I want to know why. The ethernet ports the 2 servers are on and the LAN port on the TZ 210 are still on the same subnet, etc., and would still be in the same broadcast domain, right? I'm not sure what moving each server to its own physical port, while keeping it in the same LAN, actually fixed.
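To double-check the "same subnet" part of that question, here is a small Python sketch using the standard ipaddress module. The addresses are hypothetical placeholders, not my real ones. It only confirms that the addressing matches a single /24; it says nothing about whether the TZ 210 actually bridges those ports at layer 2, which depends on how the firewall's interfaces are configured.

```python
import ipaddress


def same_subnet(addresses, prefix=24):
    """Return True if every address falls in the same network."""
    networks = {
        ipaddress.ip_interface(f"{addr}/{prefix}").network
        for addr in addresses
    }
    return len(networks) == 1


# Hypothetical addresses for the firewall LAN port and the two servers.
hosts = ["192.168.1.1", "192.168.1.10", "192.168.1.11"]
print(same_subnet(hosts))
```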
On a side note I checked every port on the switch and had 0 collisions, 0 jabbers, 0 CRC or Align errors, etc.
Any insight into this would be appreciated.