Solved

Network Seizes up for short periods of time

Posted on 2006-06-06
Last Modified: 2010-03-18
We have a small business of about 30 computers.  Almost all workstations are Windows XP.  3 servers (Windows 2000, Windows SBS, Windows 2003).  6 Dell PowerConnect 2624 switches.  1 SonicWall Pro 2040.
Pretty standard setup I do believe.

Leading up to the problem:
We had a Tripp Lite UPS go haywire and drop all the servers and switches.  It came back on after about 20 seconds, but the damage had been done.  The servers gave errors about not completing "writing to disk".  Finally, after moving all the servers off the Tripp Lite and multiple reboots, the servers were back online.  On the same weekend a new SonicWall was put in place, upgrading from a Pro to a 2040.

>>>>>>> The problem: <<<<<<<<<
The network now seizes up randomly.  Here is a ping done from my machine to a server:
C:\>ping -t 192.168.0.22

Pinging 192.168.0.22 with 32 bytes of data:

Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Request timed out.
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Request timed out.
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time=1263ms TTL=128
Reply from 192.168.0.22: bytes=32 time=1499ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128
Reply from 192.168.0.22: bytes=32 time=506ms TTL=128
Reply from 192.168.0.22: bytes=32 time<1ms TTL=128

Ping statistics for 192.168.0.22:
    Packets: Sent = 23, Received = 15, Lost = 8 (34% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 1499ms, Average = 217ms
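
For anyone comparing several captures like the one above, here is a minimal Python sketch that tallies a saved `ping -t` log. It assumes the Windows output format shown in this post; the function name `summarize_ping` and the returned keys are made up for illustration. Besides the usual loss figure it reports the longest run of consecutive timeouts, which is the number that tells you how long each "seize" lasts.

```python
import re

def summarize_ping(lines):
    """Summarize Windows-style `ping -t` output lines (hypothetical helper)."""
    sent = received = 0
    longest_outage = current_outage = 0
    times_ms = []
    for line in lines:
        line = line.strip()
        if line.startswith("Reply from"):
            sent += 1
            received += 1
            current_outage = 0  # a reply ends the current run of timeouts
            # "time<1ms" is counted as 0 ms, as in ping's own statistics.
            match = re.search(r"time([=<])(\d+)ms", line)
            if match:
                times_ms.append(0 if match.group(1) == "<" else int(match.group(2)))
        elif line.startswith("Request timed out"):
            sent += 1
            current_outage += 1
            longest_outage = max(longest_outage, current_outage)
    lost = sent - received
    return {
        "sent": sent,
        "received": received,
        "lost": lost,
        # Integer division truncates the percentage instead of rounding it.
        "loss_percent": 100 * lost // sent if sent else 0,
        "longest_outage": longest_outage,
        "max_ms": max(times_ms, default=0),
    }
```

Feeding it the lines of a capture (for example `summarize_ping(open("ping.log"))`, where `ping.log` is whatever file you redirected the output to) gives one dictionary per host, which makes it easier to see whether the dropouts on different targets line up.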

I know this isn't an easy one to troubleshoot because there are many factors, but I'd like some advice on tracking this down.  It's the middle of the week and I don't have much time to do "after hours" experimentation such as taking all machines offline and adding them back one at a time.  Eventually it'll come to that, but for now, what would you do to solve this problem?
Question by:g127404
6 Comments
 
LVL 4

Expert Comment

by:tomerlei
ID: 16844027
Does this happen only when you ping the server, or does it also happen when you ping any computer on your network?
 
LVL 4

Author Comment

by:g127404
ID: 16844103
Good question... any computer on the network.  Even going out to the internet.

Using multiple windows and doing simultaneous pings gives different results for each.  A ping to one computer on the network won't necessarily time out at the same time as a ping to another... but given enough time, each one will run into the same problem.
 
LVL 51

Assisted Solution

by:Keith Alabaster
Keith Alabaster earned 1000 total points
ID: 16844367
This is likely going to require a divide-and-conquer approach, i.e. dividing the network up into logical segments. With a UPS going haywire, as you put it, damage could have travelled through any part of the system that conducts electricity. This will include your switches, routers and any other devices.

I would suggest starting with the devices directly connected to the UPS. For example, if your server is connected to the UPS, shut it down, switch it off and ping other devices. Do the results become stable? If all seems fine, you have found your issue. If not, move on to another segment, and so on.
 
LVL 4

Accepted Solution

by:tomerlei
tomerlei earned 1000 total points
ID: 16844446
From my past experience, this happens a lot with defective switches, especially smart switches.

You said you have only 6 switches in your network. Try shutting down one of them at a time and letting the other 5 work, until the problem is gone.
It could also be that a defective NIC in your network is causing the problem. If that's the case, you will at least know which switch it is connected to, and that will cut your search down by 5/6.
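
The same elimination idea can be written down as a small Python sketch. It is only an illustration of the procedure, run in the "add them back one at a time" direction: the switch names and the `network_is_healthy` callback are hypothetical stand-ins for physically powering switches on and watching a ping.

```python
def find_faulty_switch(switches, network_is_healthy):
    """Bring switches back one at a time; return the first one that breaks things.

    `network_is_healthy(active)` is a hypothetical callback that reports
    whether pings stay stable with only the switches in `active` powered on.
    Returns None if the network stays healthy with every switch back online.
    """
    active = []
    for switch in switches:
        active.append(switch)
        if not network_is_healthy(list(active)):
            return switch
    return None
```

With 6 switches this takes at most 6 checks, and once it returns a name you only need to inspect the ports and attached NICs on that one switch.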

Please let me know if that helped.
 
LVL 4

Author Comment

by:g127404
ID: 16845338
Well, it got to the point where we had to do just that...
take down the switches and add them back one at a time. (even though I didn't want to take down the network in the middle of a work day)
Narrowing it down to one switch we found 2 ports were in a constant state of chatter.
Following one of them, it looped around and plugged back into the port right next to it.  It was a LOOPBACK!

Yuck. But yes, solved.  The network is now happy that it doesn't get stuck in a loop.
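
If you keep a list of which switch each patch cable connects, a loop like this one can be spotted on paper before it takes the network down. Below is a minimal Python sketch, assuming a hand-maintained cable list; the function name `find_loop_links` and the switch names are invented for illustration. It uses a union-find structure and flags any cable whose two ends are already connected, which includes a cable plugged into two ports of the same switch.

```python
def find_loop_links(links):
    """Return the cables that close a loop in a switch topology.

    `links` is a list of (switch_a, switch_b) pairs, one per patch cable.
    A cable from a switch back to itself counts as a loop.
    """
    parent = {}

    def find(node):
        # Follow parent pointers to the group's root, shortening the path as we go.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    loops = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            loops.append((a, b))  # both ends already connected: this cable is redundant
        else:
            parent[root_a] = root_b  # merge the two groups
    return loops
```

This only checks the documented cabling, so it will not catch a cable nobody wrote down; for that you still have to trace the wires the way it was done here.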

Thanks both for your suggestions.
 
LVL 51

Expert Comment

by:Keith Alabaster
ID: 16845526
Excellent, thanks :)

Question has a verified solution.
