gwa60060

asked on

Packet storm causes a server to restart and be unable to boot

It was Friday and I had no client visits scheduled, so I could get caught up on office work. Unfortunately, it was not to be.

The client’s network administrator called at 0830 saying his entire network was down, as well as the Internet, and that it all started when he heard a server restart (DL360 G8, Server 2012 R2). Some initial testing over the phone showed that two servers connected to the same switch could not ping each other. A reboot of that switch (HP 2910) did not resolve the issue. With the client’s network administrator in full panic mode, I agreed to go onsite to investigate.

Here’s what I found.
•      The server in question was in a reboot loop: BIOS, tries to boot into Windows, goes into Windows Recovery Mode, reboots to BIOS
•      Activity lights on all switches in the MDF (there are also 2 IDFs) were on solid
•      Unable to ping anything from anywhere, not even the switch IPs
•      Their other half dozen servers were all up but unable to communicate


A quick summary of the network

Internet > Sonicwall NSA3600 > HP2910 Core Switch (Default Gateway and Router) >
    > IDF1 via Fiber > HP2910s > PCs, VOIP Phones, etc.
    > IDF2 via Fiber > HP2910s > PCs, VOIP Phones
    > HP2910 (x4) > PCs, VOIP Phones, etc.


There was clearly a network packet storm going on, so all uplinks from the core switch to the IDFs and other switches were removed and the core switch was restarted. Activity on the core switch returned to normal and hosts connected to it were able to ping. Switches and hosts were then brought back online until the port that caused the loop was found (a VOIP phone with 2 cables, BOTH connected to live data ports).

We then turned our attention to the DL360, which had been shut down pending the fix for the network loop. The server was restarted with no network connections and I fully expected to see it have the same boot issues as before, but no, it started completely normally. A look into the logs showed DNS errors trying to connect to the LAN and domain (due to the packet storm) and a restart initiated by client(!). There were two instances of this: 1, the one the network admin heard, and 2, one after I did a hard restart of the system.

So the question is: how can a network loop and the resulting packet storm cause a server to restart and then be unable to boot into Windows? The server has 3 active connections: iLO, the server itself as a Hyper-V host, and one for the Hyper-V guest (2012 R2 DC).
(NOTE: Switch Configuration and STP are being addressed, the issue I’m trying to understand is about the Server OS and the packet storm)
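
For anyone wondering why a single miscabled phone can take down the whole LAN, here is a rough Python sketch of how one broadcast frame behaves with and without a loop when nothing (such as STP) blocks the redundant path. Everything in it is an assumption made for illustration: the device names, the hypothetical function frames_in_flight, the idea that the phone bridges its two ports, and the simplified hop-by-hop flooding model. It is not a model of the actual HP 2910 configuration.

```python
from collections import defaultdict


def frames_in_flight(links, bridges, source, hops):
    """Count the copies of one broadcast frame in flight after each hop.

    links   -- list of (device_a, device_b) cables; a repeated pair is a second cable
    bridges -- devices that flood frames; every other device just absorbs them
    source  -- device that sends the single initial broadcast
    hops    -- number of forwarding steps to simulate
    """
    # Map each device to the (link_id, neighbour) pairs it is cabled to.
    ports = defaultdict(list)
    for link_id, (a, b) in enumerate(links):
        ports[a].append((link_id, b))
        ports[b].append((link_id, a))

    # A frame in flight is (link it travels on, device it is about to reach).
    in_flight = list(ports[source])
    history = [len(in_flight)]
    for _ in range(hops):
        next_flight = []
        for arrived_on, device in in_flight:
            if device not in bridges:
                continue  # end hosts consume the broadcast
            # No loop protection in this model: flood out of every port
            # except the one the frame arrived on.
            for link_id, peer in ports[device]:
                if link_id != arrived_on:
                    next_flight.append((link_id, peer))
        in_flight = next_flight
        history.append(len(in_flight))
    return history


# Assumed toy topology: the phone is treated as a small two-port bridge.
bridges = {"core", "idf1", "phone"}
tree = [("pc", "core"), ("core", "idf1"), ("idf1", "phone"), ("core", "server")]
loop = tree + [("idf1", "phone")]  # second cable from the phone to a live port

print(frames_in_flight(tree, bridges, "pc", 4))  # [1, 2, 1, 0, 0]
print(frames_in_flight(loop, bridges, "pc", 8))  # never drops back to 0
```

In the loop-free case the broadcast dies out after a few hops. With the second cable in place, copies keep circulating in both directions and are re-flooded to every host port on each pass, and that is from just one frame. Real traffic adds new broadcasts constantly, which would be consistent with the solid activity lights and the failed pings described above.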
ASKER CERTIFIED SOLUTION
Bembi
gwa60060

ASKER

Thanks for the comments. Hard to say what the root cause was without actually trying to recreate the event.
If you have support on the unit, I would open a ticket and troubleshoot now, in case this returns after support expires. Most vendors will not cover software failures, so if they can't find a hardware issue they may ask you to reload the OS.

Is the environment cool and low in dust?

I would even swap out the power cord if it has been moved or reused a lot. Test the patch cable to the switch for any wire damage or shorts.

If there were no configuration changes at the time, you are better off looking at the physical level.