Solved

Intermittent loss of network on HP Proliant ML350 G6

Posted on 2013-12-03
Last Modified: 2014-01-03
I'm baffled by this one - not sure if it is hardware, driver or OS issue.

The server is an ML350 G6 with SBS2011 installed.  This is connected via NIC1 (no NIC teaming) to an HP ProCurve 2510G-24 switch.

This system has been running just fine for nearly 3 years without any major issues.  Suddenly in the last couple of months an intermittent issue has come up where the server will lose all network connectivity for seemingly no reason and require a reboot.  

The server is not locked up or crashed (BSOD), mind you.  You can log in just fine from a console and do a graceful reboot.  During that time the iLO is still responsive and the NIC within Windows shows as online and connected.

Things I've tried/looked at:

1 - Windows Event logs - nothing is ever reported in any of the Windows logs around the time of connectivity loss, other than an occasional DNS resolution error (presumably because the network has dropped).  My RMM tool does log the loss of connectivity and status update failures, so I can get pretty close to when the issue is happening (within 30 sec +/-).

2 - HP IML - the Integrated Management Log on the iLO shows nothing.  It logs the power event for the reboot and that's it.

3 - Switch syslog, there are some excessive broadcasts on the network from a few of the clients that have chatty software installed, but no issues for the port that the server is plugged into (or the other port that I moved it to for testing)

4 - Windows is fully patched and current.

5 - HP SUM (System Update Manager) has been run every month to get all critical and recommended system firmware/bios/driver updates as needed so that is also current.

6 - I have scanned for rootkits/malware/viruses etc. numerous times using multiple tools from Sysinternals, GMER, MBAM, Sophos and ESET, and it always comes up clean.

I want to call HP or Microsoft, but I don't even have anything to give them to start debugging.  I cannot reproduce the issue, but it has happened on 11/3, 11/8, 11/15, and 12/3.
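One thing that could be handed to a vendor is the spacing of the outages themselves. The following is only a rough sketch, not something from the thread: it assumes the four dates fall in 2013 (inferred from the posting date), and the helper names `gaps_in_days` and `weekdays` are made up for illustration. It computes the days between consecutive outages and the weekday of each one, which may help rule a scheduled task or backup job in or out.

```python
from datetime import date

# Outage dates from the post. The year 2013 is an assumption based on
# the date the question was posted.
OUTAGES = [date(2013, 11, 3), date(2013, 11, 8), date(2013, 11, 15), date(2013, 12, 3)]

# Fixed English names so the result does not depend on the system locale.
# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def gaps_in_days(dates):
    """Return the number of days between each consecutive pair of dates."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def weekdays(dates):
    """Return the weekday name of each date, in chronological order."""
    return [DAY_NAMES[d.weekday()] for d in sorted(dates)]


if __name__ == "__main__":
    print("Days between outages:", gaps_in_days(OUTAGES))
    print("Weekdays:", weekdays(OUTAGES))
```

With only four data points this cannot prove anything, but an irregular spacing would at least argue against a fixed weekly or monthly trigger.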
Question by:DigiSec
6 Comments
 
LVL 59

Accepted Solution

by:
Cliff Galiher earned 750 total points
ID: 39694137
If a power cycle is the only thing that fixes it, my first guess is failing hardware. I'd probably start by disabling the NIC and adding a new NIC. You'll need to run the Fix My Network Wizard to get all the services bound to the new NIC...so as always...have a backup. But this is a relatively trivial thing to do. If you find that solves the problem, time to call HP.
 

Author Comment

by:DigiSec
ID: 39694149
That's a possibility.  For that matter I could switch to the unused NIC2 and rebind everything over there.  I believe they are physically separate controllers, not a shared controller with 2 ports.

To be fair, a reboot is the fastest, easiest way that I have been able to get my client back online - talking a non-technical person through logging into and restarting the server cleanly (really wish they would pop for the Advanced iLO license).

I can't reproduce it, so I haven't been onsite to try things like unplugging / disabling the NIC, or using Sysinternals to trace or even perfmon to look at current network utilization.  I suppose it is possible that something is literally locking the NIC - but I don't think it's a bottleneck issue because it is not momentary - it still requires a reboot.
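Since the fault can't be triggered on demand, one option is to leave a small probe running on another machine on the same LAN so that there is an independent record of exactly when the server drops off. This is a minimal sketch under stated assumptions, not a tested tool: the address `192.168.1.10`, the log file name, and the choice of TCP port 445 (SMB) as the thing to probe are all placeholders, and the function names are invented for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def outage_windows(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage windows.

    An outage still in progress at the last sample is reported with end=None.
    """
    windows = []
    start = None
    for stamp, reachable in samples:
        if not reachable and start is None:
            start = stamp
        elif reachable and start is not None:
            windows.append((start, stamp))
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def monitor(host, port, interval=10.0, log_path="server_probe.log"):
    """Probe forever, appending one line to log_path on every state change."""
    last = None
    while True:
        state = is_reachable(host, port)
        if state != last:
            label = "UP" if state else "DOWN"
            with open(log_path, "a") as log:
                log.write(f"{datetime.now().isoformat()} {host}:{port} {label}\n")
            last = state
        time.sleep(interval)


# Hypothetical usage; the address and port are placeholders:
# monitor("192.168.1.10", 445)
```

The DOWN timestamps from a log like this can then be lined up against the switch syslog and the Windows event logs, which is a narrower window than the RMM alerts alone.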
 
LVL 6

Assisted Solution

by:donnk
donnk earned 750 total points
ID: 39694757
Call HP and have them diagnose the hardware.

 

Author Comment

by:DigiSec
ID: 39695483
Yeah, I think that's the plan for an emergency outage tonight.  Going to have HP diagnose the HW and switch over to the second NIC at the same time.  This will be next to impossible to test though since it is intermittent and not reproducible by me.

I will award partial points - both are good answers.
 

Expert Comment

by:BitHammer
ID: 39753742
Has HP solved the problem? I have two servers with the same issue, running two different versions of Windows Server (2003, 2008). Both Proliant DL-308, one is G3, the other G4. Because of the timing of the events, I was thinking it was due to a Microsoft update. It started after an update occurred on both machines pretty close to the same time. I suppose it could be coincidental hardware failures, but it seems unlikely. It's a huge issue as one of them is our internal DNS server and it's going down daily.
 

Author Comment

by:DigiSec
ID: 39755607
Interestingly enough - no.  We swapped out the motherboard to replace the NICs per HP, but had the call yesterday that the "Server was down".  I could see via the iLO that the system was up and was able to gracefully reboot it via iLO, but it was inaccessible on the network.

I'm re-opening the case with HP now.