Solved

Proliant DL380p Gen8 2012 standard server randomly hangs

Posted on 2014-01-07
Last Modified: 2014-01-19
Early last year, I installed a new ProLiant server to replace our DC. The new server is running Windows Server 2012 Standard. I am now having an issue where the server hangs randomly for a couple of minutes and then is suddenly fine. While it hangs, users can't access SMB shares or the SpiceWorks helpdesk I have set up.

When I first set this server up, I remember this was happening, but I don't remember what I tried to do to fix it. The issue seemed to go away on its own. Here I am now, several months later, and it is suddenly happening again. So it either stopped and then started again, or it has been happening randomly and I am just now noticing it. It would be easy to miss, because users may not report a hang that lasts only a couple of minutes and then clears.

I talked with HP and they recommended that I install the SPP (Service Pack for ProLiant), which I am going to try tomorrow. I confirmed the BIOS is up to date.

I found these in the file service log.

BEDC01      2012      Warning      srv      System      1/7/2014 1:23:16 PM
While transmitting or receiving data, the server encountered a network error. Occasional errors are expected, but large amounts of these indicate a possible error in your network configuration.  The error status code is contained within the returned data (formatted as Words) and may point you towards the problem.

BEDC01      30623      Warning      Microsoft-Windows-SMBClient      Microsoft-Windows-SMBClient/Operational      1/7/2014 1:22:26 PM
Connection to share \\servername\IPC$ was lost. Status 0xC00000B5

BEDC01      30621      Warning      Microsoft-Windows-SMBClient      Microsoft-Windows-SMBClient/Operational      1/7/2014 1:22:26 PM
Session to server \\servername was lost Status 0xC00000B5
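
The status value in both SMBClient events is an NTSTATUS code, and 0xC00000B5 corresponds to STATUS_IO_TIMEOUT, which fits a server that stopped answering for a while. As a minimal sketch (the function name and the one-entry lookup table are illustrative assumptions, not a complete list), the bit fields of such a value can be split out like this:

```python
# Sketch: split an NTSTATUS value into its bit fields.
# KNOWN_STATUS is deliberately tiny; it only covers the code seen in this thread.
KNOWN_STATUS = {0xC00000B5: "STATUS_IO_TIMEOUT"}
SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_ntstatus(value):
    """Return the severity, facility, code and (if known) symbolic name."""
    return {
        "severity": SEVERITY[(value >> 30) & 0x3],  # top two bits
        "facility": (value >> 16) & 0xFFF,          # 12-bit facility field
        "code": value & 0xFFFF,                     # low 16 bits
        "name": KNOWN_STATUS.get(value, "unknown"),
    }


if __name__ == "__main__":
    print(decode_ntstatus(0xC00000B5))
```

An "error" severity with a timeout name points at the remote end not responding in time rather than at a permissions or naming problem.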

I found several of the events below around the time the users reported the issue. They are for all sorts of services. Similar events appear to be in groups together randomly.

Log Name:      System
Source:        Service Control Manager
Date:          1/7/2014 1:05:16 PM
Event ID:      7011
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:     server
Description:
A timeout (60000 milliseconds) was reached while waiting for a transaction response from the Spooler service.
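
Since the same 7011 text repeats for many different services, it can help to pull the timeout and service name out of each description when reviewing an exported log. The following is a rough sketch only: the pattern is built from the message quoted above, and the function name is a hypothetical helper, not part of any Windows tooling.

```python
import re

# Pattern derived from the event 7011 description text quoted in this post.
TIMEOUT_RE = re.compile(
    r"A timeout \((\d+) milliseconds\) was reached while waiting for a "
    r"transaction response from the (.+?) service\."
)


def parse_7011(description):
    """Return (timeout_ms, service_name), or None if the text does not match."""
    match = TIMEOUT_RE.search(description)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    sample = ("A timeout (60000 milliseconds) was reached while waiting for a "
              "transaction response from the Spooler service.")
    print(parse_7011(sample))
```

Listing the service names this way makes it easier to see whether one service is always involved or whether, as here, unrelated services time out together.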


I thought the backups might have been the cause, since they sometimes run into the day, but I just confirmed the issue happened while no backups were running.

I appreciate any feedback on this.

Thanks,
Justin
Question by:JustinGSEIWI
9 Comments
 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 39763470
It seems from what you say that the server isn't actually hanging, more that it's lost network connectivity for a little while. This could be for any number of reasons, so here are a few to start with:


A dodgy cable or RJ-45 connector between the server and the patch panel or switch.

Try a different port on the patch panel and/or the switch.

A problem with the NIC driver; if it isn't the latest one then update it so it is.

The TOE (TCP Offload Engine) setting for the NIC is wrong for the environment. If it's on, turn it off, and if it's off, turn it on, and see if things improve.


These suggestions all have the twin advantages of being easy and cheap to try.
 

Author Comment

by:JustinGSEIWI
ID: 39763525
Thanks for the suggestions. I am going to install the SPP tomorrow night. I am also now monitoring the network connectivity to the server. The next time it hangs, I will check to see if I can ping it. I will also check the network logs I am collecting for the server connection. Using the SPP, the NIC firmware/driver will update if it is out of date. I will try your other suggestions if I notice any network issues or if the SPP does not fix the issue.

Justin
 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 39764556
Next time the problem occurs, log on to the server locally if you can and see if you can ping the localhost address (127.0.0.1) as well as other devices on the LAN. If the localhost ping fails then it's likely that the problem is on the server itself.
 

Author Comment

by:JustinGSEIWI
ID: 39765669
The issue just happened again this morning, but I didn't see your post about the localhost until just now. I was able to confirm that I could ping the server during the issue. I also checked my logs, and the ping/availability/packet loss was fine. However, this time the server hung so badly that I had to reboot it. That is the first time I had to do that.
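
Because ping kept working while the shares were unreachable, a check at the TCP level may be a useful addition to ICMP monitoring. Below is a minimal sketch that assumes the standard SMB port (445); the function name and timeout are illustrative choices. Note that a successful connect only shows the port is accepting connections, not that the file service is actually answering requests.

```python
import socket


def tcp_port_open(host, port=445, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "BEDC01" is the computer name shown in the event log entries above.
    print(tcp_port_open("BEDC01"))
```

Run on a schedule from another machine, a check like this could record whether the hang affects new connections or only requests on sessions that are already open.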

I am normally not on site, but I am today and tomorrow. I logged onto the server at the console while it was hanging. When I logged in, it took over my remote session and was still hanging. I tried to shut it down, but it was taking longer than I could allow during production hours, so I had to hard shut it down and turn it back on.

During the issue, the CPU averaged only 20%. I did find that SpiceWorks was not responding. I had disabled it last night, yet when I came in this morning it was showing as not responding for some reason, even though I had shut it down. That was the only service I saw not responding. I e-mailed SW support to see if there is any known issue.

I am also e-mailing Trend Micro to ask if there is any kind of known issue with Windows Server 2012.

The iLO is still reporting that all the hardware is fine. I am wondering if I should still do a hardware test tonight, though? If the memory was bad, the server should have reported it right away, right?

Thanks,

Justin
 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 39766139
Have a look at the event logs to see if errors are reported at the time the problem occurs. You should be able to find them quite easily by looking for the times that they happened, and latterly when you had to give it a hard reset.
 

Author Comment

by:JustinGSEIWI
ID: 39766258
When the hang happens, I see several of these events.

Error      1/8/2014 10:26:57 AM      Service Control Manager      7000      None
Error      1/8/2014 10:26:57 AM      Service Control Manager      7011      None
Error      1/8/2014 10:26:57 AM      Service Control Manager      7011      None
Error      1/8/2014 10:25:57 AM      Service Control Manager      7011      None
Error      1/8/2014 10:25:57 AM      Service Control Manager      7011      None

Each one is for a different service. The details are the same for each event; only the service listed differs. An example event is below.

A timeout (60000 milliseconds) was reached while waiting for a transaction response from the DHCPServer service.

Basically, this is just further proof the server is hanging.
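
To see how tightly these errors cluster, the exported summary lines can be counted per minute. This is a sketch under stated assumptions: it expects the column layout shown above (level, timestamp, source, event ID, category, separated by tabs or runs of spaces), US-style timestamps, and an English locale for the AM/PM marker. The function name is a hypothetical helper.

```python
import re
from collections import Counter
from datetime import datetime


def count_events_per_minute(lines, event_id="7011"):
    """Count events with the given ID in each one-minute bucket."""
    counts = Counter()
    for line in lines:
        # Columns are separated by a tab or by two or more whitespace characters.
        fields = re.split(r"\s{2,}|\t", line.strip())
        if len(fields) < 4 or fields[3] != event_id:
            continue
        stamp = datetime.strptime(fields[1], "%m/%d/%Y %I:%M:%S %p")
        counts[stamp.replace(second=0)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Error      1/8/2014 10:26:57 AM      Service Control Manager      7000      None",
        "Error      1/8/2014 10:26:57 AM      Service Control Manager      7011      None",
        "Error      1/8/2014 10:25:57 AM      Service Control Manager      7011      None",
    ]
    for minute, count in sorted(count_events_per_minute(sample).items()):
        print(minute, count)
```

Bursts of timeouts exactly 60 seconds apart across unrelated services would be consistent with the whole machine stalling rather than a single service misbehaving.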
 
LVL 15

Expert Comment

by:Perarduaadastra
ID: 39766385
This may shed a little more light:

http://support.microsoft.com/kb/922918

However, as the default timeout value seems to be 30000 milliseconds, it may be that when you had this problem a while ago you increased the value to 60000. I have seen a recommendation for 120000, but I can't remember which Windows server version that was for.
In any case, increasing this value may address the symptom but doesn't bring the cause any closer.

I did come across this recently, which may help:

http://www.eversity.nl/blog/2012/08/a-timeout-30000-milliseconds-was-reached-while-waiting-for-a-transaction-response-from-the-name-of-service-service/

... as it seems to resemble your situation, but only you can tell for sure.
 

Accepted Solution

by:JustinGSEIWI
ID: 39780551
I installed the Service Pack for ProLiant and that appears to have fixed the issue. Last week it was happening each day, and the issue hasn't happened since I installed all of the updates. A link is below.

http://h17007.www1.hp.com/us/en/enterprise/servers/products/service_pack/spp/index.aspx

Thanks,

Justin
 

Author Closing Comment

by:JustinGSEIWI
ID: 39791948
My suggested fix in my initial post appears to have fixed the issue. Thanks for the replies!
