Solved

DL380 Random Reboot after Win2003 SP2 is Applied

Posted on 2009-04-07
Last Modified: 2012-05-06
Almost every morning, three of my DL380 servers reboot. This didn't start until after we applied Windows 2003 SP2. I have other DL380s still running only SP1 that are rock solid. There is little to no information in the Event Log on the servers before the reboots happen. (In fact, I don't think they are rebooting cleanly, but rather blue-screening and coming back up.) The only thing I see in the Event Log is Event ID 6008: "The previous system shutdown at 5:27:59 AM on 4/7/2009 was unexpected."
This has been happening on three servers since SP2 was applied. One of them has an upgraded CPU and memory, so I don't think this is hardware related.
I've also tried the "disable RSS and TCP Offload" workaround for the NICs that some people have suggested, and that didn't resolve the problem. All three servers have the most recent BIOS firmware; one is running the newest HP Network Configuration Utility and the other two are using NCU 7. Two of the servers have a single 2.8 GHz CPU and the third is 3.06 GHz. All of the servers have 4 GB of memory, but only one has newly replaced memory.
All of these servers are production systems and I need this resolved ASAP. I would rather not roll back Windows 2003 SP2 because I don't know how badly that would affect these boxes. I know this is a widely known issue and that HP isn't going to give me much support. Does anyone have any suggestions for a workaround? Please help.
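One way to check whether the unexpected shutdowns really cluster at the same time each morning (which might point at a scheduled job such as a backup or scan) is to pull the timestamps out of the Event ID 6008 messages. This is only a minimal Python sketch: it assumes the messages have been exported as plain text in the US-style format quoted above, and the function name is hypothetical.

```python
import re
from datetime import datetime

# Matches the Event ID 6008 message text quoted in the question.
# Assumes US-style dates (month/day/year), as in this thread.
PATTERN = re.compile(
    r"The previous system shutdown at (\d{1,2}:\d{2}:\d{2} [AP]M) "
    r"on (\d{1,2}/\d{1,2}/\d{4}) was unexpected"
)


def unexpected_shutdown_times(messages):
    """Return a datetime for every message that matches the 6008 text."""
    times = []
    for message in messages:
        match = PATTERN.search(message)
        if match:
            stamp = f"{match.group(1)} {match.group(2)}"
            times.append(datetime.strptime(stamp, "%I:%M:%S %p %m/%d/%Y"))
    return times


if __name__ == "__main__":
    exported = [
        "The previous system shutdown at 5:27:59 AM on 4/7/2009 was unexpected.",
    ]
    for when in unexpected_shutdown_times(exported):
        print(when.isoformat())
```

If the times from several days fall within a few minutes of each other, it is worth checking what is scheduled on the servers around then.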
Question by:Infinityinfo
5 Comments
 
LVL 2

Expert Comment

by:potva03
ID: 24090968
Which generation of servers are these, and which version of Windows is installed?

It's always recommended to update the PSPs and firmware before applying OS service packs.
Is it rebooting instantaneously?
Are there any ASR or POST errors?

Try updating the PSPs and firmware to the latest version supported by HP.

If updating the firmware does not resolve the issue, then roll back SP2, install the PSP and then the drivers. Don't install the latest PSP; install 8.15.

 
LVL 4

Accepted Solution

by:
madzanta earned 500 total points
ID: 24104960
I would suggest you begin by updating your servers using the PSP (ProLiant Support Pack).
Once that is done, you can deactivate ASR and enable a complete memory dump.
Then the next time your server crashes it will generate a dump file, which can be very useful
when troubleshooting BSODs.


Upgrade server using PSP
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?lang=en&cc=us&prodTypeId=15351&prodSeriesId=316529&prodNameId=3288130&swEnvOID=1005&swLang=8&mode=2&taskId=135&swItem=MTX-e397839de9eb40508728fb40ff 

Deactivate ASR (Automatic Server Recovery) from the BIOS and/or the HP System Management Homepage (SMH).

Make sure your pagefile is large enough for the dump (the amount of RAM you have + 1 MB)
Right click My Computer -> Properties -> Advanced -> Performance Settings -> Advanced -> Virtual memory Change -> Enter desired value (the MS recommendation of 1.5 times your RAM would be fine here)
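The sizing rule above is simple arithmetic. A small Python sketch, using the 4 GB figure from the question; the function name is hypothetical:

```python
def pagefile_sizes_mb(ram_mb):
    """Return (minimum, recommended) pagefile sizes in MB for a complete dump."""
    minimum = ram_mb + 1           # RAM + 1 MB, the floor for a complete dump
    recommended = ram_mb * 3 // 2  # 1.5 x RAM, the MS recommendation cited above
    return minimum, recommended


if __name__ == "__main__":
    # The servers in this thread have 4 GB (4096 MB) of memory.
    print(pagefile_sizes_mb(4096))  # prints (4097, 6144)
```

For a complete dump, Microsoft's guidance also expects that pagefile to be on the boot volume, so check there is enough free disk space for both the pagefile and the resulting dump file.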

Activate full dump
Right click My Computer -> Properties -> Advanced -> Startup and Recovery Settings -> under "Write debugging information" choose Complete memory dump
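With three servers to configure, the same setting can be applied from a command prompt instead of the GUI. The dump type is stored in the CrashDumpEnabled value under the CrashControl registry key (0 = none, 1 = complete, 2 = kernel, 3 = small). The Python sketch below only builds the reg.exe command line and does not run it; the helper name is hypothetical, and the change should not be assumed to take effect until after a reboot.

```python
# Registry location of the Windows crash dump settings.
CRASH_CONTROL_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\CrashControl"

# CrashDumpEnabled values and the dump type each one selects.
DUMP_TYPES = {"none": 0, "complete": 1, "kernel": 2, "small": 3}


def crash_dump_command(dump_type="complete"):
    """Build (but do not run) a reg.exe command that sets the dump type."""
    return [
        "reg", "add", CRASH_CONTROL_KEY,
        "/v", "CrashDumpEnabled",
        "/t", "REG_DWORD",
        "/d", str(DUMP_TYPES[dump_type]),
        "/f",  # overwrite the existing value without prompting
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it on a server.
    print(" ".join(crash_dump_command()))
```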

Download and install the Debugging Tools for Windows from Microsoft
http://www.microsoft.com/whdc/devtools/debugging/installx86.mspx


Now, the next time your server BSODs it will generate a dump file called memory.dmp, which will be placed in your Windows directory.
After the crash, restart your server and launch WinDbg (part of the debugging tools you installed earlier).
Go to File -> Symbol File Path and enter srv*c:\temp\symbols*http://msdl.microsoft.com/download/symbols
Then go to File -> Open Crash Dump, open memory.dmp, and once it has loaded type !analyze -v at the command prompt at the bottom of the window

When the analysis is done, look for PROCESS_NAME and/or IMAGE_NAME in the output and see what it says there.
Hopefully you will find the executable of some software or the driver that was involved in the crash. Now you can start
working with the software or hardware that might be causing your problems.
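If you save the !analyze -v output from each server to a text file, the interesting fields can be pulled out and compared across the three machines. A minimal Python sketch follows; the sample text is a made-up excerpt (example.sys is not a real driver) and the function name is hypothetical.

```python
import re

# Fields worth comparing between crashes.
FIELDS = ("PROCESS_NAME", "IMAGE_NAME", "MODULE_NAME")


def extract_fields(analyze_output):
    """Return the first value seen for each field of interest."""
    found = {}
    for line in analyze_output.splitlines():
        match = re.match(r"\s*([A-Z_]+):\s+(\S.*)$", line)
        if match and match.group(1) in FIELDS:
            found.setdefault(match.group(1), match.group(2).strip())
    return found


if __name__ == "__main__":
    # Hypothetical excerpt, for illustration only.
    sample = (
        "PROCESS_NAME:  System\n"
        "MODULE_NAME: example\n"
        "IMAGE_NAME:  example.sys\n"
    )
    print(extract_fields(sample))
```

If all three servers report the same IMAGE_NAME, that driver is the first thing to update or roll back.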


Good luck and I hope it helps.
 
LVL 3

Expert Comment

by:SimonL-UK
ID: 24179371
There is a known issue with the Smart Array 5/6 drivers when used in conjunction with the Microsoft Storport driver.
You need to update the Smart Array driver (available from HP) and the Storport driver from Microsoft (http://support.microsoft.com/kb/932755).

HTH
 
LVL 1

Author Comment

by:Infinityinfo
ID: 24193639
Thanks so much for the help and suggestions, fellas. I am in the process of trying all of these recommendations. This is happening on three of my DL380 G3 servers, and after updating the PSP it seems to be resolved, but I will wait a couple more weeks before I close this thread because of what others have experienced with this issue. Some have said it sometimes takes several days for the servers to start rebooting again, so I just want to make sure I have the issue nailed down. Again, thanks so much.
 
LVL 1

Author Closing Comment

by:Infinityinfo
ID: 31567685
Thanks so much for the help. I believe we are out of the woods. The servers haven't rebooted in almost a month now, and updating the PSP, firmware and drivers for nearly everything on those DL380s seems to have resolved the issue. Much appreciated.