Server Shuts down unexpectedly - no alerts in event log

AutomatedIT
We have a Dell T300 Server with Windows 2003 x64 and Active Directory/Exchange 2007 installed.

Everything has worked fine since we installed the server several months ago, but recently it has been shutting down for no apparent reason.

We have the Dell Server admin tools installed and they do not show any alters.  Also, there is nothing in the event log about a shut down.

Any ideas on how I can troubleshoot this one?  I am out of ideas after the event logs and Dell tools.

Thanks!
Commented:
Lots of things can cause a system shut down. Question is, does it reboot to the logon screen, or just shut down.

If just shutting down, without any reason, it might be a heat issue. I have seen heat on the CPU shut down the server without event logs.

If shutting down and rebooting, it is probably an update that is set to restart the server after it installs. Look again in the event logs for WSUS information that states a shutdown was initiated after an update was performed.

I have also seen upgrading to the latest BIOS version stop this issue.

Another thing you can try is to prevent the system from restarting in the event of an error, so that it stays on the BSOD. To do so, right-click "My Computer" and select "Properties", go to the Advanced tab, click Settings under Startup and Recovery, and clear "Automatically restart" under System failure.
Commented:
Does it do a clean shutdown or does it blue screen?
If it blue screens, have you updated any drivers lately? Also, have you checked the dump logs?
Does it reboot or just shut down? When you log back into it, does it say it shut down unexpectedly?
PowerEdgeTech, IT Consultant
Top Expert 2010
Commented:
Not sure what you mean by "alters" ... errors?  Did you install OpenManage Server Administrator (OMSA) and check the Hardware/ESM Log?  Is the light on the front of the system amber or blue?  You say "shut down" ... does it turn completely off, or does it reboot?
PowerEdgeTech, IT Consultant
Top Expert 2010
Commented:
Also ... just to be on the safe side - bypass your UPS (or check it) to make sure it is not an external power issue.
Commented:
I think we are facing one of the following problems:
1. Heat problem: servers shut down when they reach a particular temperature.
2. Power problem: check the server's power supply unit for defects, then, if it is connected to a UPS or other power source, check that as well.
3. If it reboots, then it is probably a software problem, but in that case the error should be logged.
Rojosho, RTCC-III Level-2 Support
Commented:
Hello AutomatedIT,

This sure does appear to be either heat related or environment related, but it is too early to say.  

How long does the system stay up?
  . Minutes, hours, days?

Can you boot into SAFE MODE and see how long the system stays up?  If it stays up there, then you could be looking at a virus or a corrupted application.

The questions about whether the system is 'rebooting' or 'shutting down' are key: the answer tells us what the 'reaction' is and will help us determine a possible cause.

One way to isolate the OS is to boot an OS from your CD.  Bart's PE CD is an actual Windows OS that you can boot from the CD.  Allow it to run for twice the time it is taking your current OS to crash.  This will at least help isolate hw from sw.  Here is Bart's URL:
            http://www.nu2.nu/pebuilder/

If the system will stay up long enough, get a copy of the System and Application Event Logs.  
Event ID 6005 (the Event Log service starting) is logged in the System log at every boot.  Looking at the System event logs, can you see a pattern in the time between failures?
Do you know how long it takes for the system to shut down once it is rebooted?
Looking at the times of the 6005 events, do you see any corresponding events in the Application logs?  You are looking for something that occurs near the same time as the shutdowns.
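
The pattern check described above could be sketched roughly as follows in Python. This is only a minimal sketch: it assumes the System and Application logs have already been exported and parsed into (timestamp, value) pairs, and the sample data, the "BackupJob" source name, the 10-minute window, and all function names are made up for illustration.

```python
from datetime import datetime, timedelta

BOOT_EVENT_ID = 6005  # Event Log service started; logged at each boot


def boot_times(system_events):
    """Return sorted timestamps of boot events from (timestamp, event_id) pairs."""
    return sorted(ts for ts, event_id in system_events if event_id == BOOT_EVENT_ID)


def uptime_gaps(boots):
    """Time between consecutive boots (an upper bound on each run's uptime)."""
    return [later - earlier for earlier, later in zip(boots, boots[1:])]


def events_before_boot(boot, app_events, window=timedelta(minutes=10)):
    """Application events (timestamp, source) logged in the window before a boot."""
    return [(ts, src) for ts, src in app_events if boot - window <= ts <= boot]


# Made-up sample data, for illustration only.
system_log = [
    (datetime(2009, 3, 1, 8, 0), 6005),
    (datetime(2009, 3, 1, 14, 5), 7036),
    (datetime(2009, 3, 2, 8, 2), 6005),
    (datetime(2009, 3, 3, 8, 1), 6005),
]
application_log = [
    (datetime(2009, 3, 2, 7, 58), "BackupJob"),
    (datetime(2009, 3, 2, 12, 0), "MSExchangeIS"),
    (datetime(2009, 3, 3, 7, 55), "BackupJob"),
]

boots = boot_times(system_log)
gaps = uptime_gaps(boots)
for boot in boots:
    print(boot, events_before_boot(boot, application_log))
```

If the gaps cluster around a fixed interval, or the same Application source keeps showing up just before each boot, that points toward software or a scheduled task; irregular gaps with nothing logged beforehand lean toward hardware or power.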

Is it possible that the system is doing an 'ASR'?
  . ASR is a forced reboot when the CPU is in a 'hang' state.  If you have all of the components needed for an ASR to occur, you should see an entry in the System Event Logs.

As noted above, configuring the system to write a memory dump (FULL would be my suggestion) will allow you to see what the Stop Code is and if needed, something that Microsoft can look at.
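
For anyone who would rather script these recovery settings than click through the dialogs, here is a rough Python sketch. It assumes a Python 3 interpreter is available on the server and that it is run with administrator rights; the constant and function names are my own. The two values live under the CrashControl registry key, and a reboot is needed before they take effect.

```python
import sys

CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Desired values (both REG_DWORD).
CRASH_CONTROL_SETTINGS = {
    "AutoReboot": 0,        # 0 = stay on the blue screen instead of restarting
    "CrashDumpEnabled": 1,  # 1 = complete memory dump (2 = kernel, 3 = small)
}


def apply_crash_control(settings=CRASH_CONTROL_SETTINGS):
    """Write the settings to the registry. Windows only; needs admin rights."""
    if sys.platform != "win32":
        raise RuntimeError("registry changes can only be applied on Windows")
    import winreg  # imported here so the module still loads on other platforms

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY, 0,
                        winreg.KEY_SET_VALUE) as key:
        for name, value in settings.items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)
```

Calling apply_crash_control() on the server writes both values; nothing is changed until that call is made.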

Hope this helps,

RojoSho
The shutdowns could be caused by a faulty motherboard, RAM, or PSU.
This kind of failure is not recorded in the log.
First you need to check all the hardware components and perform a burn-in test using UBCD:
http://www.ultimatebootcd.com/

Author

Commented:
When it reboots it comes back up to the log on screen.

The problem went away for about 5 or 6 days and has now returned as of Sunday.  

Working through the suggestions.....

Author

Commented:
It turned out to be a bad power supply.  It has run well for a week since we changed it out.

Thank you for the help.
