Solved

Dell Server rebooting unexpectedly

Posted on 2009-07-14
Last Modified: 2013-12-04
I have a small company with a one-year-old Dell server on a 1400 W UPS that will intermittently, approximately once a month, reboot unexpectedly. There is no hint in any of the Event Viewer categories, just the Unexpected Reboot error as the server comes back up.

I'm guessing this is a hardware issue, probably the server motherboard, but there are many other possibilities: Dual Power Supply +/or Controller, UPS output failing even with normal constant AC power on its input, something shorting out and dragging the power supply way down momentarily, etc. There is nothing unusual connected to the server, just an LCD monitor, wired keyboard and wired optical mouse, one Ethernet connection, and one External USB Hard Drive for direct backup. Only the Server, Monitor and EHD are on the UPS.

I am also wondering if this could be some kind of software issue: OS, UPS software, patch, or virus, although Bit Defender has been pretty good.

Any ideas?
Question by:HammettG
12 Comments
 
LVL 11

Expert Comment

by:g000se
Hello,

When the server reboots, does it happen on the same day each month?

Have you tried using a different power outlet and power cord?
 

Author Comment

by:HammettG
No g000se, the server rebooted 6/18 and 7/13, different dates, days of the week and different hours 11:37am and 9:18am.
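
To put numbers on that, here is a minimal Python sketch that works out the weekday of each reboot and the gap between them. It assumes both dates fall in 2009 (the year this question was posted); the `reboots` list and the function names are illustrative only.

```python
from datetime import datetime

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

# Reboot times from the comment above; the year 2009 is an assumption
# taken from the posting date of the question.
reboots = [
    datetime(2009, 6, 18, 11, 37),
    datetime(2009, 7, 13, 9, 18),
]

def describe(times):
    """Return (weekday name, HH:MM) for each reboot time."""
    return [(DAY_NAMES[t.weekday()], t.strftime("%H:%M")) for t in times]

def gaps_in_days(times):
    """Return the whole-day gap between consecutive reboot dates."""
    ordered = sorted(times)
    return [(b.date() - a.date()).days for a, b in zip(ordered, ordered[1:])]

if __name__ == "__main__":
    for day, clock in describe(reboots):
        print(day, clock)
    print("gap in days:", gaps_in_days(reboots))
```

Under that assumption the two reboots fall on a Thursday and a Monday, 25 days apart, which does not look like a fixed weekly or monthly schedule.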
 
LVL 23

Expert Comment

by:ComputerTechie
Have you tried uninstalling the APC software, rebooting, and reinstalling the software?

CT
 
LVL 13

Expert Comment

by:jaynir
Hi,
You are right, HammettG, it could be a hardware issue too.

The first thing I would do to troubleshoot this is disable all startup items using the msconfig utility. In the same utility I would click the Services tab, check the box to hide all Microsoft services, then disable all non-Microsoft services.

Restart the computer and monitor whether it still keeps restarting. If it does, I would test the RAM. There's a free utility for testing RAM that you can download here: http://www.memtest.org/

If the test shows RAM errors, I would replace the RAM.

If that doesn't help either, I would go to Start, then Run, and type sfc /scannow

Once the scan is complete, I would restart and see.

 
LVL 26

Expert Comment

by:lnkevin
Dual Power Supply +/or Controller, UPS output failing even with normal constant AC power on its ....

- If you have dual power supplies, the chance of a PS failure causing this is small, and that includes the power cables. You rarely find both power cables going bad at the same time.
- Patches are not released until the second Tuesday of the month, so it is unlikely to be a patching issue either.
- A virus would noticeably slow down your entire network and is easy to recognize by constant reboots. The chance of this is small too.
- If both of your power cables connect to the same UPS, there is a high chance that the UPS battery has gone bad. Replace your UPS battery when it is due to avoid this kind of situation. Every battery has a service life; check with your UPS vendor to find out what it is.
Another possible cause of the server rebooting is hardware failure: failed memory, a failing HD or controller, the server overheating... In that case, log in to Dell OpenManage and run diagnostics to make sure all the hardware is OK.
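
On the patching point, Microsoft's regular monthly updates ship on the second Tuesday of each month. As a rough sketch, assuming the two reboot dates reported earlier in this thread fall in 2009, the following shows how far each reboot was from the most recent Patch Tuesday. The function names are made up for illustration, and the result says nothing about when patches were actually installed on this server.

```python
from datetime import date, timedelta

def second_tuesday(year, month):
    """Return the second Tuesday of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Tuesday is 1
    offset = (1 - first.weekday()) % 7
    return first + timedelta(days=offset + 7)

def days_since_patch_tuesday(day):
    """Days elapsed since the most recent second Tuesday on or before `day`."""
    candidate = second_tuesday(day.year, day.month)
    if candidate > day:
        # This month's release has not happened yet, so use last month's.
        if day.month > 1:
            candidate = second_tuesday(day.year, day.month - 1)
        else:
            candidate = second_tuesday(day.year - 1, 12)
    return (day - candidate).days

if __name__ == "__main__":
    # Reboot dates from the thread; the year is an assumption.
    for reboot in (date(2009, 6, 18), date(2009, 7, 13)):
        print(reboot, days_since_patch_tuesday(reboot))
```

A reboot that always followed the release by a day or two would point at patching; widely different gaps make that less likely, though they do not rule out a patch that was installed late.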

K

 

Author Comment

by:HammettG
ComputerTechie: This is a great idea! I will reinstall the software and see what happens. It may be quite a long wait, though.

jaynir: I would love to take the time and effort to troubleshoot this server hands on, but the server is the FSMO Domain Controller, connected to two other remote DCs over VPNs, for a company with 75 employees. The last thing I want to do is introduce more problems in a production environment. I will take your ideas into account if these other solutions don't work. Thanks for the suggestions.

lnkevin: I also think the dual power supply failure is very unlikely. However, I would imagine there is some kind of controller which controls each power supply's output, in case one blows and would adversely affect the voltage at its output, causing over and under voltage and current situations, possibly affecting the other power supply's ability to hold its output, and therefore the system's power, at proper levels. It is this controller I would question; being faulty on its own may be the whole cause of the unexpected reboot. Whether this controller is in the motherboard or separate I don't know, but I cannot imagine there NOT being a controller. If there is NOT a controller, then one of the power supplies could be bad, dragging down or surging the other supply.
My idea on patches - it could be a patch applied several weeks ago that is used by just one service or process that runs only every so often, for system maintenance of some kind. That maintenance would occur roughly periodically, but would be driven by a threshold of data to be compacted or removed, etc. Reaching the threshold would start the service, and the faulty patch would then trigger the reboot. I will check for any common event codes prior to the reboots. I agree though, I think this is a slim possibility; I would have seen something last time.
Yes, the chance of known malware damage is slim but, as in my last paragraph, it could be bad code or a damaged driver. I guess these are all possibilities.
I've heard memory mentioned several times now. Normally, when I see memory issues, I think Blue Screen. But this is a hard reboot. Does a memory issue still make sense? I'm not that versed in memory issues. HDD controller - yes, total sense, the controller actually is responsible for shutting down the PC when it's done saving its data to disk - I've seen this happen before. Thanks for this suggestion!
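
For the plan above to check for common event codes before each reboot, one possible approach is to export the event log to CSV and look for IDs that show up in the window before every reboot. This is only a sketch: the `Time` and `EventID` column names, the 30-minute window, and the sample rows are all made up for illustration and would need to match the real export.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timedelta

def load_rows(text):
    """Parse CSV text with hypothetical 'Time' and 'EventID' columns."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "Time": datetime.strptime(rec["Time"], "%Y-%m-%d %H:%M"),
            "EventID": int(rec["EventID"]),
        })
    return rows

def events_before(rows, reboot_time, window_minutes=30):
    """Return event IDs logged in the window leading up to one reboot."""
    start = reboot_time - timedelta(minutes=window_minutes)
    return [r["EventID"] for r in rows if start <= r["Time"] < reboot_time]

def common_event_ids(rows, reboot_times, window_minutes=30):
    """Return event IDs that appear before every listed reboot."""
    counts = Counter()
    for reboot in reboot_times:
        # set() so an ID counts once per reboot, however often it repeats
        counts.update(set(events_before(rows, reboot, window_minutes)))
    return sorted(eid for eid, n in counts.items() if n == len(reboot_times))

# Invented sample data, only to show the shape of the input.
SAMPLE = """Time,EventID
2009-06-18 11:20,7036
2009-06-18 11:30,51
2009-07-13 09:05,51
2009-07-13 09:10,1111
"""

if __name__ == "__main__":
    reboots = [datetime(2009, 6, 18, 11, 37), datetime(2009, 7, 13, 9, 18)]
    print(common_event_ids(load_rows(SAMPLE), reboots))
```

With only two reboots on record, any ID that turns up is a lead to investigate, not proof of a cause.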
 
LVL 26

Expert Comment

by:lnkevin
in case one blows and would adversely affect the voltage at its output, causing over and under voltage and current situations....

No, it's never. Dual PS provide you redundant power source. In fact, your server is well running with one active PS.

I think Blue Screen. But this is a hard reboot. Does a memory issue still make sense....

If you checked the Automatic Restart option under Startup and Recovery (My Computer properties --> Advanced), your server will reboot instead of showing a BSOD.
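
As far as I know, that checkbox is stored as the `AutoReboot` DWORD under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl`. A small Python sketch for reading it without opening the dialog might look like this; it only works on Windows, the helper names are my own, and the wording of the messages is just a suggestion.

```python
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

def read_auto_reboot():
    """Read the AutoReboot DWORD from the registry (Windows only)."""
    import winreg  # imported here so the module still loads on other systems
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "AutoReboot")
    return value

def explain(auto_reboot):
    """Turn the raw AutoReboot value into a troubleshooting hint."""
    if auto_reboot:
        return ("Automatic restart is ON: a stop error would restart the "
                "server, so a blue screen could go unnoticed.")
    return ("Automatic restart is OFF: a stop error should leave the "
            "blue screen on the console.")

if __name__ == "__main__":
    print(explain(read_auto_reboot()))
```

If automatic restart turns out to be on, switching it off on a production domain controller is a trade-off, since the server would then stay down at the blue screen until someone intervenes.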

I didn't see you mention the UPS battery, and this could be one of the bigger possibilities. Also, you did not tell us whether you have checked Dell Server Management or run diagnostics. If you can, post your diagnostic report once you have. We can review it and determine whether or not this is HW related.

K
 

Author Comment

by:HammettG
Sorry lnkevin, but a statement like:

"No, it's never. Dual PS provide you redundant power source. In fact, your server is well running with one active PS."

makes no sense to an electrical engineer. Maybe you should talk to some engineers on the subject of redundant power supplies and their Controller Problems.

I will take into consideration your other suggestions.
 
LVL 26

Accepted Solution

by:
lnkevin earned 500 total points
I am nowhere close to being an electrical engineer; my major is computer engineering. However, I have not yet seen a case where a blown PS affected the remaining one. If that could happen, what would be the point of making power supplies redundant? I hope Dell was aware of all this before coming up with redundant PSes. Anyway, your point is certainly valid in some respects.

K
 

Author Comment

by:HammettG
Hi Guys:

The company that owned these servers has gone under. Unfortunately, we will never know what would have solved this problem.