Dell Server rebooting unexpectedly

I have a small company with a one-year-old Dell server on a 1400 W UPS that intermittently, approximately once a month, reboots unexpectedly. There is no hint in any of the Event Viewer categories, just the Unexpected Reboot error as the server comes back up.

I am guessing this is a hardware issue, probably the server motherboard, but there are many other possibilities: Dual Power Supply +/or Controller, UPS output failing even with normal constant AC power on its input, something shorting out and dragging the power supply way down momentarily, etc. There is nothing unusual connected to the server, just an LCD monitor, a wired keyboard and wired optical mouse, one Ethernet connection, and one external USB hard drive (EHD) for direct backup. Only the server, monitor and EHD are on the UPS.

I am also wondering if this could be some kind of software issue: OS, UPS software, a patch, or a virus, although Bitdefender has been pretty good.

Any ideas?
HammettG Asked:

g000se Commented:
Hello,

When the server reboots, does it happen on the same day each month?

Have you tried using a different power outlet and power cord?
HammettG Author Commented:
No, g000se. The server rebooted on 6/18 and 7/13: different dates, different days of the week, and different times (11:37am and 9:18am).
ComputerTechie Commented:
Have you tried uninstalling the APC software, rebooting, and then reinstalling the software?

CT

jaynir Commented:
Hi,
You are right, HammettG, it could be a hardware issue too.

The first thing I would do to troubleshoot this is disable all startup items with the msconfig utility. In the same utility, I would click Services, hide the Microsoft services, then disable all non-Microsoft services.

Restart the computer and monitor whether it keeps restarting. If it still does, then I would test the RAM. There is a free utility for testing RAM that you can download here: http://www.memtest.org/

If the test shows RAM errors, I would replace the RAM.

If that doesn't help either, I would go to Start, then Run, and type sfc /scannow

Once the scan is complete, I would restart and see.

lnkevin Commented:
Dual Power Supply +/or Controller, UPS output failing even with normal constant AC power on its ....

- If you have dual power supplies, the chance of a failed PS, including the power cable, is small. You rarely find both power cables going bad at the same time.
- Patches are not released until the Tuesday of the second week of the month, so this is not a patching issue either.
- A virus would definitely slow down your entire network and would be easy to recognize from constant reboots. The chance of this is small too.
- If both of your power cables connect to the same UPS, the chance that the UPS battery has gone bad is high. Replace your UPS battery when it is due to avoid a similar situation. Every battery has a service life; you may want to check with the UPS vendor to find out what it is.
Another possible cause of the reboots is hardware failure: failed memory, a failed HD or controller, an overheating server... In this case, you want to log in to Dell OpenManage on the server and run diagnostics, as well as making sure all the hardware is OK.
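
For anyone who wants to sanity-check the patch timing mentioned in the list above, here is a minimal Python sketch. It assumes "Tuesday of the second week" means the second Tuesday of the month, and the year used in the example is purely hypothetical, since the thread never states one; substitute the real year before drawing any conclusion.

```python
import calendar
from datetime import date


def second_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    # monthcalendar() returns weeks as lists of day numbers (Monday first);
    # a 0 means that weekday belongs to a neighbouring month.
    tuesdays = [
        week[calendar.TUESDAY]
        for week in calendar.monthcalendar(year, month)
        if week[calendar.TUESDAY] != 0
    ]
    return date(year, month, tuesdays[1])


def days_after_patch_day(reboot: date) -> int:
    """Days from that month's second Tuesday to the reboot (negative if earlier)."""
    return (reboot - second_tuesday(reboot.year, reboot.month)).days


if __name__ == "__main__":
    YEAR = 2024  # hypothetical: the thread does not say which year this was
    for month, day in [(6, 18), (7, 13)]:  # reboot dates reported by the author
        reboot = date(YEAR, month, day)
        print(reboot, "is", days_after_patch_day(reboot), "day(s) after patch day")
```

A small, consistent gap for every reboot would point toward patching; gaps that vary a lot would point away from it.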

K
HammettG Author Commented:
ComputerTechie: This is a great idea! I will reinstall the software and see what happens. It may be quite a long wait, though.

jaynir: I would love to take the time and effort to troubleshoot this server hands on, but the server is the FSMO Domain Controller, connected to two other remote DCs over VPNs, for a company with 75 employees. The last thing I want to do is introduce more problems in a production environment. I will take your ideas into account if these other solutions don't work. Thanks for the suggestions.

lnkevin: I also think a dual power supply failure is very unlikely. However, I would imagine there is some kind of controller which controls each power supply's output, in case one blows and would adversely affect the voltage at its output, causing over and under voltage and current situations, possibly affecting the other power supply's ability to hold its output, and therefore the system's power, at proper levels. It is this controller I would question; a fault in it alone may be the whole cause of the unexpected reboots. Whether this controller is on the motherboard or separate I don't know, but I cannot imagine there NOT being a controller. If there is NOT a controller, then one of the power supplies could be bad, dragging down or surging the other supply.
My idea on patches: it could be a patch applied several weeks ago that is used by only one service or process, one that runs only every so often for system maintenance of some kind. Such a service would run roughly periodically, but would really be triggered when some threshold of data to be compacted or removed is reached, and starting it would then cause the faulty patch to trigger the reboot. I will check for any common event codes prior to the reboots. I agree, though, that this is a slim possibility; I would have seen something last time.
Yes, the chance of damage from known malware is slim, but as in my last paragraph, it could be bad code or a damaged driver. I guess these are all possibilities.
I've heard memory mentioned several times now. Normally, when I see memory issues, I think Blue Screen. But this is a hard reboot. Does a memory issue still make sense? I'm not that versed in memory issues. HDD controller: yes, total sense. The controller is actually responsible for shutting down the PC when it's done saving its data to disk; I've seen this happen before. Thanks for this suggestion!
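
The check for common event codes before each reboot could be sketched roughly like this in Python. It assumes the System log has already been exported into (timestamp, event ID) pairs; the function name, the 30-minute window, the year, and the event IDs in the sample data are all made up for illustration, and only the two reboot times come from this thread.

```python
from datetime import datetime, timedelta


def common_event_ids(events, reboots, window=timedelta(minutes=30)):
    """Return the event IDs that were logged within `window` before EVERY reboot.

    events  -- iterable of (timestamp, event_id) pairs
    reboots -- list of reboot timestamps
    """
    common = None
    for reboot in reboots:
        # IDs seen in the half-open interval [reboot - window, reboot)
        ids = {
            event_id
            for timestamp, event_id in events
            if reboot - window <= timestamp < reboot
        }
        common = ids if common is None else common & ids
    return common or set()


if __name__ == "__main__":
    # Hypothetical sample data: the year and the event IDs are invented.
    events = [
        (datetime(2024, 6, 18, 11, 20), 1111),
        (datetime(2024, 6, 18, 11, 30), 2222),
        (datetime(2024, 7, 1, 12, 0), 1111),
        (datetime(2024, 7, 13, 9, 5), 2222),
        (datetime(2024, 7, 13, 9, 10), 3333),
    ]
    reboots = [datetime(2024, 6, 18, 11, 37), datetime(2024, 7, 13, 9, 18)]
    print(common_event_ids(events, reboots))
```

An ID that turns up before every reboot is only a lead. If the same ID is also logged constantly the rest of the month, it probably has nothing to do with the reboots.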
lnkevin Commented:
in case one blows and would adversely affect the voltage at its output, causing over and under voltage and current situations....

No, it's never. Dual PS provide you redundant power source. In fact, your server is well running with one active PS.

I think Blue Screen. But this is a hard reboot. Does a memory issue still make sense....

If the Automatically restart option is checked under Startup and Recovery (My Computer > Properties > Advanced), your server will reboot instead of stopping at a BSOD.
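
If you would rather check that setting without clicking through the dialog, a rough Python sketch follows. It assumes the checkbox is stored as the AutoReboot DWORD under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl, which is where Windows normally keeps it, but verify that on your own OS version. The function names are hypothetical, and the script only reads the value; it changes nothing.

```python
import sys

# Assumed registry location of the "Automatically restart" setting.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def describe_auto_reboot(value):
    """Turn the raw AutoReboot value (or None) into a readable message."""
    if value is None:
        return "AutoReboot could not be read (not Windows, or value missing)"
    if value == 1:
        return "Automatic restart is ON: a stop error reboots the server right away"
    return "Automatic restart is OFF: a stop error should leave a blue screen up"


def read_auto_reboot():
    """Read AutoReboot from the registry; return None if it is unavailable."""
    if sys.platform != "win32":
        return None
    import winreg  # standard library, Windows only

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "AutoReboot")
    except OSError:
        return None
    return int(value)


if __name__ == "__main__":
    print(describe_auto_reboot(read_auto_reboot()))
```

If it reports ON, a memory fault could look exactly like a hard reboot, so turning the option off before the next incident would help tell a stop error apart from a real power loss.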

I didn't see you mention the UPS battery, and this could be one of the biggest possibilities. Also, you did not tell us whether you have checked Dell Server Management or run diagnostics. If you can, post your diagnostic report once you have. We can review it and determine whether or not the problem is HW related.

K
HammettG Author Commented:
Sorry lnkevin, but a statement like:

"No, it's never. Dual PS provide you redundant power source. In fact, your server is well running with one active PS."

makes no sense to an electrical engineer. Maybe you should talk to some engineers on the subject of redundant power supplies and their Controller Problems.

I will take into consideration your other suggestions.
lnkevin Commented:
I am nowhere close to being an electrical engineer, as my major is in computer engineering. However, I have not yet seen a case where a blown PS affected the remaining one. If that could happen, what would be the point of making power supplies redundant? I hope Dell was aware of all this before coming up with redundant PSs. Anyway, your point is certainly valid in some respects.

K
HammettG Author Commented:
Hi Guys:

The company that owned these servers has gone under. Unfortunately, we will never know what would have solved this problem.