Solved

Server restarting, nothing in logs, ideas?

Posted on 2007-03-26
Last Modified: 2013-12-11
Hi,

I have a server that is restarting for no apparent reason with nothing in the logs to explain it.  The server is brand new with Windows SBS 2003 on it, I did a SwingMigration to it a couple of weeks back.

The server is an HP DL380 G5.  I swapped the server out with a brand new one last week (complete new server except the processor and hard drives) but the problem is still happening, random restarts once or twice a day.  It happens at different times, sometimes the middle of the night, sometimes in the morning, sometimes in the evening etc.  Server is connected to an APC UPS (which is a couple of years old), but the APC management software shows no "power events".

I see nothing in the logs, just the usual event saying the server restarted unexpectedly.  No memory dump files, so I'm guessing it's not blue screening.  Been searching the drives for files modified around the time of the restart, nothing obvious.

The fact that it's restarting so "silently" definitely says to me a likely hardware problem, but the only original parts are the UPS, the CPU and the drives (5 x 146 GB SAS drives, RAID 5).  It's a dual CPU server and the CPUs are brand new so I'm not too inclined to suspect the CPU but let me know if this is a bad assumption.  I don't think the drives would cause a restart like this.  That leaves the UPS or Windows itself.  Any thoughts?  I can get somebody to plug the server directly into the mains instead of into the UPS but I'd like to have other plans prepared.

So I'm just looking for feedback on what you'd try to identify the cause of the restarts, first impressions etc.  All thoughts appreciated!  Thanks!
Question by:Zenith63
21 Comments
 
LVL 51

Expert Comment

by:Netman66
ID: 18792722
Did you complete the server installation to make it a DC (the root DC in its own forest)?

SBS must hold all the FSMO roles and be a GC, AND it needs to be activated.

0
 
LVL 11

Author Comment

by:Zenith63
ID: 18792817
Interesting point, never thought of that!  Would there not be something in the logs telling me about this?

I certainly went the whole way through the migration, the server holds all the FSMO roles and is a GC.  It was and is the only DC in the forest.  Maybe it didn't complete fully, any place I look for events or anything to prove this?
0
 
LVL 57

Expert Comment

by:Pete Long
ID: 18792898
It's an HP server, so check Insight Manager - the ASR (Automatic Server Recovery) will reboot the server in the event of an error :) That's on by default if you built the server from SmartStart.

open


https://localhost:2381/
or
https://127.0.0.1:2381/

log in

domainname\administrator
password

look at the logs :)
0
 
LVL 11

Author Comment

by:Zenith63
ID: 18792949
Hi Pete,

Yeah I came across that ASR setting earlier and disabled it so hopefully now when something goes wrong again the server will stay up and show what it was.  There's nothing in the Insight logs though, I'd have thought if the ASR was recovering the server it would log something somewhere?  Have you ever seen it in action?

Thanks!
0
 
LVL 11

Author Comment

by:Zenith63
ID: 18792959
Just read through the SBSMigration instructions again. Apparently the server would restart every 100 minutes and log events 1001, 1013 and 1014 if it was restarting because it didn't like being in control.  Any other thoughts NetMan?
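
One way to rule out the fixed 100-minute pattern mentioned above is to compare the gaps between the restart times recorded in the event log. The following is only a minimal sketch: the timestamps are made-up placeholders, and the function names and the 5-minute tolerance are assumptions for illustration, not anything taken from the migration instructions.

```python
from datetime import datetime, timedelta


def restart_intervals(timestamps):
    """Return the gaps between consecutive restart times, in chronological order."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def looks_like_fixed_interval(timestamps,
                              expected=timedelta(minutes=100),
                              tolerance=timedelta(minutes=5)):
    """True only when every gap is within `tolerance` of `expected`.

    The 5-minute tolerance is an arbitrary assumption for this sketch.
    """
    gaps = restart_intervals(timestamps)
    if not gaps:
        return False  # fewer than two restarts: nothing to compare
    return all(abs(gap - expected) <= tolerance for gap in gaps)


# Hypothetical restart times copied by hand from the System event log.
restarts = [
    datetime(2007, 3, 24, 2, 14),
    datetime(2007, 3, 24, 19, 40),
    datetime(2007, 3, 25, 8, 5),
]

for gap in restart_intervals(restarts):
    print("gap between restarts:", gap)
print("fixed 100-minute pattern:", looks_like_fixed_interval(restarts))
```

Gaps that vary by hours, as with restarts happening at random times of day, would not match a fixed-interval shutdown.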
0
 
LVL 74

Accepted Solution

by:
Jeffrey Kane - TechSoEasy earned 250 total points
ID: 18793049
What version of APC PowerChute are you running on the Server?  There's a bug in older versions... make sure it's 7.x and not 6.x.

http://nam-en.apc.com/cgi-bin/nam_en.cfg/php/enduser/std_adp.php?p_faqid=7202

Jeff
TechSoEasy

0
 
LVL 11

Author Comment

by:Zenith63
ID: 18793089
I'm using 7.0.4.
0
 
LVL 57

Expert Comment

by:Pete Long
ID: 18793133
>>Have you ever seen it in action?

No. If it was the ASR, Insight would be logging it :)
0
 
LVL 11

Author Comment

by:Zenith63
ID: 18793205
I was thinking it would, but googling there I came across a few people having similar issues where ASR was triggered but didn't log anything so I'm gonna hope it's the same in my case!

Any more ideas?

ASR is disabled and I'm getting somebody out there in the morning to get the server off that UPS in case it has issues.
0
 
LVL 51

Expert Comment

by:Netman66
ID: 18794132
Just ensure it's activated and we can eliminate that.  Is there another SBS server on this network?  If so, is it in the same domain?  If that's true then turn it off and see what happens.

0
 
LVL 11

Author Comment

by:Zenith63
ID: 18794418
Yep it's activated, has FSMO roles and is GC.  The only other server is a Windows 2003 Server member server.
0
 
LVL 51

Expert Comment

by:Netman66
ID: 18794518
Well....I tend to agree with those who have mentioned hardware, since intermittent hardware issues don't necessarily create logs where OS-related issues generally do.

You may want to shut off the server, reseat the RAM and all the procs then restart it to see if the reboot interval changes or stops completely.
0
 
LVL 11

Author Comment

by:Zenith63
ID: 18794593
This problem has been going on for a couple of weeks since I did the swing to the new HP server.  I went in late last week and replaced the HP with another brand new one we had in stock.  I just moved the drives and extra CPU across.  So I reckon I've eliminated all hardware issues like RAM, all that's left is power (e.g. UPS), that second processor and the drives.  I could be wrong but I don't suspect the drives as HP Smart Array Controllers are usually excellent at reporting any errors in that department.  CPU?  Maybe, I feel it probably isn't though, with two in there I'd have thought iLO would catch any unusual activity from them, but I could be very wrong?  We could pull out the processor I didn't replace and leave it on one for a few days (any problems doing this?)
0
 
LVL 51

Expert Comment

by:Netman66
ID: 18794683
How many procs were in the old server?
0
 
LVL 11

Author Comment

by:Zenith63
ID: 18794806
In the old Dell I swung from there was 1.  I swung to a DL380 with 2 processors.  I then went out last week with a server which only had one processor in it from our suppliers, so I took one processor from the server I was replacing and stuck it in this one I'd just brought out.  Hope that makes sense, not easy to explain :).  Basically the new install of SBS was installed on 2 processors and is still on 2 now.
0
 
LVL 51

Expert Comment

by:Netman66
ID: 18795026
Did you update the HAL in Device Manager for 2 procs?

Expand Computer.
Right click whatever is being shown there and select Update Driver.
If prompted to go out to the internet select No.
Hit next.
Select Install from Specific location>Next.
Select Don't Search...>Next
Select the matching HAL for Multiprocessor (ie, if it was ACPI Uniprocessor then select ACPI Multiprocessor)
0
 
LVL 11

Author Comment

by:Zenith63
ID: 18795229
I was using the Swing migration, so the new server had SBS installed from scratch on it.  It had the two processors in it the whole time.  Maybe I should just clarify what has happened so far -


1. Old Dell server in place
2. Did SBSMigration's Swing migration to a brand new dual-proc HP DL380 three weeks ago, Dell server now out of the picture (thank god!)
3.  This HP was restarting randomly (I don't think the old Dell was).
4. Went out with another brand new HP DL380 which came from the suppliers with one processor.  Unplugged the live DL380.  Took the drives from the live DL380 and stuck them in this second DL380.  Also took one of the processors from the live DL380 and put it in this second DL380.
5. Plugged in this second DL380, plugged it into the network and it booted straight into SBS.
6. Left the site with the original DL380 which I was hoping had a dodgy main board or something in it.
7. Today I see the new DL380 I put in is restarting in the exact same way.

So everything was replaced in stage 4/5 except one processor, the UPS and the drives.
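
The elimination reasoning in the steps above can be written down as a simple set difference: anything that was part of the setup and was not swapped remains a suspect. This is only a sketch. The part labels below are illustrative names chosen for this example (the thread does not give a full inventory), and `remaining_suspects` is a hypothetical helper.

```python
def remaining_suspects(all_parts, replaced_parts):
    """Return the parts that were NOT swapped out, sorted for stable output."""
    unknown = replaced_parts - all_parts
    if unknown:
        raise ValueError(f"replaced parts not in inventory: {sorted(unknown)}")
    return sorted(all_parts - replaced_parts)


# Illustrative inventory; the labels are assumptions, not taken from the thread.
all_parts = {
    "chassis", "main board", "RAM", "server power supplies",
    "array controller", "CPU (new)", "CPU (carried over)", "drives", "UPS",
}

# Everything that arrived with the second DL380 in steps 4 and 5.
replaced_parts = {
    "chassis", "main board", "RAM", "server power supplies",
    "array controller", "CPU (new)",
}

print(remaining_suspects(all_parts, replaced_parts))
```

Note that the Windows installation lives on the carried-over drives, so a software cause travels with the "drives" entry rather than being eliminated by the swap.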
0
 
LVL 51

Assisted Solution

by:Netman66
Netman66 earned 250 total points
ID: 18797366
It sounds like it may be the UPS and/or its cable.  Try running without the signal cable to see if it stops.

0
 
LVL 74

Expert Comment

by:Jeffrey Kane - TechSoEasy
ID: 18836758
Can you run a DCDIAG /v and post the results here?  Also, it'd be helpful to see the output from a SYSTEMINFO command as well.

Thanks

Jeff
TechSoEasy
0
 
LVL 6

Expert Comment

by:snazy
ID: 18913742
Can't you monitor the temperature of your CPUs with something like SpeedFan to see if they are overheating, or at least what temperatures they reach most often? Check your BIOS settings to see at what processor temperature the computer will shut down. If the power supply (UPS) is not producing enough power for the system, it will make it restart. Does it have enough wattage/VA? Two processors are very demanding on power. Do you have your monitor connected to the UPS? What are the specs of your UPS?
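
For the wattage/VA question, a rough capacity check is to convert the UPS VA rating to watts (VA multiplied by the power factor) and compare that with the load plugged into it. The sketch below uses made-up numbers throughout: the 1500 VA rating, the 0.65 power factor, the 600 W load and the 80% warning threshold are all assumptions for illustration, so substitute the figures from the UPS and server spec sheets.

```python
def usable_watts(va_rating, power_factor):
    """Approximate real power (W) a UPS can deliver: VA rating times power factor."""
    if va_rating <= 0 or not 0 < power_factor <= 1:
        raise ValueError("va_rating must be positive and power_factor in (0, 1]")
    return va_rating * power_factor


def load_fraction(load_watts, va_rating, power_factor):
    """Fraction of the usable UPS capacity consumed by the attached load."""
    return load_watts / usable_watts(va_rating, power_factor)


# All three figures are hypothetical placeholders, not specs from the thread.
ups_va = 1500
ups_power_factor = 0.65
attached_load_watts = 600  # server plus anything else on the UPS, e.g. a monitor

fraction = load_fraction(attached_load_watts, ups_va, ups_power_factor)
print(f"Load is {fraction:.0%} of usable UPS capacity")

# 80% is a rule-of-thumb threshold assumed for this sketch.
if fraction > 0.8:
    print("UPS is close to or over capacity")
```

A check like this only covers sizing; a UPS with plenty of headroom on paper can still be faulty.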
0
 
LVL 11

Author Comment

by:Zenith63
ID: 18961670
Turned out to be the UPS!  I'm kind of surprised to be honest, it wasn't that old and was showing nothing in the management software, kind of negates the point of having management software if it doesn't report errors!

Thanks for the help guys!
0
