Solved

SBS Server 2008 appears locked up

Posted on 2009-07-08
Last Modified: 2012-05-07
I have an SBS2008 server that has been running fine since it was installed about 4 months ago. This morning, users started reporting that they could not log in. I had an on-site contact check the console and it was blank, with no response to mouse or keyboard. The power lights on the server were on. From a member server, I could ping the LAN NIC just fine but I could not RDP to the server. I tried to telnet to port 25 on the server. It appeared to connect but I received no reply from Exchange.

From a member server, I had the on-site contact type shutdown -i from the Start-->Run menu and enter the SBS Server name, select restart and click OK. It timed out with no response.

So, at this point the server is obviously powered up and the NIC is pingable but everything else appears hung. With no other option, I had the on-site contact do a power cycle. It took over 4 hours to come back up to where I could access the server via RWW and the users could log in.
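
For anyone who wants to repeat the check described above (pingable, but no answer on port 25) without a telnet client, here is a minimal Python sketch. The function name `probe`, the result strings, and the host name `sbs01` are placeholders of mine, not anything taken from the server in question. It only tells a refused or timed-out connection apart from one that connects but never sends a greeting. That distinction is only meaningful for protocols where the server speaks first, such as SMTP. An RDP listener normally waits for the client, so silence on 3389 proves nothing.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP service by what happens right after connecting.

    Returns "unreachable" if the connection is refused or times out,
    "silent" if it connects but sends nothing within the timeout,
    "closed by peer" if the remote end hangs up immediately,
    or "banner: <text>" if the service sends a greeting.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(256)
        except socket.timeout:
            return "silent"
        except OSError:
            return "unreachable"
    if not data:
        return "closed by peer"
    return "banner: " + data.decode("ascii", errors="replace").strip()
```

A call such as `probe("sbs01", 25)` returning `"silent"` would be consistent with the symptom in this question: the connection is accepted but the SMTP service never sends its greeting.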

The lengthy delay before it was fully operational, I believe, is because of the use of the Intel on-board RAID 5 controller. During the whole 4 hour period, the drive lights of the array were either flashing rapidly or on solid. I do not have Intel Matrix installed. I just have the RAID array defined in the BIOS. This is an issue I need to address, and I would be interested in feedback from anyone who has had a similar problem with a long startup after a hard power reset.

The main issue, though, is why the console locked up in the first place. After gaining access to the server, I looked at the logs and there were no errors prior to the power cycle restart. The power cycle restart happened at 8:28am this morning. The last log entry in the System or Application log prior to that time was at 12:28am and it was an MSExchange Anti-virus update with the signature 3.3.7909.80. I am using Forefront for Exchange on the server.

It is unusual for there not to be any log entries between 12:28am and 8:28am when the server was power cycled. I am inclined to assume that the AV update may be the cause of the lockup.
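
To make that kind of silence easy to spot across several nights, the event timestamps can be scanned for long gaps. The sketch below assumes the timestamps have already been exported from the event logs into Python `datetime` values (how they are exported is left open here). The function name `find_gaps` and the one hour threshold are my own choices. The sample list uses the two times mentioned above plus one hypothetical earlier entry, and the calendar date is assumed from the posting date.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, threshold=timedelta(hours=1)):
    """Return (start, end, length) for every silence longer than threshold."""
    ordered = sorted(timestamps)
    gaps = []
    # Compare each event with the one that follows it.
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > threshold:
            gaps.append((earlier, later, later - earlier))
    return gaps


# Illustrative data only: the first entry is hypothetical, the other two
# are the last log entry (12:28am) and the power cycle (8:28am).
events = [
    datetime(2009, 7, 7, 23, 50),
    datetime(2009, 7, 8, 0, 28),
    datetime(2009, 7, 8, 8, 28),
]

for start, end, length in find_gaps(events):
    print(start, "->", end, "silent for", length)
```

Running the same scan over a normal night gives a baseline for how long the logs usually stay quiet, which helps judge whether a gap really marks a hang.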

I could use any ideas as to where to look for further diagnostic information about the lockup, or to hear whether anyone else has had problems with MSExchange AV updates.

Thanks in advance for any help you can offer.

Dave

Question by:dcadler
14 Comments
 
LVL 22

Expert Comment

by:Syed Mutahir Ali
ID: 24808268
Well, as you cannot see anything in the logs apart from the Exchange message, I can think of:
a) Windows Update can sometimes restart the server, so check the settings (and set them to notify you but not download and install).
b) Check your Exchange antivirus settings to see whether it would restart automatically.
c) Your server was not responding but was pingable and the power lights were on, which means it was (probably) stuck shutting down services. Exchange services take time to go down while Windows OS services go down quickly, so it is a good idea to stop the Exchange services first and then reboot or shut down the server.
You can use a script, which we use, to shut down Exchange first and then restart the server:
http://www.amset.info/exchange/shutdown-script.asp
I am unable to understand what you mean by the Intel issue in regard to the RAID. If the RAID is defined properly then it should be OK. Check whether your RAID card needs a firmware upgrade or your server needs a BIOS upgrade.
Hope that helps.
 

Author Comment

by:dcadler
ID: 24808728
I do not believe that I have the updates set to automatically restart the server. I know that up until this event, whenever updates were needed, I had an alert of pending updates on the server and I had to initiate the actual update process. However, is it normal for an AV update to cause a server restart? I would think that it wouldn't, but I will check that.

Under normal circumstances, a server restart does not take a long time. This lengthy startup only occurred because I had to do a forced power cycle. From what I have uncovered thus far, this startup problem seems to be related to the RAID (simply because for the 4 hours it took to come back up, the RAID array drive lights were constantly flashing or on solid, which seems like a rebuild to me). I am using a Supermicro X7DVA-E which has an integrated Intel ESB2 SATA 3.0Gbps Controller set up as RAID 5. I have write-back caching disabled in Windows Device Manager. I do not have the Intel Matrix software installed.

There were no hardware-related errors in the System event log during the 4 hour period it took for the system to come up, just a bunch of timeout errors from services which, I assume, were caused by the drives being so busy.

Dave




 
LVL 22

Expert Comment

by:Syed Mutahir Ali
ID: 24860809
Try upgrading the firmware for your RAID drives and controller.
Check your motherboard and RAID controller manuals, and the vendors' websites, for any firmware upgrades.
 
LVL 38

Accepted Solution

by:
Philip Elder earned 500 total points
ID: 25283347
If you are using Forefront, take note that the Kaspersky engine has been having some difficulties.

Disable this engine and see if that stabilizes the server.

Philip
 

Author Comment

by:dcadler
ID: 25283473
MPECInc,

I do not have Kaspersky checked. I only have the following checked;

CA Vet
Microsoft Antimalware Engine
Norman Virus Control
Sophos Virus Detection Engine

This problem happened again on 8/31. I ended up having to do a hard power restart again and then went through the extensive server delay while the RAID array rebuilt and verified. I couldn't mount the Exchange IS until after the RAID verify was completed.

Again, there were no indications that I could find in the logs to explain why it locked up. I could ping the server from a workstation but I could not load the service remotely, get a remote desktop, or get it to accept a shutdown command from a remote computer. There was no drive activity on the server when it appeared locked up.

Each time this has happened, it was overnight. It has never happened in the middle of the day.

Dave

 
LVL 38

Expert Comment

by:Philip Elder
ID: 25283667
There you go.

If your RAID array went into rebuild mode, then there is a problem with either one of the members or the RAID controller itself.

RAID controllers have management software depending on the vendor. You can load the controller's logs and have a look at what caused the array hiccup.

Make sure your backups are good!

Philip
 

Author Comment

by:dcadler
ID: 25284558
The RAID issue, I believe, is caused by the fact that I had to do a hard shutdown without properly shutting down SBS2008, including Exchange. Do you really think the RAID issue is the source? There was no disk activity before I did the hard power reboot. Plus, every time this has happened, it has been overnight, not during the day when the users are actively hitting the drives.

Dave
 
LVL 38

Expert Comment

by:Philip Elder
ID: 25284795
A hard shutdown should not force a rebuild of an array, in my experience. Then again, there are a lot of factors at play there.

Disk activity is essentially a non-issue relative to the problem.

If the lockup happens at the same time or thereabouts each time, then the cause would be something scheduled to run, such as an A/V scan. Not having the correct exclusions in place would be a factor there.

Philip
 

Author Comment

by:dcadler
ID: 25428620
This issue is not the RAID. There is something else happening after 1:00a. Most of the times it has occurred, the last log entries stop around 1:00a to 1:30a. There are never any errors in the event logs. The logs for the night before, when everything worked just fine, look very similar.

Is there any utility that can help diagnose this issue?
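
One low-tech way to narrow down when the box actually stops, given that the event logs just go quiet, is a small heartbeat script that appends a timestamp to a file at a fixed interval. The last line written before the hard reset then brackets the time of the hang to within one interval. This is only a sketch: the function names, the file path, and the 60 second interval are assumptions of mine, and it presumes a Python interpreter is available on the server. It does not explain the cause, it only pins down the time, and if disk writes stall before the rest of the system does, the last heartbeat will mark that moment instead.

```python
import os
import time
from datetime import datetime


def write_heartbeat(path, now=None):
    """Append one ISO timestamp line and force it out to disk."""
    now = now or datetime.now()
    with open(path, "a", encoding="ascii") as f:
        f.write(now.isoformat(timespec="seconds") + "\n")
        f.flush()
        os.fsync(f.fileno())  # do not leave the line sitting in a cache


def last_heartbeat(path):
    """Return the most recent heartbeat as a datetime, or None if absent."""
    try:
        with open(path, encoding="ascii") as f:
            lines = [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        return None
    return datetime.fromisoformat(lines[-1]) if lines else None


def run(path, interval_seconds=60):
    """Write a heartbeat forever; stop it with Ctrl+C or by ending the task."""
    while True:
        write_heartbeat(path)
        time.sleep(interval_seconds)
```

Calling `run(r"C:\heartbeat.log")` (a hypothetical path) in the background overnight, and then `last_heartbeat` on the same file after the next lockup, would show whether the stop time lines up with the 1:00a to 1:30a window.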

Dave
 

Author Comment

by:dcadler
ID: 25596722
It appears that the issue was related to Forefront. I was not able to locate a specific reason but removing Forefront completely solved the problem. I replaced Forefront with ORF from www.vamsoft.com which is actually doing a better job of spam filtering.
 
LVL 22

Expert Comment

by:Syed Mutahir Ali
ID: 26599896


b) Check your Exchange antivirus settings to see whether it would restart automatically.

Above was one of my suggestions in troubleshooting the issue the asker was having.

No offence to anyone here, but the points should be split between Mpecsinc and me.

Thanks

Mutahir


 
LVL 74

Expert Comment

by:Glen Knight
ID: 26599908
Mutahir > The reason I suggested the post I did was because the author specifically stated "It appears that the issue was related to Forefront"