Solved

SBS Server 2008 appears locked up

Posted on 2009-07-08
Last Modified: 2012-05-07
I have an SBS2008 server that has been running fine since it was installed about 4 months ago. This morning, users started reporting that they could not log in. I had an on-site contact check the console and it was blank, with no response to mouse or keyboard. The power lights were on. From a member server, I could ping the LAN NIC just fine but I could not RDP to the server. I tried to telnet to port 25 on the server. It appeared to connect but I received no reply from Exchange.
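
For anyone who wants to repeat the port 25 test without a telnet client, here is a minimal Python sketch of the same check. It only distinguishes "TCP connect failed" from "connected but no SMTP greeting arrived", which is the symptom described above. The function name and the 10-second timeout are my own choices, not anything from SBS or Exchange.

```python
import socket


def read_smtp_banner(host, port=25, timeout=10.0):
    """Return (connected, banner).

    connected is False if the TCP connection could not be opened.
    banner is None if the connection opened but no greeting arrived
    before the timeout (or the peer closed without sending anything).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or timed out during connect.
        return (False, None)
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(1024)
        except OSError:
            # Includes socket.timeout: the port accepted us but stayed silent.
            return (True, None)
    text = data.decode("ascii", errors="replace").strip()
    return (True, text or None)
```

A result of (True, None) would match what I saw here: the network stack still accepts connections, but the service behind the port is not answering. Call it with your own server name, for example read_smtp_banner("your-sbs-server").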

From a member server, I had the on-site contact type shutdown -i from the Start-->Run menu and enter the SBS Server name, select restart and click OK. It timed out with no response.

So, at this point the server is obviously powered up and the NIC is pingable but everything else appears hung. With no other option, I had the on-site contact do a power cycle. It took over 4 hours to come back up to where I could access the server via RWW and the users could log in.

The lengthy delay before it was fully operational, I believe, is because of the Intel on-board RAID 5 controller. During the whole 4 hour period, the drive lights of the array were either flashing rapidly or on solid. I do not have Intel Matrix installed. I just have the RAID array defined in the BIOS. This is an issue I need to address, and I would be interested in feedback from anyone who has had a similar problem with a long startup after a hard power reset.

The main issue, though, is why the console locked up in the first place. After gaining access to the server, I looked at the logs and there were no errors prior to the power cycle restart. The power cycle restart happened at 8:28am this morning. The last entry in the system or application log prior to that time was at 12:28am, and it was an MSExchange anti-virus update with the signature 3.3.7909.80. I am using Forefront for Exchange on the server.

It is unusual for there not to be any log entries between 12:28am and 8:28am when the server was power cycled. I am inclined to assume that the AV update may be the cause of the lockup.
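
To make a silent period like this easier to spot across several nights, the event timestamps can be exported and scanned for long gaps. The following is only a sketch under my own assumptions: the function name and the one-hour threshold are arbitrary, and the sample timestamps are illustrative apart from the 12:28am and 8:28am times mentioned above (the date is assumed to be the posting date).

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, threshold=timedelta(hours=1)):
    """Return (start, end, length) for every gap between consecutive
    events that is longer than threshold."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        delta = later - earlier
        if delta > threshold:
            gaps.append((earlier, later, delta))
    return gaps


if __name__ == "__main__":
    # Illustrative data only; just the 00:28 and 08:28 entries come from the post.
    events = [
        datetime(2009, 7, 8, 0, 10),
        datetime(2009, 7, 8, 0, 28),
        datetime(2009, 7, 8, 8, 28),
        datetime(2009, 7, 8, 8, 30),
    ]
    for start, end, length in find_gaps(events):
        print(f"No events from {start} to {end} ({length})")
```

Comparing the gap start times from several lockups against a normal night would show whether the log always goes quiet at about the same time.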

I could use any ideas as to where to look for further diagnostic information about the lockup, or hear from anyone else who has had problems with MSExchange AV updates.

Thanks in advance for any help you can offer.

Dave

Question by:dcadler
14 Comments
 
LVL 22

Expert Comment

by:mutahir
Well, since you cannot see anything in the logs apart from the Exchange message, I can think of the following:
a) Windows Update can sometimes restart the server, so check the settings (set them to notify you rather than download and install automatically).
b) Check your Exchange antivirus settings to see whether it would restart automatically.
c) The server was not responding but was pingable and the power lights were on, which probably means it was stuck shutting down services. Exchange services take time to stop while Windows OS services stop quickly, so it is a good idea to stop the Exchange services first and then reboot or shut down the server.
You can use a script, which we use, to shut down Exchange first and then restart the server:
http://www.amset.info/exchange/shutdown-script.asp
I do not understand what you mean by the Intel issue with regard to the RAID. If the RAID is defined properly it should be OK. Check whether your RAID card needs a firmware upgrade or your server needs a BIOS upgrade.
Hope that helps.
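
The "stop Exchange first, then restart" idea in point (c) can be sketched roughly as follows. This is not the script at the link above, just an illustration of the ordering. The service names are assumptions based on a typical Exchange 2007 install and should be checked against what is actually installed on your server before use. The sketch defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumed service names (verify on your own server before relying on them).
EXCHANGE_SERVICES = [
    "MSExchangeTransport",
    "MSExchangeIS",
    "MSExchangeSA",
    "MSExchangeADTopology",
]


def build_plan(services, reboot=True):
    """Build the ordered command list: stop each service, then restart."""
    # "net stop <name> /y" also stops services that depend on <name>.
    plan = [["net", "stop", name, "/y"] for name in services]
    if reboot:
        # Restart immediately once the services are down.
        plan.append(["shutdown", "/r", "/t", "0"])
    return plan


def run_plan(plan, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=False)
```

Calling run_plan(build_plan(EXCHANGE_SERVICES)) prints the commands without touching the server. Passing dry_run=False would actually stop the services and restart the machine, so try it on a test box first.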
 

Author Comment

by:dcadler
I do not believe that I have the updates set to automatically restart the server. I know that up until this event, whenever updates were needed, I had an alert of pending updates on the server and I had to initiate the actual update process. However, is it normal for an AV update to cause a server restart? I would think that they wouldn't but I will check that.

Under normal circumstances, a server restart does not take a long time. This lengthy startup only occurred because I had to do a forced power cycle. From what I have uncovered thus far, this startup problem seems to be related to the RAID (simply because for the 4 hours it took to come back up, the RAID array drive lights were constantly flashing or on solid, which seems like a rebuild to me). I am using a Supermicro X7DVA-E which has an integrated Intel ESB2 SATA 3.0Gbps Controller set up as RAID 5. I have write-back caching disabled in Windows Device Manager. I do not have the Intel Matrix software installed.

There were no hardware-related errors in the System event log during the 4 hour period it took for the system to come up, just a bunch of timeout errors from services which, I assume, were caused by the drives being so busy.

Dave




 
LVL 22

Expert Comment

by:mutahir
Try upgrading the firmware for your RAID drives and controller.
Check your motherboard and RAID controller manuals, and the vendors' websites, for any firmware upgrades.
 
LVL 38

Accepted Solution

by:Philip Elder (earned 500 total points)
If you are using Forefront, take note that the Kaspersky engine has been having some difficulties.

Disable this engine and see if that stabilizes the server.

Philip
 

Author Comment

by:dcadler
MPECInc,

I do not have Kaspersky checked. I only have the following checked:

CA Vet
Microsoft Antimalware Engine
Norman Virus Control
Sophos Virus Detection Engine

This problem happened again on 8/31. I ended up having to do a hard power restart again and then went through the extensive server delay while the RAID array rebuilt and verified. I couldn't mount the Exchange IS until after the RAID verify was completed.

Again, there were no indications that I could find in the logs to explain why it locked up. I could ping the server from a workstation but I could not load the services remotely, get a remote desktop, or get it to accept a shutdown command from a remote computer. There was no drive activity on the server when it appeared locked up.

Each time this has happened, it was overnight. It has never happened in the middle of the day.

Dave

 
LVL 38

Expert Comment

by:Philip Elder
There you go.

If your RAID array went into rebuild mode, then there is a problem with either one of the members or the RAID controller itself.

RAID controllers have management software depending on the vendor. You can load the controller's logs and have a look at what caused the array hiccup.

Make sure your backups are good!

Philip
 

Author Comment

by:dcadler
The RAID issue, I believe, is caused by the fact that I had to do a hard shutdown without properly shutting down SBS2008, including Exchange. Do you really think the RAID issue is the source? There was no disk activity before I did the hard power reboot. Plus, every time this has happened, it has been overnight, not during the day when the users are actively hitting the drives.

Dave
 
LVL 38

Expert Comment

by:Philip Elder
A hard shutdown should not force a rebuild of an array in my experience. Then again, there are a lot of factors at play there.

Disk activity is essentially a non-issue relative to the problem.

If the lockup happens at the same time or thereabouts each time, then the cause would be something scheduled to run, such as an A/V scan. Not having the correct exclusions in place would be a factor there.

Philip
 

Author Comment

by:dcadler
This issue is not the RAID; something else is happening after 1:00a. Most of the times it has occurred, the last log entries stop around 1:00a to 1:30a. There are never any errors in the event logs. The logs for the night before, when everything worked just fine, look very similar.

Is there any utility that can help diagnose this issue?

Dave
 

Author Comment

by:dcadler
It appears that the issue was related to Forefront. I was not able to locate a specific reason but removing Forefront completely solved the problem. I replaced Forefront with ORF from www.vamsoft.com which is actually doing a better job of spam filtering.
 
LVL 22

Expert Comment

by:mutahir


b) Check your exchange antivirus settings whether it would restart  automatic

Above was one of my suggestions in troubleshooting the issue the asker was having.

No offence to anyone here but the points should be split between Mpecsinc and me.

Thanks

Mutahir


 
LVL 74

Expert Comment

by:Glen Knight
Mutahir > The reason I suggested the post I did was because the author specifically stated "It appears that the issue was related to Forefront"