ryansinn

NTBackup on Windows SBS 2003r2 Server Hangs intermittently

I've recently installed a Windows Server 2003r2 SBS Edition on an HP ML150 Server with an E200 add-on RAID card.  I patched the server until everything was up to date (as of February 7th, 2009.)

The following weekend I updated to the latest tape and raid drivers without issue.

Two weeks ago (on Tuesday, Feb 9th) I received a call in the early morning that the users could not log in... or get email... or access the internet (DNS.)

I RDP'd into the Terminal Server and could ping the server, but it was not responding to RDP requests.  I went on site and saw that the console for the server was stuck on the gray background of the login screen (the server had been logged in and locked.)

I restarted the server and it came up slowly but after letting it sit for 2 hrs it was still "Applying Computer Settings" -- I restarted the server into Safe Mode and uninstalled the updated RAID and Tape Drive... thinking that maybe a bad driver had caused the system to crash hard.

The thing I noted was that the server froze at 11:22pm (backups start at 10pm and take 3hrs.)

So I thought maybe the system kicked off the backup and ran into an error.

Anyway, I restarted into Safe Mode and disabled all services and was able to then restart and log into the system.  I started enabling services one by one until everything that could start would.  DHCP and IIS were still having issues starting.  I spent about 3hrs playing around with trying to get the services started... reregistering the ocx file (with regsvr32, I can't remember the specific DLL, but it's documented in the DHCP recovery documentation from Microsoft.)

Anyway -- with seemingly no intervention on my part (I know it sounds absurd) at about 3pm after spending about 7hrs on the issue the services finally started and the system started performing like normal... I hadn't restarted it since 12:30p -- so I'm not sure why it started working suddenly.

Anyway -- I've restarted the server a few times since then and have been fixing issues related to updating the system to WSS 3.0 since the first hang up.

Last night for the first time since the last system hang... the server froze again.  This time at 7:30pm with the backup starting at 7pm.  The backup logs are empty (0 bytes) for both the 9th and last night (the 17th.)
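
Since both hangs left 0-byte backup logs, scanning the log folder for empty files is a quick way to spot earlier nights where the same thing may have happened unnoticed. A minimal Python sketch, assuming the logs sit in a single folder and end in `.log`; the function name `empty_logs` is illustrative, and the folder you pass in is whichever one your backup logs are written to:

```python
import os


def empty_logs(log_dir, suffix=".log"):
    """Return the sorted names of zero-byte files in log_dir that end in suffix."""
    names = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        # Only regular files with the expected extension are considered.
        if not name.lower().endswith(suffix) or not os.path.isfile(path):
            continue
        if os.path.getsize(path) == 0:
            names.append(name)
    return sorted(names)
```

Comparing the dates of the empty logs against the "unexpected shutdown" entries in the System Event Log would show whether every empty log lines up with a hang.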

I logged into the terminal server again and pinged the server, but could not RDP.  The server console was once again frozen on the grey screen.

I powered off the server and restarted it -- it came up fine with no logs after 7:30p until 8:17am when I restarted it.  No errors right before it stopped.

Upon restarting the server the following information was the first new entry stored in the System Event Log:

"The previous system shutdown at 9:06:00 PM on 2/18/2009 was unexpected."

The logs stopped at 7:30pm, but apparently the system didn't register it had hung until 9:06.

When started at 7pm, the backups typically finish the backup pass at 9:30pm and verification between 10:40pm and 11:00pm...

Tonight's backup went off without a hitch, so the behavior is not consistent.

At 7:30pm tonight the following Informational Alerts were entered into the Event Log:

"The Removable Storage service was successfully sent a start control."
"The Removable Storage service entered the running state."
"The Volume Shadow Copy service was successfully sent a start control."
"The Volume Shadow Copy service entered the running state."
"The Microsoft Software Shadow Copy Provider service was successfully sent a start control."
"The Microsoft Software Shadow Copy Provider service entered the running state."

And at 9:30pm:
"The Volume Shadow Copy service entered the stopped state."
"The Microsoft Software Shadow Copy Provider service entered the stopped state."

Then at 11:08pm tonight:
"RSM was stopped."
"The Removable Storage service entered the stopped state."

And the Backup Logs are complete and SBS says it's successful.

I've seen some VSS issues that people are mentioning applying a hotfix for, but most of them are a few years old and relate to Servers running 2003r2 SP1, not SP2.... with people mentioning the problems have been fixed in SP2.

I'm still willing to try applying redundant hotfixes if that solves the problem.

Any thoughts on this... I hope I was thorough enough and yet still to the point.  :)

-- UPDATE --

I've now run:
regsvr32 msxml.dll
regsvr32 msxml3.dll
regsvr32 msxml4.dll

per:


I've run:
vssadmin list writers

per:
http://www.petri.co.il/forums/showthread.php?t=25841

^^ that seems to be my issue identically

This also seems to partially be my issue (minus the actual error log / message:)
http://www.eggheadcafe.com/software/aspnet/33545710/ntbackup-failing.aspx

Tonight I'm trying:
http://support.microsoft.com/kb/940349

To see if that fixes the issue... of course we won't know for another week or two...

C:\Documents and Settings\Administrator>vssadmin list writers
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001 Microsoft Corp.
 
Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Instance Id: {8b70819a-81f3-4bcd-8fa8-b90385b29523}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'MSDEWriter'
   Writer Id: {f8544ac1-0611-4fa5-b04b-f7ee00b03277}
   Writer Instance Id: {eb9cd8d4-55f7-49f9-9d8e-2896e49cfd84}
   State: [1] Stable
   Last error: No error
 
Writer name: 'SqlServerWriter'
   Writer Id: {a65faa63-5ea8-4ebc-9dbd-a0c4db26912a}
   Writer Instance Id: {de6d3ee3-d6a4-4e0f-97e8-3209ed3703e4}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'Event Log Writer'
   Writer Id: {eee8c692-67ed-4250-8d86-390603070d00}
   Writer Instance Id: {9eee7ecb-0eb5-46f2-80fe-56f59e934001}
   State: [1] Stable
   Last error: No error
 
Writer name: 'WINS Jet Writer'
   Writer Id: {f08c1483-8407-4a26-8c26-6c267a629741}
   Writer Instance Id: {ee60961c-3792-4f2e-8115-e38d16b08330}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'IIS Metabase Writer'
   Writer Id: {59b1f0cf-90ef-465f-9609-6ca8b2938366}
   Writer Instance Id: {017748bf-5f6f-4a0f-a5ee-20b2c4e176fd}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'COM+ REGDB Writer'
   Writer Id: {542da469-d3e1-473c-9f4f-7847f01fc64f}
   Writer Instance Id: {7c716e84-e6ff-47d7-8101-0ee56488a3ab}
   State: [1] Stable
   Last error: No error
 
Writer name: 'Dhcp Jet Writer'
   Writer Id: {be9ac81e-3619-421f-920f-4c6fea9e93ad}
   Writer Instance Id: {1c65cd90-37aa-40ed-9263-64f88e50ec60}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'Registry Writer'
   Writer Id: {afbab4a2-367d-4d15-a586-71dbb18f8485}
   Writer Instance Id: {6d254596-2668-4a92-969b-bf3cd62917be}
   State: [1] Stable
   Last error: No error
 
Writer name: 'NTDS'
   Writer Id: {b2014c9e-8711-4c5c-a5a9-3cf384484757}
   Writer Instance Id: {1e2419bd-27c5-4a6e-82ec-41893940338d}
   State: [1] Stable
   Last error: No error
 
Writer name: 'SPSearch VSS Writer'
   Writer Id: {57af97e4-4a76-4ace-a756-d11e8f0294c7}
   Writer Instance Id: {dbae9288-e9e5-4aaf-8712-62ae719be159}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'FRS Writer'
   Writer Id: {d76f5a28-3092-4589-ba48-2958fb88ce29}
   Writer Instance Id: {002866ec-eb21-42c9-aef5-07badb04843e}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'BITS Writer'
   Writer Id: {4969d978-be47-48b0-b100-f328f07ac1e0}
   Writer Instance Id: {540c2bbe-3042-4c90-848d-86f35f20a78a}
   State: [5] Waiting for completion
   Last error: No error
 
Writer name: 'WMI Writer'
   Writer Id: {a6ad56c2-b509-4e6c-bb19-49d8f43532f0}
   Writer Instance Id: {7c9f2413-6390-4900-88ed-bb26868e01d3}
   State: [5] Waiting for completion
   Last error: No error
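
With this many writers, it is easier to compare runs if the output is reduced to the writers that are not in state `[1] Stable`. A rough Python sketch, assuming the output has been saved to a text file first (for example with `vssadmin list writers > writers.txt`) and follows the layout shown above; the function names are illustrative. It only filters by state and makes no judgment about whether "Waiting for completion" is itself a problem:

```python
import re

# Two entries copied from the output above, used as sample input.
SAMPLE = """\
Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Instance Id: {8b70819a-81f3-4bcd-8fa8-b90385b29523}
   State: [5] Waiting for completion
   Last error: No error

Writer name: 'MSDEWriter'
   Writer Id: {f8544ac1-0611-4fa5-b04b-f7ee00b03277}
   Writer Instance Id: {eb9cd8d4-55f7-49f9-9d8e-2896e49cfd84}
   State: [1] Stable
   Last error: No error
"""

WRITER_RE = re.compile(r"^Writer name: '([^']+)'")
STATE_RE = re.compile(r"^State: \[(\d+)\] (.+)$")
ERROR_RE = re.compile(r"^Last error: (.+)$")


def parse_writers(output):
    """Return one dict per writer: name, state_code, state, last_error."""
    writers = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = WRITER_RE.match(line)
        if match:
            # A "Writer name" line starts a new record.
            current = {"name": match.group(1), "state_code": None,
                       "state": None, "last_error": None}
            writers.append(current)
            continue
        if current is None:
            continue  # banner lines before the first writer
        match = STATE_RE.match(line)
        if match:
            current["state_code"] = int(match.group(1))
            current["state"] = match.group(2)
            continue
        match = ERROR_RE.match(line)
        if match:
            current["last_error"] = match.group(1)
    return writers


def non_stable(writers):
    """Return the writers whose state code is anything other than 1 (Stable)."""
    return [w for w in writers if w["state_code"] != 1]


for writer in non_stable(parse_writers(SAMPLE)):
    print(writer["name"], "-", writer["state"], "-", writer["last_error"])
```

Running the same filter on output captured right after a successful backup and again after a hang would show whether the set of non-stable writers differs between the two.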

lnkevin

Most of the time, when NTBackup kicks off, the system may have another activity that overlaps it and takes up resources. I would suggest moving the backup schedule to a few hours after 12:00 and monitoring it to see if the issue is still there. Also, let me know which objects you select to back up. A common problem is that people choose to back up C: while some system files are actively in use, and NTBackup fails to back them up. If you can, take a screenshot of the selection with all tasks expanded and post it here.

K
SysExpert
Since this is SBS, check what other tasks are running on schedules, and also turn on the alerting option.

While you are at it, run the SBS BPA (Best Practices Analyzer).


I hope this helps !
ryansinn

ASKER

Install the recovery console and attempt to remove the virus from there:
http://support.microsoft.com/kb/216417
sorry -- wrong question :)
Best Practices only has two issues, which I'm ok with:

The Network Driver is more than a Year Old

The Update for Daylight Savings Time (DST) is not installed... it is, I've tried to rerun it and it says it's already installed.

The Scheduled Tasks look fine as well.

Which "Alerting" option are you talking about?
scheduledtasks.png
The scheduled tasks do not look fine. You have something set to run every hour. It may randomly start at the same time as your backup, creating a memory problem. What is that task (95%)? You should check Task Manager when things start freezing to see which process is taking the CPU and memory.

K
looks fine now.  I think that 95% was the SBS Monitoring Service.  I just looked at Scheduled Tasks now... 95% is gone.
scheduledtasks.png
not sure why it grabbed the wrong screenshot... but here's the updated Scheduled Tasks... no 95%
scheduledtasks.png
You get my statement properly. You need to look in your scheduled tasks and reorganize them. You have a lot of overlapping tasks set in Scheduled Tasks, such as volume shadow copy and performance data collection. These tasks can start at the same time as the backup, causing the insufficient-memory issue. Adding more memory to your system, or organizing your tasks to avoid other activity while NTBackup is running, will free up memory for the backup task.

K
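
The overlap concern above can be checked on paper before rearranging anything. A small Python sketch, assuming the backup window from the question (starts at 7pm, verification done by about 11pm) and a made-up hourly task that starts on the hour and runs for 5 minutes; the real task's start minute and duration would need to be read from Scheduled Tasks. It does not handle windows that cross midnight:

```python
def to_minutes(hhmm):
    """Convert 'HH:MM' (24-hour clock) to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total):
    """Convert minutes since midnight back to 'HH:MM'."""
    return "%02d:%02d" % (total // 60, total % 60)


def overlaps(a_start, a_end, b_start, b_end):
    """True when the half-open intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def conflicting_runs(backup_start, backup_end, task_minute, task_duration):
    """List the start times of an hourly task's runs that overlap the backup window."""
    b_start, b_end = to_minutes(backup_start), to_minutes(backup_end)
    conflicts = []
    for hour in range(24):
        start = hour * 60 + task_minute
        end = start + task_duration
        if overlaps(b_start, b_end, start, end):
            conflicts.append(to_hhmm(start))
    return conflicts


# Backup window from the question; the hourly task values are placeholders.
print(conflicting_runs("19:00", "23:00", task_minute=0, task_duration=5))
# prints ['19:00', '20:00', '21:00', '22:00']
```

Any hourly task will land inside a four-hour backup window several times, so the practical question is whether those runs are heavy enough to matter, which is what watching Task Manager during the backup would show.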
ASKER CERTIFIED SOLUTION
ryansinn
