Windows 2003 Server Standard: NTBackup.exe utility freezes computer(s)

I discovered an issue that is somewhat reproducible (on 2 different
motherboards) concerning Windows 2003 Server Standard when a backup is
run. Basically, anywhere from 1 to 5 minutes into the backup, the entire
system freezes.
I have reproduced it on 3 servers. Two are running dual Opteron 242 CPUs
with 2GB of Corsair PC3200 memory and 2 RAID 1 sets (SATA RAID on an
SI3114 on-board controller) on a Tyan 2882 motherboard. The third has a
single Opteron 142 CPU with 1GB of Corsair PC2700 memory and the same RAID
controller & configuration on a Tyan 2850 motherboard.  All machines are
running Windows 2003 Server Standard and Symantec A/V Corporate Edition 9;
2 are set up as web servers (IIS HTTP, FTP & mail services running) and 1 is
a SQL 2000 box (no other services running).

I've ruled out the network cards as a problem (each board sports both a
Broadcom and an Intel Pro/100 on-board NIC), as I have tried replacing the
on-board NICs with an off-board NIC.
No errors are in any of the event logs. In fact, there are no errors reported
at all of any kind and, unfortunately, no "blue screen of death".
Replacing the motherboard, memory, drives and CPUs has had no effect, either.

The last event before complete lock up (screen frozen, keyboard frozen,
unpingable interfaces, etc) is the following:

Event Type: Information
Event Source: Service Control Manager
Event Category: None
Event ID: 7036
Date:  10/13/2004
Time:  12:13:43 AM
User:  N/A
Computer: KPSWS1
The Microsoft Software Shadow Copy Provider service entered the running
state.

For more information, see Help and Support Center at

From past experience, I feel this is most likely due to some sort of
hardware/driver conflict, but without a memory dump or some sort of error
log, I am at a loss as to how to fix this problem. It seems to be related
to a Shadow Copy bug, although the machines freeze at different points
during the backup (usually somewhere between 500MB and 4GB).
The weird thing is that sometimes it doesn't crash. It does not crash if the
RAID sets are rebuilding, which is what happens when I have to
reset/power-cycle the box; I confirmed this twice. However, because I ran
those backups right after rebooting, the real reason they worked could be
that the reboot cleared the memory. The backups are normally done after the
machine has been idling for a day, which means a memory leak could be
causing the backups to fail.

I think it may be somehow related to the SATA RAID controller by Silicon
Image as their drivers (for the 3114) were not "logo" approved and their
management software runs on Java (which could be another problem - the JVM
has a history of being problematic in my experience) so I went ahead and
ordered a RocketRAID card to see if that is the problem.

I also stress-tested the system with some off-the-shelf tools, with very
large file transfers via Ethernet (10 to 100GB in both directions
simultaneously), extreme memory & paging transfers and a CPU burn-in test
(cooking at 100%) as well as an Ethernet stress test - all at once for 6
hours straight, with no errors or lockups.
More and more it seems to point to a possible bug or conflict with "Shadow
Copy" and Silicon Image controllers (or something else these machines have
in common, Opteron CPUs perhaps).  Since commercial backup products also
utilize shadow copy, I am afraid to blow $1000 or so on another package and
end up with the same results.

Has anyone else had a problem with this?


briancassin Commented:
Volume shadow copies do have a problem with causing freezes and interrupting backups. Try turning off volume shadow copies and see if this resolves your problem.

If any service or application is accessing that server while you are running the backup with volume shadow copies enabled, the shadow copy will not complete successfully and will get stuck in a loop-like state.
What is most likely happening is that, while the backup runs, volume shadow copy is accessing some of the same files you are trying to back up. The two intermittently hit the same file, and then the battle begins: whoever gets there first locks the other out, or they both lose. Either way, a lockup results.

simplyamazing (Author) Commented:
Turned off shadow copy and did another backup, but it locked up after about 12GB (i.e. a complete system-wide freeze). This blows away the shadow copy theory.

I did a "C drive" to "D drive" backup with shadow copy 'on' and it did not lock up, but this could have just been luck.
If not luck, then maybe the network redirector is the culprit.
Drive letter Z: was mapped to a network share on another standalone server for all the test backups done so far (I tried changing the share to another machine running WinXP to rule out the destination machine).

I've set the system to back up to drive D every hour for the next 8 hours. If that succeeds, then the network redirector is most likely the culprit and not shadow copy.

simplyamazing (Author) Commented:
Did 10 more backups from drive to drive without a single lockup utilizing shadow copy on NTbackup.exe.

So the problem is consistently with NTbackup only when it is backing up to a network share, regardless of whether shadow copy is on or off, ruling out shadow copy as the culprit.
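
The comparison above amounts to a small test matrix: local versus network destination, each with shadow copy on and off. As a rough sketch only (not the commands actually used in this thread), the four runs could be generated like this. The file names, job names and the C:\ source are hypothetical; the /J, /F, /M and /SNAP switches are the documented NTBackup command-line options as I understand them, so check them against your own system before relying on this.

```python
from itertools import product
from typing import List


def build_ntbackup_command(source: str, backup_file: str,
                           job_name: str, shadow_copy: bool) -> List[str]:
    """Return an argv list for one NTBackup run (nothing is executed here)."""
    snap = "on" if shadow_copy else "off"
    return [
        "ntbackup", "backup", source,
        "/J", job_name,        # job name shown in the backup log
        "/F", backup_file,     # destination .bkf file
        "/M", "normal",        # backup type
        f"/SNAP:{snap}",       # volume shadow copy on or off
    ]


def build_test_matrix(source: str = "C:\\") -> List[List[str]]:
    """Build every combination of destination and shadow copy setting."""
    # Hypothetical destinations: D: is a local drive, Z: a mapped network share.
    destinations = {
        "local": "D:\\test.bkf",
        "network": "Z:\\test.bkf",
    }
    commands = []
    for (label, path), snap in product(destinations.items(), (True, False)):
        job = f"{label}-snap-{'on' if snap else 'off'}"
        commands.append(build_ntbackup_command(source, path, job, snap))
    return commands


if __name__ == "__main__":
    for cmd in build_test_matrix():
        print(" ".join(cmd))
```

Running each generated command several times and noting which ones freeze gives the same kind of evidence as the manual tests described here.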

Changing network cards (to different brands) has no effect, ruling out NICs/drivers as the problem.
Because local drive backups work in all cases, the SATA RAID is off the hook as the culprit.

Copying/Moving enormous files (10-100GB) manually does not cause any lockups, ruling out any throughput/network issues.

Vigorous hardware testing reveals nothing out of the ordinary (in fact, the TYAN 2882 is probably the most impressive motherboard I have seen! I fully expected a system crash with these tests, but did not get one as I usually do with other brands).

In short: NTbackup.exe on Windows 2003 Server Standard has a compatibility problem with the network redirector (or vice-versa), so I'd better buy a 3rd party backup software package.
There still, however, remains the possibility that the Southbridge motherboard drivers could be an issue, as the on-board NICs are sharing the PCI bus (connected to SB).

simplyamazing (Author) Commented:
Forgot to close this one out.

The problem turned out to be the onboard Broadcom Gigabit Ethernet NICs. Apparently, they have a problem with Windows 2003 Server (driver? interrupts? who knows!).

Disabling the onboard Broadcoms and replacing them with a dual Intel Pro/1000 Server GB adapter fixed the problems.
It may be that the Broadcom is not completely compatible with the Tyan chipset, or a badly written driver is the cause.  I sent them emails over the course of several weeks, trying every possible combination of things that could cause system freezes.  After over 60 separate trial and error tests, the Broadcom chip was isolated as the sole cause of all my brain-racking problems with all the motherboards involved.
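
For anyone doing a similar process of elimination, here is a minimal sketch of the bookkeeping. It records each trial's configuration and outcome, then looks for the conditions shared by every freezing run. The four records below are illustrative only and paraphrase outcomes described in this thread; they are not the actual 60+ test log. The field names are hypothetical, and the shadow copy setting of the last trial is an assumption, since it is not stated above.

```python
from typing import Dict, List, Set, Tuple

Trial = Tuple[Dict[str, str], bool]  # (configuration, did the system freeze?)

# Illustrative records only, paraphrasing results reported in this thread.
TRIALS: List[Trial] = [
    ({"destination": "network", "shadow_copy": "on",
      "onboard_broadcom": "enabled"}, True),
    ({"destination": "network", "shadow_copy": "off",
      "onboard_broadcom": "enabled"}, True),
    ({"destination": "local", "shadow_copy": "on",
      "onboard_broadcom": "enabled"}, False),
    # shadow_copy value here is an assumption; the thread does not state it.
    ({"destination": "network", "shadow_copy": "on",
      "onboard_broadcom": "disabled"}, False),
]


def common_failure_conditions(trials: List[Trial]) -> Set[Tuple[str, str]]:
    """Return the (factor, value) pairs present in every freezing trial."""
    failing = [set(cfg.items()) for cfg, froze in trials if froze]
    if not failing:
        return set()
    return set.intersection(*failing)


def explains_all_trials(conditions: Set[Tuple[str, str]],
                        trials: List[Trial]) -> bool:
    """True if no non-freezing trial satisfies every condition."""
    return all(not conditions <= set(cfg.items())
               for cfg, froze in trials if not froze)


if __name__ == "__main__":
    conditions = common_failure_conditions(TRIALS)
    print(sorted(conditions))
    print(explains_all_trials(conditions, TRIALS))
```

With these sample records, the shared conditions are a network destination together with the onboard Broadcom being enabled, and shadow copy drops out because it appears with both settings among the freezing runs.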

What a friggin' nightmare!  I'm afraid to buy any more system boards utilizing Broadcom chips for fear this will happen again (and, unfortunately, just about every Opteron board maker out there uses them)!

Question has a verified solution.
