I have a problem with network backups on Windows Server 2003 (it does not matter what backup software I use: MS Backup, Veritas, etc.). Every so often, right in the middle of a backup, the computer initiating the backup (be it a "pull" or "push" backup) locks up hard. The machine seizes completely without a BSOD and is totally unresponsive (as a hardware issue would suggest). It happens every couple of backups, or sometimes twice in a row, usually after transferring 1 or more gigabytes of data. I've swapped out every hardware component I could, replaced NICs, cables, and switches, and the problem remains.
Because the problem can be reproduced on 3 identically configured machines, I think I can rule out faulty components as the culprit (2 of the machines used Tyan S2882 motherboards with dual Opterons, and 1 machine used a Tyan 2850 motherboard). Basically, I ran backups from each machine to another machine and had them all lock up after between 1 and 4 backups (I tested these for almost 2 weeks to get the problem to reproduce). Incidentally, this ONLY happened on the gigabit Ethernet port(s).
I suspected it was the Broadcom NIC (the only thing common to both motherboards besides the SATA controllers), so I replaced the NICs with Intel PRO Server 1000 gigabit cards. After the replacement, I ran about 6 backups in a row, and the machine did not lock up.
I thought my problem was solved, so I installed the servers and that was that.
I was wrong. Now, occasionally, the servers do the exact same thing (the Broadcom NICs are disabled in the BIOS, so they are completely ruled out):
about once every couple of backups, the system locks up hard. (arrgghh!)
Anyway, this problem is related to either the Opterons, the Windows Server 2003 drivers (all the drivers were "approved"), or the Tyan mobos. Because the lockups are so "fatal", it looks like a hardware problem (in my experience, hardware issues seize machines without a BSOD and leave the screen black).
I've heard that, in rare instances, a driver can be the culprit.
I don't expect an answer to this problem as I've stumped all the manufacturers of this equipment: Broadcom, Tyan (denial), AMD (denial), Silicon Image (denial), etc. They blame hardware or drivers or "the other guy".
I'm just trying to find ANYONE out there who's had this same problem with unexpected lockups ONLY during a network backup on W2K3.
To eliminate external network equipment as the culprit, I hooked up a crossover cable between machines (with the same results).
I've tried different power supplies, disabling different BIOS settings, etc.
I've disabled a myriad of combinations of services in W2K3 (like Shadow Copy) with no luck.
My original "process of elimination" over a 2-week period led me to the Broadcom NIC as the culprit, but now I know that is not the case.
Since W2K3 logs no errors or warnings in the Event Logs, I'm completely in the dark as far as troubleshooting goes. (If only I could get a BSOD with a memory dump, I could pinpoint exactly what is causing this.)
Right now, I would just like to eliminate Windows Server 2003 as the culprit. I can swap hardware vendors more easily than I can swap W2K3 (I am avoiding Tyan equipment for now, but I am still not certain they are the cause).
Anyone experiencing lockups during a gigabit network backup? If so, let me know what you are running. I have to get to the bottom of this nightmare once and for all.
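
For anyone willing to compare notes, here is a rough sketch of a load test that takes the backup software out of the picture entirely: it just pushes a fixed number of bytes over a plain TCP connection and counts them on the other end. This is only an illustration, not something the tests above used. It assumes Python 3 is available on both machines, and the function names, the 64 KB chunk size, and the loopback self-check are all my own placeholders.

```python
import socket
import threading

CHUNK = 64 * 1024  # bytes per send; an arbitrary choice for this sketch


def receive_all(server_sock, result):
    """Accept one connection and count every byte until the sender closes."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    result.append(total)


def send_bytes(host, port, total_bytes):
    """Push total_bytes of zero-filled data to host:port; return bytes sent."""
    payload = bytes(CHUNK)
    sent = 0
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    return sent


def loopback_check(total_bytes):
    """Self-check on 127.0.0.1; this does NOT exercise the physical NIC."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = []
    receiver = threading.Thread(target=receive_all, args=(server, result))
    receiver.start()
    sent = send_bytes("127.0.0.1", port, total_bytes)
    receiver.join()
    server.close()
    return sent, result[0]
```

To load the gigabit link itself, the receiving half would run on one server and `send_bytes` on the other, pointed at the receiver's address, with `total_bytes` set to several gigabytes. If a machine still locks up under this kind of traffic, the backup software is off the hook; if it never does, that points back at something specific to the backup path (disk I/O, SATA controller, Shadow Copy, etc.).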