Greetings,
Hardware: Intel 845 Chipset Motherboard, Pentium 4 2.0 GHz Processor, 1.5GB SDRAM PC133, RocketRAID 133 PCI Card, three hard drives (2 Maxtor, 1 Seagate), 400W PS
Software: Microsoft Windows Server 2003 Standard Edition with SP2 (one of two domain controllers)
Background: Every couple of days or so for the last two months (since upgrading to the current processor from a P4 1.6GHz), the system has stopped responding to keyboard input, mouse input, and network access, and the screen goes blank. A simple reboot brings it back online with no errors reported in the registry. I did have software RAID 1 enabled through Windows Disk Manager so, thinking this was the issue, I broke the mirror. The problems continued. Heat does not appear to be the problem, as the ambient temperature is fairly constant at between 86 and 90 degrees. The processor has thermal grease, a heat sink, and a fan, and there are four other fans whirring in the case. I recently added the hardware RAID card to the system and started a system rebuild (after first demoting to a member server), but that process failed with STOP error 0x00000050 (PAGE_FAULT_IN_NONPAGED_AREA). The error occurred while trying to access the registry, which turned out to be corrupt and had to be rebuilt. I got past that error, and the system was rebuilt successfully and promoted again to a domain controller. However, any "excessive" use of the system at all results in a STOP error 0x00000050. By "excessive," I mean using Windows Explorer, performing updates, et cetera. If I just let the system sit with access only at the network level, it doesn't crash, but it does lock up again. Rebooting brings it back with no errors, but it either locks up again after a period of time or crashes with 0x50 when used interactively.
Next Steps: I have WinDbg and the symbols loaded on a separate machine and I have the minidumps available to load. My question is twofold:
- What should I be looking for in the dump files?
- I am fairly certain that this is hardware-related, but where do I start?
The hard drives are clean with no errors (via SpinRite) and the memory passes all tests via the Microsoft Memory Diagnostics and MemTest 3.0. I have the old processor, which I could put back in for troubleshooting, but that would be a trial-and-error approach. Related to my first question above, is there anything in the minidumps that could point me to the failing hardware component? The BIOS is set to auto-lock the CPU speed and the memory frequency is auto-locked at 133 MHz.
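As a rough sketch of what comparing the minidumps might look like: if the output of WinDbg's `!analyze -v` for each dump is saved to a text file, the "BugCheck" and "Probably caused by" lines can be pulled out and tallied. A different blamed module in every dump is often taken as a hint toward hardware rather than one bad driver, though that is a rule of thumb and not proof. The function names and the sample text below are hypothetical and made up for illustration; the sample is not output from this machine.

```python
import re
from collections import Counter

# Two lines that WinDbg's "!analyze -v" prints for a kernel dump.
BUGCHECK_RE = re.compile(r"^BugCheck\s+([0-9A-Fa-f]+),\s*\{([^}]*)\}", re.MULTILINE)
CAUSE_RE = re.compile(r"^Probably caused by\s*:\s*(\S+)", re.MULTILINE)


def summarize(analysis_text):
    """Return (bug check code, argument list, blamed module) for one saved analysis."""
    bugcheck = BUGCHECK_RE.search(analysis_text)
    cause = CAUSE_RE.search(analysis_text)
    code = int(bugcheck.group(1), 16) if bugcheck else None
    args = [a.strip() for a in bugcheck.group(2).split(",")] if bugcheck else []
    module = cause.group(1) if cause else None
    return code, args, module


def tally_modules(analysis_texts):
    """Count how often each module is blamed across several saved analyses."""
    return Counter(summarize(text)[2] for text in analysis_texts)


# Made-up sample in the general shape of "!analyze -v" output (illustration only).
SAMPLE = """\
BugCheck 50, {e1a2b000, 0, 8089c5d2, 1}

Probably caused by : ntkrnlpa.exe ( nt!ExampleRoutine+1a )
"""

if __name__ == "__main__":
    code, args, module = summarize(SAMPLE)
    print(hex(code), args, module)
```

For bug check 0x50, the first argument is the address that was referenced and the second indicates a read (0) or a write (1), so comparing those values across dumps alongside the blamed module may show whether the crashes share a pattern.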
Any help that could be offered is very much appreciated. Take care.