nasupport1 asked:

Hardware Malfunction - NMI Parity Check / Memory Parity Error

I have a Dell Precision 210 (2 PIII processors, 600 MHz, 4x128MB ECC RAM) running Windows Server 2003.

Almost every day, it blue screens with the message:

Hardware Malfunction
Call your hardware vendor for support
NMI: Parity Check / Memory Parity Error

Memory passes the recommended Windows Memory Diagnostic found here: http://oca.microsoft.com/en/windiag.asp

I've replaced the memory (they are all matching sticks), run the memory tests, and I still get the blue screen.

I've replaced the box (moved drives and memory to a new Precision 210 box), run the memory tests, and I still get the blue screen.

Any ideas?
John

So you replaced the memory (and continue to use the new memory) and moved the new memory and existing hard drive to a new box (different motherboard and peripherals) and get the same error.

So it must be the operating system throwing up this error. Try downloading and installing all new drivers for this OS (video, audio, network cards, chipset and so on). ... Thinkpads_User
cavp76

Have you tried memtest86 (http://www.memtest86.com/)? It has many options for configuring memory tests. Definitely there's a bad module; if no error is detected, try booting the machine with one stick removed (or two, depending on whether it supports an odd number of RAM sticks) and see whether it BSODs again.
Might be a long shot but have you tried looking for upgraded BIOS firmware and drivers?
I only suggest it because you've already done the heavy lifting and changed the motherboard and RAM...
I am not sure about the overall fit for System File Checker and this situation, but in addition to drivers (my earlier post), try running (from a command prompt) SFC /SCANNOW. Let it complete and restart.

... Thinkpads_User
nasupport1

ASKER

@thinkpads_user - It could be the OS, or an application that's causing it.  I'll explore updating drivers for the hardware, but despite its age, it's up-to-date.  I'll try the SFC, as well.

@cavp76 - I have a copy of Memtest86, and I guess I'll try it.  It doesn't seem to be a memory issue despite the memory error.  I've systematically removed and replaced RAM, and I still get the message.  In fact, I can remove and reseat the RAM, and it will boot up.  Then, maybe a day or two later, it blue screens again.

@phoenixke - I'll try the driver updates, but the BIOS may be tough - it's an older server.
That's different... if it runs for a day or two and then BSODs again, I'd suspect a power supply / power problem. These machines are more than 10 years old, and leaky capacitors could be acting up. Try replacing the PSU. Is this server behind a UPS?
Running Memtest-86 v3.5 now - no issues through the first 5 tests.

All capacitors look fine in both of the boxes (old and new), and it crashes on both PSUs.  I can see if I have a newer PSU to try.  It is plugged into an older APC Back-UPS Pro 420.  I'll see if swapping that out makes a difference.
I struggle a little to believe that this is a hardware issue. Why? You replaced the memory and changed machines, so you would have to have the same hardware error in both machines.

More likely, I think, is an operating system corruption.  ... Thinkpads_User
@thinkpads_user - I'm not ruling a corrupt OS out, or an application causing the error.  I started with hardware troubleshooting because of the nature of the error message: "Hardware Malfunction".  But I do see what you're saying - the hard drive is a common denominator.  Event logs don't indicate anything out of the ordinary for this behavior.
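The common-denominator reasoning in this exchange can be sketched as a set intersection: whatever was present in every configuration that crashed stays a suspect. This is only a minimal Python sketch. The component labels are illustrative, and it assumes (the thread does not say so explicitly) that the add-in NIC and the UPS stayed in use alongside the drives in every configuration.

```python
# Each crashing configuration is the set of components in use at the time.
# Labels are illustrative. That the NIC and UPS were present in every
# configuration is an assumption, not something stated in the thread.
crashing_configs = [
    {"box A", "PSU A", "RAM set 1", "hard drive + OS", "add-in NIC", "UPS"},
    {"box A", "PSU A", "RAM set 2", "hard drive + OS", "add-in NIC", "UPS"},
    {"box B", "PSU B", "RAM set 2", "hard drive + OS", "add-in NIC", "UPS"},
]


def common_suspects(configs):
    """Return the components present in every crashing configuration."""
    if not configs:
        return set()
    return set.intersection(*configs)


print(sorted(common_suspects(crashing_configs)))
```

Parts that were swapped out drop from the result; anything that travelled with the drives remains and is worth disabling or replacing next.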
Agreed... try smartmontools; it will report everything it can extract from the disk about its S.M.A.R.T. status. The examples are usually enough to get a quick view of the state of the HDD.
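As a rough illustration of the smartmontools suggestion, the sketch below runs `smartctl -H` (the overall health check) and reads the verdict line from its output. The helper names `parse_health` and `check_drive` are made up for this example, the device name is a placeholder that differs between systems, and smartctl usually needs administrator rights.

```python
import subprocess


def parse_health(output):
    """Return True/False for a pass/fail verdict, or None if none is found."""
    for line in output.splitlines():
        # ATA drives print "... self-assessment test result: PASSED";
        # SCSI drives print "SMART Health Status: OK".
        if ("overall-health self-assessment test result" in line
                or "SMART Health Status" in line):
            _, _, verdict = line.partition(":")
            return verdict.strip() in ("PASSED", "OK")
    return None


def check_drive(device):
    """Run `smartctl -H` on a device and parse the result.

    `device` is a placeholder such as "/dev/sda"; adjust it for your system.
    """
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return parse_health(result.stdout)
```

A call such as `check_drive("/dev/sda")` returns `None` when no verdict line is present, which usually means the drive does not report S.M.A.R.T. data or the command failed. A passing verdict does not rule the disk out; it only says the drive's own self-assessment has not tripped.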
The server has a Realtek Gigabit Ethernet card that does not list Server 2003 as a supported OS, even though its drivers installed on it.  The install appeared to complete correctly, and the card has worked normally.  I have disabled it to rule it out as a culprit for this issue.  I will keep the ticket updated.
I think we gave a decent answer here (corrupt OS).  ... Thinkpads_User
ASKER CERTIFIED SOLUTION
nasupport1
I investigated the system requirements of the Realtek Gigabit NIC, and it does not support the currently installed Operating System.

Since removing the incompatible NIC, the system has remained up and functional, with no more blue screens.