UnifiedIT

Server Freezing

I have a Compaq DL380 w/ SCSI RAID 5, a 1 gig NIC, and a SCSI adapter added for a tape library. The server is running Veritas as its only application.
I did have Virtual Server 2005 installed but removed it during troubleshooting.
The server will suddenly freeze, and when it does I have to do a hard boot to get it back up. When it freezes nothing responds: no keyboard, no mouse, no Ctrl Alt Delete, and I cannot ping the box (no network connection). It is completely unresponsive. The event logs are clean, with no errors at the time of the freeze. After a hard reboot there is no telling how long it will stay responsive. Sometimes it will be up for 1 week and sometimes 15 minutes. Usually no errors pop up and it is still at the Ctrl Alt Delete screen, but this morning it started giving me this error.

***Hardware Malfunction***
Call your vendor for support
NMI: Parity Check / Memory Parity Error
***The System Has Halted***

At bootup I get this alert as it is running its checks:
No SCSI Bios installed
I do not know what this means.

This morning I swapped out all the RAM, and if it freezes again I will remove the Gig card. I am guessing that this is a hardware malfunction, so I am going to start swapping out hardware to pinpoint the issue.

I have no clue why this has started and am not sure what I should be trying to get it resolved.
Where do I start with issues like this???
Any advice is greatly appreciated.
ASKER CERTIFIED SOLUTION
Lee W, MVP
SOLUTION
Sorry, missed your comment leew...
sciwriter

Many possibilities.  It could be that the RAM is too slow for the system.  Make sure the RAM in the system is at least CAS 2.5 -- CAS 3 is too slow, and this MB might require CAS 2 to run right.  Check the BIOS for the RAM setting -- make sure the refresh is set to standard, not too fast.  If need be, do a "load power on defaults" in the BIOS, which will set the RAM refresh to the "safest speed".

After you have done all that, go through the other options.  Don't discount the possibility that when Windows was installed, something went into the registry wrong.  This could be something as innocuous as the ACPI power management state of the motherboard, but unless that is corrected, it will not work right, no matter what.
I have an answer for this one and you're going to love it.

I had an HP (Compaq Proliant) DL380 Generation 3 which hung when I started doing network IO.

In fact, I had three of them do the same thing. I worked it from all ends.

I called my guys. They didn't help.

I installed firmware drivers. HBA, BIOS, PERC 3i... Ran diagnostics til I was blue in the face.

Finally I called HP. They told me to call Microsoft; it was an OS issue.

I explained to him it was a hardware issue. Cited empirical evidence. Told him to ship me a motherboard.

He told me he could get fired if he did that. I called his bluff. He said he could at least get a demerit.

So I made him a bet. I bet him 25$ (in the form of a thinkgeek.com gift cert) the problem was the motherboard. I told him if I was right he owed me nothing.

He shipped the board, it worked, I claimed victory.

It's a manufacturing defect. Call HP. Demand a system board under warranty. I bet you 25$ that's it. <g>

No surprise, HP and Dell MBs are full of these kinds of problems.  If you are right, it saves the questioner a lot of trouble, as there are about 25 other possibilities, but it sounds like a poker game to get a MB??
Eh. Welcome to the cost saving world of call centers. I got Canada when I called. I'm 90% positive. The secondary problem also indicates motherboard / bios to me. So... It's worth a shot.

ASKER

I put 4 gig of new RAM in the server and it has been running smoothly for 8 days. Easy fix compared to a motherboard.
Thanks for all the comments
UnifiedIT
Oh well, it could have been the mobo. =) Was in my case x3.

Just goes to show you -- you can't rely on machines being brand new to be perfect.

Keep in mind the G4s just rolled out. It's been better IMHO.
Good to hear UnifiedIT ;)

LucF