random power down / Tried everything I can think of

I have never had this problem before in 15 years... and just can't figure it out...

I have a server I built that powers down at random intervals.

This is NOT, I repeat NOT, an OS issue.
This occurs in the BIOS setup, at a DOS prompt, inside win2k, etc.

Here's what I have done:
1. replaced MB
2. replaced power supply [two different ones]
3. tried with all combinations of 1/2 CPUs in both sockets
4. tried with all combinations of DIMMs 1,2,3,4 in all slots [possible anyway]
5. set up with the very BARE MINIMUM of components to start [MB+PS+1 DIMM + 1 CPU] and NOTHING else.
6. Flashed all of last three versions of BIOS
7. Monitored heat/fan speed/etc of all components via BIOS and OS Tyan hardware monitors and found nothing O.O.O...

based on 3 & 4 I find it hard to believe that BOTH CPUs and all four DIMMs could have a problem that could cause this...
based on 1, 2 and 5, I'm not sure what else to try
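
One way to make the reasoning from steps 3 and 4 concrete is to enumerate the configurations and look for a component that is present in every failing one. The Python below is only a minimal sketch: the labels (CPU_A, socket_1, DIMM_1 and so on) and the helper common_components are hypothetical names chosen for illustration, and the two failing configurations at the end are made-up examples, not logged results.

```python
from itertools import combinations

# Hypothetical labels for the parts being swapped.
cpus = ["CPU_A", "CPU_B"]
sockets = ["socket_1", "socket_2"]
dimms = ["DIMM_1", "DIMM_2", "DIMM_3", "DIMM_4"]

# One CPU at a time, tried in either socket.
single_cpu_configs = [(cpu, sock) for cpu in cpus for sock in sockets]

# Every non-empty set of DIMMs (slot positions are ignored here).
dimm_sets = [
    combo
    for size in range(1, len(dimms) + 1)
    for combo in combinations(dimms, size)
]


def common_components(failing_configs):
    """Return the components that appear in every failing configuration."""
    sets = [set(cfg) for cfg in failing_configs]
    return set.intersection(*sets) if sets else set()


# Made-up example: the box powered down with two disjoint CPU/DIMM pairings.
failures = [("CPU_A", "DIMM_1"), ("CPU_B", "DIMM_2")]
suspects = common_components(failures)
print(len(single_cpu_configs), len(dimm_sets), suspects)
```

If the set of suspects comes back empty, no single CPU or DIMM can explain every failure, which points at something shared by all configurations (board, supply, case, wiring).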

Here's what the system has [built]:
1. Tyan Tiger MP 2468 MB
2. 2x AMD Athlon MP 1800+ CPUs
3. 4x Corsair Registered ECC 512MB DDR266 DIMMs
4. Ci Designs 2100 2U server enclosure built for dual AMD + Tyan MB
5. EMACS P2G-6460P 450W PS for Ci2100/24 pin dual Athlon MBs [Tyan approved]
6. Mylex AcceleRAID 170 PCI
7. (4) Seagate Cheetah 36GB 10k rpm SCA U160 HD in SCA trays on backplane
8. plugging power into different sources... [diff. UPSes, different outlets, panels etc.]

Any ideas????

500 points if you gimme something I haven't tried yet that works!!!!!!!!!!!!!!!

Adrian Dobrota (Networking Engineer) Commented:
All that comes to mind right now is a short on the case, probably the motherboard. You'll have to check carefully for that.
Have you ruled out the power cord and the power switch, which may be shorting the power pins by mistake?
Have you checked your site power?  Perhaps your AC power is not steady and the machine is shutting down when the power fails momentarily.  Try a UPS.
I have seen issues where one bad component "kills" others which cause failures. Assuming you made the changes you mentioned in order, your first power supply could have fried something on both mobos making them both unstable.  You could try replacing both the power supply and the mother-board simultaneously.

If that doesn't work, you could try replacing the case.  It sounds like you tried everything else already.

BrianBaley (Author) Commented:
more details in regard to your posts:

jhance: I have tried multiple UPSes and surge protectors. The power is coming from the same rack of UPSes all other 9 servers are running off of [including swapping with the UPS outlet of known good systems] and trying other small 450VA standalones etc.

terageek: When I got the RMA'd new MB, I plugged it into my spare {unused} PS, not the old one, as the PS was my very first suspicion after seeing it power down even in the BIOS, etc.

kronostm: I have done this each time I have removed the board and was careful to make sure nothing fell down in there etc... the weird thing is that it was running for quite a time before.... at any rate, I think I will remove the board and PS from the rack case and try running it outside of the box.

I wish I had a backup case ;-) but 750 bucks is hard to come by! find out soon...

Is sabotage a possibility?  Perhaps some disgruntled person is shutting this thing down, either directly or remotely just to torque you off.
Not giving up yet...

I noticed above you mentioned that you built a bare system with only MB+PS+1 DIMM + 1 CPU.  Did you have a HD in that setup?  How about case fans?  Any single electrical device could have a short, draw too much power and cause a "brownout" in the system which could power it down.

Alternatively, I might think that it is a flaky power switch, but that should cause the system to randomly turn on as well as off.
BrianBaley (Author) Commented:
jhance: no. I have sat there and stared at the fan rpms + cpu temps and watched it shutdown before my eyes while in the BIOS.

terageek: let's see... I have tried it with each CPU separately [and no other fans etc], in both sockets alternately [4 combinations] so both fans would have to be shorted...

as far as the brownout situation, yeah, maybe, but it would be pretty strange for it to do it both after 2 minutes OR after 5 hours... if it was a component that was shorted one would think it would take relatively the same amount of time to fail, overheat, saturate, etc.
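
To put numbers on how uneven the time-to-shutdown really is, one option is to have the machine write a timestamp once a minute while an OS is up, then split that log into runs afterwards. The Python below is a rough sketch under that assumption: run_lengths and the sample timestamps are hypothetical, the heartbeat writer itself is not shown, and it cannot capture shutdowns that happen in the BIOS setup.

```python
from datetime import datetime, timedelta


def run_lengths(heartbeats, max_gap=timedelta(minutes=2)):
    """Split sorted heartbeat timestamps into runs and return each run's duration.

    A gap longer than max_gap between consecutive heartbeats is treated as
    a power-down followed by a later restart.
    """
    runs = []
    start = prev = None
    for ts in heartbeats:
        if start is None:
            start = prev = ts
        elif ts - prev > max_gap:
            runs.append(prev - start)  # close the run that just ended
            start = prev = ts
        else:
            prev = ts
    if start is not None:
        runs.append(prev - start)  # close the final run
    return runs


# Made-up sample: one run of 2 minutes, then one run of 5 hours.
t0 = datetime(2002, 1, 1, 0, 0)
beats = [t0 + timedelta(minutes=m) for m in (0, 1, 2)]
beats += [t0 + timedelta(minutes=60 + m) for m in range(301)]
durations = run_lengths(beats)
print([str(d) for d in durations])
```

Run lengths that cluster around one value would suggest something heating up or saturating; run lengths scattered from minutes to hours fit an intermittent contact better.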

power switch: hhhmmm... this is one of those "membrane" switches.... but it seems to act normal when in use, i.e., it seems to take exactly the same amount of pressure and distance during pressing to make contact, etc... if the physical [membrane] switch were flaky you'd think it would power on randomly too....  

I'll remove it [switch] and short the green PS wire to ground and see what happens......

Adrian Dobrota (Networking Engineer) Commented:
That's getting weirder dude ...
I know RAM usually doesn't cause reboots, but try replacing it with some other brand than Corsair. Maybe you can borrow a Kingston or something. I'm not saying you have a bad RAM stick, since you swapped them, but maybe there is an incompatibility between the Corsair memory and your mobo.

BrianBaley (Author) Commented:
kronostm: the server ran as is with these 4 sticks for weeks before this happened....
BrianBaley (Author) Commented:
I promise to let you guys know what I find and give you some points regardless....

I have to finish a fresh linux install, then I'll get back to it.....
Here is a question for you....  In your server BIOS do you have options for the different power save functions?  If so, have you checked them to see if maybe your hard drives are set to shut down at a certain time, etc.?  This does not sound like an issue where you have bad hardware components at all.  I also have thought of external power fluctuations but do not think that would be the cause either.  I would strongly suggest you look into your BIOS settings. I would personally test this by setting some power save settings when the OS is up and seeing if they perform as you set them.  If they do not, then it is definitely a setting in your BIOS.
Adrian Dobrota (Networking Engineer) Commented:
Brian ... can you provide some details too, please?
BrianBaley (Author) Commented:
Wish I could.... but I had to back-burner the project....

I still have yet to take it out of the case [rackmnt] and run it without the switch [shorting the power-on pin at the connector]...

When I get the chance I will forward info. It's the only thing it could be....
BrianBaley (Author) Commented:

well, I removed the power switch and shorted the connection and it has been running for two days with no shutdown....

For reference, the case is a Ci Designs 2100. It utilizes a small "membrane" on-off switch. The switch is mounted on a small 1"x2" PCB behind the faceplate and the only other components on the board are three LEDs, a small momentary [reset] switch and two resistors....

My guess is the switch was shorting....

Maybe that's why no one has used them since the T.Sinclair ;-)

Adrian Dobrota (Networking Engineer) Commented:
Probably. So my guess was probably right. You should have accepted my first post as the answer, since that probably led you to the solution. I'm saying this because this Q may be used by others in the future to find solutions to similar problems, and an accepted answer is usually the one that points towards a good suggestion. Please remember this if you post other Qs in the future.

Good luck

