Solved

random power down / Tried everything I can think of

Posted on 2003-11-19
Last Modified: 2010-04-26
Hey,
I have never had this problem before in 15 years... and I just can't figure it out...

I have a server I built that powers down at random intervals.

This is NOT, I repeat NOT, an OS issue.
This occurs in the BIOS setup, at a DOS prompt, inside Win2k, etc.

Here's what I have done:
1. Replaced the MB
2. Replaced the power supply [two different ones]
3. Tried all combinations of 1/2 CPUs in both sockets
4. Tried all [possible] combinations of DIMMs 1, 2, 3, 4 in all slots
5. Set up with the very BARE MINIMUM of components to start [MB + PS + 1 DIMM + 1 CPU] and NOTHING else
6. Flashed each of the last three BIOS versions
7. Monitored heat/fan speed/etc. of all components via the BIOS and the Tyan OS hardware monitors and found nothing out of the ordinary...

Based on 3 & 4, I find it hard to believe that BOTH CPUs and all four DIMMs could have a problem that could cause this...
Based on 1, 2 and 5, I'm not sure what else to try.

Here's what the system has [as built]:
1. Tyan Tiger MP 2468 MB
2. 2x AMD Athlon MP 1800+ CPUs
3. 4x Corsair Registered ECC 512MB DDR266 DIMMs
4. Ci Designs 2100 2U server enclosure built for dual AMD + Tyan MB
5. EMACS P2G-6460P 450W PS for Ci2100/24-pin dual Athlon MBs [Tyan approved]
6. Mylex AcceleRAID 170 PCI
7. (4) Seagate Cheetah 36GB 10k rpm SCA U160 HDs in SCA trays on backplane
8. Also tried plugging power into different sources [different UPSs, different outlets, panels, etc.]

Any ideas????

500 points if you gimme something I haven't tried yet that works!!!

Question by:BrianBaley
16 Comments
 
LVL 14

Expert Comment

by:kronostm
ID: 9785154
All that comes to mind right now is a short to the case, probably at the motherboard. You'll have to check carefully for that.
Have you ruled out the power cord and the power switch, which may be shorting the power pins by mistake?
 
LVL 32

Expert Comment

by:jhance
ID: 9786081
Have you checked your site power? Perhaps your AC power is not steady and the machine is shutting down when the power fails momentarily. Try a UPS.
 
LVL 3

Expert Comment

by:terageek
ID: 9792303
I have seen issues where one bad component "kills" others, which then cause failures. Assuming you made the changes you mentioned in order, your first power supply could have fried something on both mobos, making them both unstable. You could try replacing both the power supply and the motherboard simultaneously.

If that doesn't work, you could try replacing the case. It sounds like you tried everything else already.
 
LVL 1

Author Comment

by:BrianBaley
ID: 9797882
More details in regard to your posts:

jhance: I have tried multiple UPSs and surge protectors. The power is coming from the same rack of UPSs that all 9 other servers are running off of [I also swapped in the UPS outlet of known-good systems], and I tried other small 450VA standalones, etc.

terageek: When I got the RMA'd new MB, I plugged it into my spare {unused} PS, not the old one, as the PS was my very first suspicion after seeing it power down even in the BIOS, etc.

kronostm: I have checked for this each time I have removed the board, and was careful to make sure nothing fell down in there, etc... The weird thing is that it was running for quite some time before... At any rate, I think I will remove the board and PS from the rack case and try running it outside of the box.

I wish I had a backup case ;-) but 750 bucks is hard to come by! I'll find out soon...

 
LVL 32

Expert Comment

by:jhance
ID: 9798128
Is sabotage a possibility?  Perhaps some disgruntled person is shutting this thing down, either directly or remotely just to torque you off.
 
LVL 3

Expert Comment

by:terageek
ID: 9798200
Not giving up yet...

I noticed above you mentioned that you built a bare system with only MB+PS+1 DIMM + 1 CPU.  Did you have a HD in that setup?  How about case fans?  Any single electrical device could have a short, draw too much power and cause a "brownout" in the system which could power it down.

Alternatively, I might think that it is a flaky power switch, but that should cause the system to randomly turn on as well as off.
 
LVL 1

Author Comment

by:BrianBaley
ID: 9799154
jhance: no. I have sat there and stared at the fan rpms + CPU temps and watched it shut down before my eyes while in the BIOS.

terageek: let's see... I have tried it with each CPU separately [and no other fans etc.], in both sockets alternately [4 combinations], so both fans would have to be shorted...
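
For anyone retracing this step, the four single-CPU test configurations can be listed with a short Python sketch. The CPU and socket labels are hypothetical placeholders, not names from the board's manual:

```python
from itertools import product

# Hypothetical labels for the two Athlon MP chips and the two sockets.
cpus = ["CPU A", "CPU B"]
sockets = ["socket 1", "socket 2"]

# Every pairing of one CPU with one socket: 2 x 2 = 4 configurations.
single_cpu_configs = list(product(cpus, sockets))

for cpu, socket in single_cpu_configs:
    print(f"{cpu} alone in {socket}")
```

If the shutdown shows up in all four runs, a fault tied to one particular CPU, CPU fan, or socket becomes unlikely.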

As far as the brownout situation, yeah, maybe, but it would be pretty strange for it to do it both after 2 minutes OR after 5 hours... if it was a component that was shorted, one would think it would take relatively the same amount of time to fail, overheat, saturate, etc.

power switch: hhhmmm... this is one of those "membrane" switches.... but it seems to act normal when in use, i.e., it seems to take exactly the same amount of pressure and distance during pressing to make contact, etc... if the physical [membrane] switch were flaky you'd think it would power on randomly too....  

I'll remove it [switch] and short the green PS wire to ground and see what happens......

 
LVL 14

Accepted Solution

by:kronostm (earned 500 total points)
ID: 9802412
That's getting weirder, dude...
I know RAM usually doesn't cause reboots, but try replacing it with some brand other than Corsair. Maybe you can borrow a Kingston or something. I'm not saying you have a bad RAM stick, since you swapped them, but there may be an incompatibility between the Corsair memory and your mobo.
 
LVL 1

Author Comment

by:BrianBaley
ID: 9802750
kronostm: the server ran as is with these 4 sticks for weeks before this happened....
 
LVL 14

Expert Comment

by:kronostm
ID: 9803049
AARRGGHH  ! ! !
 
LVL 1

Author Comment

by:BrianBaley
ID: 9803217
I promise to let you guys know what I find and give you some points regardless...

I have to finish a fresh Linux install, then I'll get back to it...
 

Expert Comment

by:nealbing
ID: 9808651
Here is a question for you... In your server BIOS, do you have options for the different power-save functions? If so, have you checked them to see if maybe your hard drives are set to shut down at a certain time, etc.? This does not sound like an issue where you have bad hardware components at all. I also thought of external power fluctuations, but I do not think that would be the cause either. I would strongly suggest you look into your BIOS settings. I would personally test this by setting some power-save settings when the OS is up and seeing if they perform as you set them. If they do not, then it is definitely a setting in your BIOS.
 
LVL 14

Expert Comment

by:kronostm
ID: 9887508
Brian ... can you provide some details too, please?
 
LVL 1

Author Comment

by:BrianBaley
ID: 9888550
Wish I could.... but I had to back-burner the project....

I have yet to take it out of the case [rackmount] and run it without the switch [shorting the power-on pins at the connector]...

When I get the chance I will forward info. It's the only thing it could be....
 
LVL 1

Author Comment

by:BrianBaley
ID: 9899668
kronostm:

Well, I removed the power switch and shorted the connection, and it has been running for two days with no shutdown...
Bizarre.

For reference, the case is a Ci Designs 2100. It utilizes a small "membrane" on-off switch. The switch is mounted on a small 1"x2" PCB behind the faceplate, and the only other components on the board are three LEDs, a small momentary [reset] switch and two resistors...

My guess is the switch was shorting....

Maybe that's why no one has used them since the T.Sinclair ;-)

 
LVL 14

Expert Comment

by:kronostm
ID: 9903350
Probably. So, my guess was probably right. You should have accepted my first post as the answer, since that probably led you to the solution. I'm saying this because this Q may be used by others in the future to find solutions to similar problems, and an accepted answer is usually the one that points towards a good suggestion. Please remember this if you post other Qs in the future.

Good luck

Kronos
