Intermittent PC shut off - due to CPU overheating

Hi guys,

I've recently taken delivery of a new custom PC. The specs are:

Case:      Lian Li 2100
Cooling:   Water-cooling of CPU, GPU and Northbridge
CPU:       QuadCore Intel Core 2 Extreme QX9650, overclocked to 3866 MHz (11 x 351)
M/Board:   Asus P5K Premium/WiFi-AP (Black Pearl edition)
Chipset:   Intel Bearlake P35
PSU:       Hiper HPU-4M880 (880 watts)
RAM:       2GB Crucial Ballistix 1066MHz DDR2 RAM (it's reported as: 2 x 1 GB DDR2-800 DDR2 SDRAM (5-5-5-18 @ 400 MHz) (4-5-5-15 @ 333 MHz))
GFX:       BFG Overclocked Nvidia GeForce 8800GTX 768MB
HDD:       2 x WD Raptor 150GB SATA in a RAID0 configuration
Sound:     Creative SB X-Fi Xtreme Audio Sound Card

OS:        Windows Vista Home Premium
BIOS:      version 0404 (American Megatrends)

I've had one or two problems with the computer, but most notably I had an issue with the CPU apparently overheating.

I turned on the computer first thing in the morning (so it had been off all night) and I'd barely logged into Windows when it just shut off without warning. I tried again and it happened before I got as far as Windows, so I rebooted and quickly went into the BIOS to see what the hardware monitor was telling me. It showed the CPU temperature as 100°C, which was rather alarming! I presume it would've gone higher but was limited to 100 degrees (?)

I removed the sides of the case and had a look inside. I'm not sure what I was hoping to find really, but it seemed the sensible thing to do. Anyway, after having a look around and giving a general prod of this and that (tell me if I'm getting too technical for you) I decided to boot it up again to see what would happen this time. It then started up fine with the CPU temperature in the high 20s/low 30s. Very odd. It was fine for most of the day and then it did it again about 8 hours later. That was nearly 2 days ago and it hasn't recurred since, but I'm still pretty worried about it.

I've invested in various temperature monitoring tools, namely Everest Ultimate, SensorsView and CoreTemp, as I'm paranoid it'll fry. I've run a couple of stress tests and the maximum CPU temperature was a little under 65°C. I've set the alarm threshold at 65 within SensorsView as I want to shut things down before they get really hot.

4 or 5 days ago I changed over the supplied PSU to the one referred to above, as the original one was making an annoying high-pitched buzzing sound which was going through my head and driving me nuts! I'd never changed a PSU before (the limit of my experience is upgrading/installing RAM, graphics cards, sound cards etc.), so my first concern was that I'd made a mistake in connecting things up. It certainly doesn't look half as tidy inside now as I don't have any of those cable-tidy things, but I'm reasonably confident I've done it OK.

So, after all this waffle, here are my questions:

Having read stuff on the internet I gather CPUs can get very hot very quickly but that seemed super-fast and if it was really that hot, would it have cooled down so quickly?

Have you heard of temperature sensors having intermittent faults?

Assuming it is getting very hot, could it be a dodgy water pump perhaps?  Would the lack of pumping cause such rapid overheating?  The case felt pretty cool inside when I opened it up.

Could a faulty PSU cause this?

Any other intermittent faults you can think of?

Attached are a couple of pics of temperatures. The first was taken after about 90 minutes of gaming and the second soon after I booted this morning.

Any other ideas or suggestions?

Everest-090208.jpg
Everest-100208.jpg
BabyFaceLima Asked:
 
debuggerau Commented:
Things are looking good. If you've played 90 minutes of games in Vista, chances are things can't be too bad.

I suspect it was your power supply that had a low 12V rail, which probably stopped the fans and/or pump for the power supply or processor or both.

If you are as paranoid as I am, set the motherboard alarm at about 50 degrees C; that should be hit on a hot day with some heavy use. If it doesn't get hit when you expect it to, check the temperature and drop the alarm point a degree or so, so you are alerted when things are getting hotter.

They all get hotter over time, with dust and heat. Things start to go out of spec, and you may want to know the 'rate of decay' for your setup.

I have adjustable fans, so for me it's the RPMs that vary to maintain the coolest CPU temperatures.
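
The alarm-tuning idea above can be sketched in a few lines of Python. This is only an illustration: the function names, the 1 degree margin and the sample temperatures are all made up for the example, not taken from SensorsView or any other tool mentioned in this thread.

```python
def suggest_alarm_threshold(load_temps_c, margin_c=1.0):
    """Alarm point just above the hottest temperature seen under normal load."""
    if not load_temps_c:
        raise ValueError("need at least one temperature sample")
    return max(load_temps_c) + margin_c


def readings_over(temps_c, threshold_c):
    """Return the samples that would have tripped the alarm."""
    return [t for t in temps_c if t > threshold_c]


# Hypothetical CPU temperatures (degrees C) logged under heavy use.
baseline = [41.0, 44.5, 47.0, 46.0]
threshold = suggest_alarm_threshold(baseline)

# Hypothetical samples from some weeks later, after dust has built up.
later = [45.0, 47.5, 49.0]
tripped = readings_over(later, threshold)
print(threshold, tripped)
```

Keeping the alarm only just above the normal peak means a slow upward drift shows up early, well before anything reaches shutdown temperatures.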

 
BabyFaceLima (Author) Commented:
Many thanks for the prompt reply.

Is it 'normal' for PSUs to suddenly drop power?  I certainly didn't have this issue before I had this newer PSU.  

I've attached the log file which shows some voltage fluctuations.  Are they to be expected?


AlarmLog.txt
 
pheidius Commented:
Well, you are right that 100 °C (the boiling point of water) is the auto-shutdown threshold for Intel. I looked at your .txt file and the 12 volt rail was overvolting, not undervolting. So was that log file from the old PSU? If so, then I think you have solved your problem. Overvolting a CPU by 4.10 volts could certainly spike the CPU temp despite your water cooling and force a shutdown. And yes, a CPU heats up quickly and cools down quickly. You might be lucky you were water cooling, as a fan-cooled overvolted CPU might not have lived despite an auto-shutdown sequence.

On the other hand, another reading in that same .txt file can't have been right. The motherboard temp could not have been 187. That temp would have slagged everything and started a fire to boot. So either the board sensor is bad or it was a dodgy report. Was this Everest? Everest has been known to be off on board temps.

In any case, I think you have fixed your own problem. Just buy a few little tie-downs from RadioShack and wrap the whole affair up.
 
jamietoner Commented:
Looks like a bad power supply to me. The overvolting could well be causing the overheating, as pheidius posted, and the +3.3V is dropping to 0. I would replace it ASAP, as a bad PSU can quickly fry other components.
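
For anyone wanting to sanity-check logged rail voltages, here is a minimal Python sketch. It assumes the usual ATX tolerances (plus or minus 5% on +3.3V, +5V and +12V, plus or minus 10% on -12V). The sample readings are hypothetical values chosen to resemble the symptoms described in this thread; they are not copied from AlarmLog.txt.

```python
# Nominal ATX rail voltages and their allowed tolerance (fraction of nominal).
ATX_TOLERANCE = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}


def check_rails(readings):
    """Return {rail: (reading, low, high)} for every rail outside tolerance."""
    out_of_spec = {}
    for rail, value in readings.items():
        nominal, tol = ATX_TOLERANCE[rail]
        # sorted() keeps low <= high for the negative rail as well.
        low, high = sorted((nominal * (1 - tol), nominal * (1 + tol)))
        if not low <= value <= high:
            out_of_spec[rail] = (value, round(low, 3), round(high, 3))
    return out_of_spec


# Hypothetical readings: +3.3V dropped to 0 and +12V reading about 4 V high.
sample = {"+3.3V": 0.0, "+5V": 5.02, "+12V": 16.10, "-12V": -11.8}
bad = check_rails(sample)
for rail, (value, low, high) in bad.items():
    print(f"{rail}: {value} V is outside {low} to {high} V")
```

Bear in mind that software readings come from motherboard sensors, so a wildly implausible value may point at the sensor or the monitoring utility rather than the PSU. A multimeter on a spare Molex connector is the more trustworthy check.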
 
nobus Commented:
Did you have those readings with both power supplies?
If so, the sensors on the mobo are bad. In that case you need a new mobo; ask for a warranty replacement.
If not, then yes, your PS can cause it, unless you have 2 different problems:
1- a PS dropping voltage
2- a bad temp sensor
 
BabyFaceLima (Author) Commented:
Thanks folks, your input is much appreciated.

The log file relates to the new PSU as I wasn't using any temperature monitoring software until this happened.  The log file is from SensorsView by the way.

Before returning the PC, I think I'll temporarily put the old PSU back in the computer and see what transpires. It'll be interesting to see what, if any, issues are logged. As I use my PC for work, it's a bit of a pain to have it sent away for repair.

I'll post back here in a day or so.
 
pheidius Commented:
Dang, the new PSU. Well, as I said, one of these readings was impossible, so now I am wondering about that utility's accuracy. I would still RMA the new one right away while your old one is in, though, and retest.
 
BabyFaceLima (Author) Commented:
I've split the 500 points amongst each of you as the issue wasn't really possible to 'discover' remotely.  Anyway, I hope that's OK with you. ;-)
 
BabyFaceLima (Author) Commented:
Well guys, I think I've discovered the root of the CPU overheating problem.    
 
The reservoir pump has a cable coming out the back that terminates in the usual white Molex connector.
It's one with just two out of the 4 pins in it.

The pin that's connected to the brown cable is moving backwards and forwards within the Molex connector. When the pin is pushed backwards, I can see the silver collar that the cable is encased in near the end.

My PC overheated a few times again this afternoon, but it was after I'd installed a further power supply (!!!). I've swapped power supplies with a friend who's just bought a Corsair TX750 and happened to mention that they found the cables too long for their mid-tower case, so it ended up in a bit of a jumble. As I think the cables on the Hiper PSU are a little on the short side, we decided we'd swap. A happy coincidence, I reckon. Anyway, when it started to overheat again I thought I'd not connected it up correctly, but on closer inspection I found the pump cable was a bit dodgy. I've now pushed the cable into its connector, connected it up and really squeezed the two together, but I'm concerned it'll work its way loose after a while, so I've asked my PC supplier if they can provide a new pump cable or some other solution.

Thanks very much for your input.
 
 
nobus Commented:
That's a good reason for the overheating, AND good troubleshooting.
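
As a rough plausibility check on how fast a CPU could hit 100 °C with the pump stopped, here is a back-of-the-envelope Python sketch. It treats the water block as thermally isolated (no flow, no heat lost to the air), which is pessimistic. The 130 W figure is the QX9650's rated TDP, and an overclocked chip under load will draw more. The block masses (20 g of water, 200 g of copper) and the function name are pure guesses for illustration, not measurements from this system.

```python
def seconds_to_reach(start_c, limit_c, power_w, water_kg, copper_kg):
    """Time for a thermally isolated water block to heat from start_c to limit_c."""
    c_water = 4186.0   # J/(kg*K), specific heat of water
    c_copper = 385.0   # J/(kg*K), specific heat of copper
    heat_capacity = water_kg * c_water + copper_kg * c_copper  # J/K
    return (limit_c - start_c) * heat_capacity / power_w


# Assumed values: 30 C start, 100 C shutdown, 130 W, 20 g water, 200 g copper.
t = seconds_to_reach(30, 100, 130, 0.02, 0.2)
print(f"about {t:.0f} seconds")
```

With these assumptions the answer comes out at roughly a minute and a half, which fits a machine that shuts off shortly after boot and then reads normal temperatures again once coolant starts moving.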