Solved

computer randomly shuts off

Posted on 2006-11-28
Last Modified: 2010-04-20
I've got a new custom-built machine that I installed CentOS on. It runs OK for an hour or two, then randomly shuts off. The only thing I can think of is that it's overheating, but I've got pretty sufficient cooling on it. Is there any way to check temps in software? And if it is overheating, is CentOS shutting it down, or would it be the hardware? I'm new to Linux: are there any logs or anything written to when a machine shuts down? Any ideas on other things to check? I'm pretty stumped.
Question by:ICPooreman
9 Comments
 
LVL 2

Accepted Solution

by:
ctwaley earned 500 total points
ID: 18033253
You can only check temperatures if the motherboard supports it. Generally, you can check the running temperature through the BIOS right after it shuts down. There may be a page or section in the BIOS that shows the CPU and system temps, fan speeds, voltage values, etc.

If there is such a section in the BIOS, then most likely the shutdown happens when the system reaches a critical temperature. That's only if the feature is enabled, and it depends on what the cutoff temps are set to.

For Linux, the software solution would be to install 'lm_sensors', which is a command-line tool to check temps (there should be an RPM package you can install). But this will only work if the motherboard has the sensors built in to begin with. There are GUI apps that use lm_sensors, the most widely known being GKrellM (which also monitors a lot of other system stats).
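
As a rough sketch of how lm_sensors readings could be checked from a script, the snippet below parses text in the style printed by the `sensors` command and pulls out the temperature values. The labels and line layout vary by chip and driver, so the sample text, the regular expression, and the `parse_sensor_temps` helper are all assumptions for illustration, not real output from any particular board.

```python
import re

# Matches lines such as "CPU Temp:    +45.0°C  (high = +60.0°C)".
# The label text before the colon is chip-specific; this pattern is an assumption.
TEMP_LINE = re.compile(r"^\s*([^:]+):\s*\+?(-?\d+(?:\.\d+)?)\s*°?C\b")


def parse_sensor_temps(text):
    """Return {label: degrees_celsius} for each temperature line in `text`."""
    temps = {}
    for line in text.splitlines():
        match = TEMP_LINE.match(line)
        if match:
            temps[match.group(1).strip()] = float(match.group(2))
    return temps


# Hypothetical sample text, not captured from a real machine.
SAMPLE = """\
examplechip-isa-0290
Adapter: ISA adapter
fan1:       2500 RPM  (min = 0 RPM)
CPU Temp:    +58.0°C  (high = +60.0°C)
Sys Temp:    +36.0°C  (high = +60.0°C)
"""

if __name__ == "__main__":
    for label, value in parse_sensor_temps(SAMPLE).items():
        print(f"{label}: {value:.1f} C")
```

On a machine where lm_sensors is installed and configured, the same function could be fed the text that the `sensors` command prints instead of the sample string.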

Otherwise, if there is no sensor support and overheating is the cause, then the system is shutting down on its own, which means there will be hardware damage with continued use, and you most likely have a faulty board.

If CPU/system temperature is not the problem, then it could be a RAM problem. This is easiest to check by replacing the RAM stick with a known good one and seeing if the shutdowns continue. Or, if there is more than one stick installed, take them all out except one, and try them one at a time to see if the problem persists.

Also, physically check that all the fans are working properly and that there is good airflow inside the case (no IDE ribbons blocking airflow, etc.).

Anyway, could you post your hardware setup, including whether you're using rounded cables or ribbons for the hard drives and the number of fans being used? This will make things a tiny bit easier to diagnose. ;-)
 

Author Comment

by:ICPooreman
ID: 18033933
Bought it off eBay. Here's what it's got:

CPU: INTEL CORE 2 DUO E6600 2.4GHz DUAL CORE 1066MHZ FSB
Motherboard: GIGABYTE GA-945P-S3 MAINBOARD 1066MHz FSB SUPPORT
Memory: 2GB DDR-2 533FSB (PC4200) Memory
Video Card: 256 MB nVIDIA Ge-Force 7300GS DVI/TV-OUT PCI EXPRESS VIDEO CARD
Hard Drive: 250GB 7200RPM SATA II Hard Drive
Optical Drive: 16X DUAL LAYER DVD-RW DRIVE / ADD A 1.44MB FLOPPY DRIVE FOR $11
Network Card: 10/100 Fast Ethernet Network Controller
Sound Card: CMI 9739A 6 CHANNEL CODEC
Case: ATX Case w/ Power Supply and Front USB Port

The case has only one fan installed, and the hard drive has a fan. I'm a real dummy with hardware: there's another fan running inside the case, and I believe it's on a heatsink, but maybe I'm wrong. Almost everything is connected with rounded cables, and there's actually a good amount of free space in the box.
 
LVL 12

Expert Comment

by:ibu1
ID: 18034608
I think there is a problem with the power supply. Try replacing it.
 
LVL 10

Expert Comment

by:ssvl
ID: 18035817
Check the memory.

First clean the memory modules, then swap them around.
 
LVL 2

Expert Comment

by:ctwaley
ID: 18050848
Okay, your board supports temp sensors, and the BIOS has a health status page showing the CPU and board stats. For that particular CPU, the running temp should be in the mid 40s deg C (not F) or less for an air-cooled system. If it's above 60 deg C, then you have an overheating problem.
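
The rule of thumb above can be written down as a small check. This is only a sketch: the 60 deg C limit comes from this thread, while the 50 deg C boundary between "normal" and "warm" is an assumption added to cover the gap between the mid 40s and 60, and `classify_cpu_temp` is a hypothetical helper name.

```python
def classify_cpu_temp(celsius, warm_above=50.0, overheating_above=60.0):
    """Classify a CPU temperature reading in degrees Celsius.

    overheating_above=60.0 follows the rule of thumb in this thread;
    warm_above=50.0 is an assumed boundary, not a vendor specification.
    """
    if celsius > overheating_above:
        return "overheating"
    if celsius > warm_above:
        return "warm"
    return "normal"


if __name__ == "__main__":
    for reading in (35.0, 45.0, 60.0, 61.0):
        print(reading, classify_cpu_temp(reading))
```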

What you do is reboot the machine after it's been running for a while, say an hour, since that's when it has problems. When it reboots, press the <Delete> key (you can tap it repeatedly until the BIOS screen comes up). Then highlight the "PC Health Status" entry using the down arrow and hit ENTER. On the next screen, you will see "Current CPU Temperature" about halfway down. To quit the BIOS, press the <Esc> key twice and don't save on exit, or just hit the reset button on the case if it has one. This will tell you what's going on without opening the case.

If the temp's okay, then the next thing to check is the RAM (memory), which means opening up the case. There are four slots for the RAM sticks, and if it has 2 gigs of RAM, then there will most likely be at least two sticks in there. If you feel confident enough for that, I'll walk you through the process; otherwise take it in to a computer repair shop (one that you trust) and have it looked at.

---ctwaley
 

Author Comment

by:ICPooreman
ID: 18059779
Well, when I first got into the BIOS it said the temp was around 60C, but within a minute or two it was down to 35C and stayed there. Looks like it's overheating, but why the drastic temp change within a minute or two?
 
LVL 2

Expert Comment

by:ctwaley
ID: 18061625
When the OS is running and there are a lot of daemons running in the background (plus any apps being used), the CPU works harder to handle the load, so it runs hotter. When you're in the BIOS, the CPU has nothing to do and is just idling, which lets it run much cooler.

If CentOS is anything like the old Red Hat, there are probably a number of daemons running unnecessarily in the background, so you should stop anything you really don't need and disable it at startup.

There should be an app to help you manage the startup scripts, but I'm not sure what it's called for your distro. I normally run Slackware, which uses a BSD-style startup routine, not SysV, but I have run Debian (a while back), which uses the SysV startup routine and had a helper app to manage the startup scripts (although I usually managed them by hand ;-) ).

Anyhow, reducing the number of background processes will help lighten the load on the CPU, and I suggest installing lm_sensors, along with GKrellM, to monitor the stats in real time while running the OS. Besides these two, you will also need to make sure you have the correct "I2C" modules loaded for your particular chipset, which lm_sensors relies on to retrieve the needed info.
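
Because the machine powers off without warning, it can also help to keep a running log, so the last readings before a shutdown survive on disk. A minimal sketch follows, assuming a Linux system; the log path, the 10-second interval, and the helper names are placeholders, and the temperature would come from whatever lm_sensors reports on your board (the demo at the bottom reads no real sensor).

```python
import os
import time


def format_sample(timestamp, load1, temp_c):
    """Build one CSV line: local time, 1-minute load average, temperature."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
    temp = "" if temp_c is None else f"{temp_c:.1f}"
    return f"{stamp},{load1:.2f},{temp}\n"


def append_sample(path, line):
    """Append a line and force it to disk so it survives a sudden power-off."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(line)
        log.flush()
        os.fsync(log.fileno())


def monitor(path, read_temp, interval=10.0, samples=None):
    """Log load and temperature every `interval` seconds.

    `read_temp` is a caller-supplied function returning degrees C or None.
    `samples=None` loops until interrupted.
    """
    count = 0
    while samples is None or count < samples:
        load1 = os.getloadavg()[0]  # 1-minute load average (Unix only)
        append_sample(path, format_sample(time.time(), load1, read_temp()))
        count += 1
        if samples is None or count < samples:
            time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical usage: the lambda stands in for a real sensor reader.
    monitor("/tmp/temp_load.csv", read_temp=lambda: None, samples=1)
```

After an unexpected shutdown, the last few lines of the log show whether the load and temperature were climbing just before the machine went down.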

One of the most common causes of overheating, and an easily fixable one, is the CPU fan not working properly (running too slowly). Even a slight drop in fan speed can be critical, especially if the heatsink was improperly installed (for example, with no thermal compound between the heatsink and CPU surfaces). Since you got this PC from eBay, you have no assurance it was assembled properly, unless it came with a guarantee or some sort of warranty. If so, I suggest returning it and having it replaced, if possible.

But all is not lost if no permanent damage has happened so far. It will require physically inspecting the machine and closely monitoring the stats. If the inside of the case was pretty dusty when you first opened it up, that would be a good indication you've been taken in. But even if it was clean, that wouldn't necessarily tell you much, either. What you need to do is check up on the seller you bought it from through eBay, to see what kind of feedback there is for that seller.

Bottom line: if you can't get it replaced, then it's just a process of elimination to narrow down the cause of the problem.

Another stupid question: the PC isn't located near any source of heat, such as a heater vent, is it? And is there plenty of air circulation outside the case?
 

Author Comment

by:ICPooreman
ID: 18069031
<<The PC isn't located near any source of heat, such as an heater vent, is it
No, it's actually in a fairly cool room, not next to any source of heat.

<<If the inside of the case was pretty dusty when you first opened it up,
No, it's pretty clean inside the case.

I'll check up with the seller to see if there is anything they'll do.  They actually have a really good rating so I'm a little surprised.  
 
LVL 2

Expert Comment

by:ctwaley
ID: 18071043
>  They actually have a really good rating so I'm a little surprised.

With a good rating, you hopefully should not have many problems. Even if they did some testing before shipping, not every problem will show up, so it's not really surprising to find something amiss every once in a while (even with the big vendors). Which is where RMAs come into the picture. ;-)