Abraham Deutsch


Verify reason of server overheating

I have a server that overheats and shuts down. I opened the box and the four hard drives are burning hot.

The room is about 70°F (there is another server in the same room which has no issue).

The box has two small fans in the back of the box and the hard drives are in the front of the box with no space for ventilation.

Before I change the box, is there a way to verify that the box is the issue, or whether something else is causing it to overheat? (I don't want to change the box at the client's expense and then have the problem continue.)
Member_2_231077

Presumably you've already blown the dust out and checked that cables aren't restricting airflow.

One of the disks could be faulty, with the heat generated by that one being transferred to the others via the metalwork. The box was presumably designed to have the hard disks at the front.
rindi
Servers normally have remote management hardware and monitoring tools installed which allow you to monitor things like over-temperature events, RAID failures, etc. Make sure those things are configured and monitored. Also make sure the fans really are working. Besides that, servers usually have a complete set of fans, not just two.
Abraham Deutsch

ASKER

In normal circumstances I would boot the server with one of the programs that check the hard drives, but in this case I'm afraid the server will overheat and shut down in the middle of the test. Is there something you recommend for checking a drive by attaching it to another PC externally?
Normally you can't test disks on servers via boot media anyway, as the disks are behind a RAID controller and the diagnostic utility won't even see them. Your RAID controller should show you the SMART status of the connected disks, and if a disk's SMART status has been tripped, it should also tell you to replace that disk. On the other hand, fast disks built for servers will also get hotter, and that is normal, particularly with old models.
So how are server hard drives tested?

I searched online for information about the manufacturer of the server, "American Megatrends", and came across the following post: http://www.techsupportforum.com/forums/f15/solved-cpu-over-temperature-error-but-is-it-really-619511.html. They recommend using Arctic Silver, and the person who had the issue claims it solved it for him.
Normally the disk itself does the basic testing, and the RAID controller shows you the results. For temperature that is also quite good enough.

Arctic Silver is a thermal transfer paste. All it does is reduce the thermal resistance to the chassis. That alone won't get rid of the heat. The disks might not really be the problem; if they are enterprise-class disks they should be built to work at high temperatures. But those temperatures will of course also heat up the rest of the server, and if there aren't enough fans or they rotate too slowly, that can still cause heat problems.
All fans are working, and once the temperature starts to rise they go up to full speed.
I've looked around in the BIOS for information on the health of the hard drives but did not find anything about it. See the attached snapshots.
IMG_1568.JPG
IMG_1569.JPG
IMG_1570.JPG
IMG_1571.JPG
IMG_1572.JPG
IMG_1573.JPG
IMG_1574.JPG
ASKER CERTIFIED SOLUTION
rindi

This solution is only available to members.
That is an ASUS Z97-WS motherboard. It is a workstation motherboard, not a server motherboard.

Probably a standard case as well and not a server type case.

If this is a standard type case then open it up and get a big desktop fan blowing into its guts while it's running. Monitor the temperatures. If they come down then the case is the problem.
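To make that comparison a bit more concrete, here is a rough sketch that takes two sets of temperature readings (case closed, and case open with the desk fan blowing in) and reports how much the peak and average dropped. The function name, the 5 °C margin, and the sample readings are all assumptions for illustration, not values from this machine.

```python
from statistics import mean


def fan_test_summary(closed_case, open_with_fan, margin_c=5.0):
    """Compare two lists of temperature readings (degrees Celsius).

    margin_c is a hypothetical threshold: a peak drop of at least this
    much is treated as a hint that case airflow is the problem.
    """
    if not closed_case or not open_with_fan:
        raise ValueError("both runs need at least one reading")
    peak_drop = max(closed_case) - max(open_with_fan)
    mean_drop = mean(closed_case) - mean(open_with_fan)
    return {
        "peak_drop_c": peak_drop,
        "mean_drop_c": mean_drop,
        "case_suspected": peak_drop >= margin_c,
    }


# Illustrative readings only (made-up numbers, not from this server).
closed = [48, 55, 61, 66]
fanned = [40, 42, 43, 43]
print(fan_test_summary(closed, fanned))
```

Readings should be taken under the same load and over a similar length of time in both runs, otherwise the comparison says little about the case.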
This was my initial thought, given that 30 or more cameras run on this machine. I hesitated to change it since the security company disagreed with me, and if it did not solve the issue they would blame me for a bad diagnosis. Now that I have your opinion, I'll go ahead and change the case. I will post the results.
Try the big fan blowing into it first for confirmation.
I am not sure what you are referring to when you say a "big fan".
He is talking about one of those propeller fans which you normally set up on your desk and which can be set to swing back and forth. Personally I don't think that is necessary. If you aren't using a real server built for the task, then you are using the wrong hardware and should not just change the case, but rather the server.
That's the fan.  That will blast enough air into the box to keep stuff cool.  If it doesn't overheat doing that then the box or how it is configured is poor.
Since it's after the fact, I don't believe they will take my recommendation for such an expense. If I can solve the heat issue with a new box, I can close the ticket. (I'll open a new one when the next issue comes up. Sometimes you just need to know your client; some of them are very knowledgeable...)
I ran a performance data collector. If it shows bad performance, I will use it to show the client that he needs a server. See the link:

https://www.dropbox.com/sh/jyi4ukcytlmdu5g/AAD9KJNTGNLOimF4k3VQaQ4Ja?dl=0
WD40PURX-64GVNY0 are the right disks for security cameras, but make sure there isn't anything else on them apart from the camera streams, as they're not designed for random access; they're meant as a replacement for tape drives.

A photo of the case wouldn't go amiss.
See the attached images of the box.
Note that the front is solid, with no place for air.
In the back there are two fans.

The hard drives and the heat sink are extremely hot (maybe the one online suggesting changing the heat sink fan does have a point).

I also attached a report of the temperatures taken when the server had no load.
IMG_1576.JPG
IMG_1577.JPG
IMG_1578.JPG
IMG_1579.JPG
IMG_1580.JPG
Sensors.txt
Please verify that both of the fans at the back are blowing outwards (the air does go in the front, that's meant to be a grill in front of the disks, it isn't meant to be solid).

Please inform us which way fan 1 (added on top) is blowing and whether there are air holes in the case there or if it's just trying to suck or blow the metalwork off.
In the front there is no grill; it is solid (sealed, no air can go through).

Fan 1, added on top, is behind the cover, which has a grill at that spot.
That's an exhaust fan.

If it was a reasonable case that front cover should come off and there should be provision for mounting input fans there.
The cover comes off, but there are no slots in the box to add a fan. The one I added is attached with cable zip ties inside the box at the top, right behind the cover, in the area where the cover has a grill.
Will the server run with only one of the disks powered up, i.e. does it have a boot volume? Can you temporarily stop the camera recording to reduce the processing load too?

That should mean you have a cooler, more stable system that you can start to test.  
Tools like Speccy - https://www.piriform.com/speccy
or SpeedFan - http://www.almico.com/sfscreenshots.php
can then be run to give you some temperature values on both the disks and cpu, and also tell you fan spin speeds.  You can then see whether these are acceptable.  Some hard disks do run hot, and seem to be fine like that.  

You should also check the SMART diagnostics on those drives as well.  Once those baseline tests are done, try adding more drives/processor load, and see how the stats go.
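One way to log such a baseline, assuming smartmontools is installed and the drives are visible to the operating system (not hidden behind a RAID controller), is to pull the temperature out of the attribute table that `smartctl -A` prints. The sketch below only does the parsing. The sample text is illustrative and not taken from this server, and reading attribute 194 (Temperature_Celsius) is an assumption, since some drives report temperature under a different attribute.

```python
def drive_temperature(smartctl_output):
    """Return the raw value of SMART attribute 194 in Celsius, or None.

    Expects the text printed by `smartctl -A`; each attribute row has the
    columns ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED,
    WHEN_FAILED and RAW_VALUE.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "194":
            try:
                # The raw value may be followed by extra text such as
                # "(Min/Max 20/55)", so only the first token is used.
                return int(fields[9])
            except ValueError:
                return None
    return None


# Illustrative sample only (hypothetical values, not from this server).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1200
194 Temperature_Celsius     0x0022   108   095   000    Old_age   Always       -       42
"""
print(drive_temperature(SAMPLE))
```

On a live system the text would come from running `smartctl -A` against each drive with administrator rights; the device names depend on the operating system. Recording the value once with no load and again with the cameras recording gives the two data points to compare.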
See the attached text file of temperature readings without any load.
Temperature-reading.txt
IMG_1579.JPG shows the front grill, if it's really solid rather than vented then why design it to look like a vented grill?

I recommend you find out who designed that front "grill" and then go around to their house and break the "grill but it isn't really a grill" up into little pieces and shove up their bottom. At least then it won't be blocking the air flow into the front (although it may be blocking air flow out the back passage).

At least follow dbrunton's recommendation of removing that non-grill (throw it in the dustbin), I doubt you'll need to add front fans after that air-blocker is removed.
[removed by dbrunton]

Still need to know which way the fan you added is blowing.
I moved it to a case with good fans and grills, and it is working perfectly. Thank you all. PS: Thank you for making me aware that this is a mix of server and workstation.