bunkhill

Dell PowerEdge 1950 Fans Screaming

Good Morning,
I have a Dell PowerEdge 1950 server. After a power outage I brought the server back up, and the fans started screaming louder than ever and haven't slowed down since (it has been a couple of months). I have blown out the dust, cleaned each of the fans, and made sure the cover is securely on, but nothing has gotten them back to their pre-outage level. The fans are running at full speed. I have installed OpenManage Server Administrator, and there are no temperature settings that I could manipulate. I am running Windows Server 2003 R2 and have updated to what I believe is the newest BIOS (2.7.0). The only thing I haven't done is update drivers. There is nothing in the event logs that points to any thermal issues, and I am stumped. Does anyone have any thoughts or ideas? The sound is crazy. The room is cooled and, like I said, the server was not even half as loud before this outage.

Thank you

Eric
PowerEdgeTech

"I have installed OpenManage Server Administrator and there is no temperature settings that I could manipulate."

No, you can't set speeds or anything, but does OMSA indicate that a fan has failed?  Is the status LCD on your server amber or blue?  The fans will speed up if the cover is off (or not on completely) or if a fan has failed.

Also, the BIOS does not control the fans - the ESM does.  Whenever you update the BIOS, you need to make sure you update the ESM firmware as well.  It would also be wise to update the other system firmware - PERC, HDD, NICs, DRAC, etc.
bunkhill

ASKER

There is no indication of a fan failure in OMSA, although being new to OMSA I'm not really sure where to look.  I am not sure what color the status LCD is (I want to say amber, but I am not on-site so I can't confirm).  I will try updating the ESM and other system firmware as well, but it's weird that this only occurred after an outage, so I am not sure that the firmware is the problem.
In OMSA, you can check the hardware logs by clicking on System, then the Logs tab.
You can check the fans specifically by going to System, Main System Chassis, Fans.

"I am not sure that the firmware is the problem"

Probably not, but if none of the fans have failed, then there is a remote possibility that the ESM became corrupted/damaged from the power outage, and an update to correct the code may be all it needs.  However, a failed fan is much more likely.  Amber for the LCD indicates a hardware error - it would be scrolling a related message.
Are there any errors, such as lost redundancy?
Look at the fan speeds and see what OMSA reports.
I think that is the point of the suggestion to make sure no fan has failed.
what temperatures does omsa report?
No temp is reported by OMSA, or if there is a way for it to report it, I don't know how to find it.
When you are using the web interface, there are subcategories for temperature, fans, BIOS, etc.
Expand the options to see them.
Which version of OMSA are you using? 5,6,7?
In OMSA, you can check the hardware logs by clicking on System, then the Logs tab.
You can check the fans specifically by going to System, Main System Chassis, Fans.
What did you find?

Temperature should be found in OMSA under System, Main System Chassis, Temperature.
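
If the web interface is hard to navigate, the same three checks can probably be pulled from a command prompt on the server. This is only a rough Python sketch: it assumes the OMSA command-line tool `omreport` is installed and on the PATH, and the subcommands in the table (`system esmlog`, `chassis fans`, `chassis temps`) are my assumed equivalents of the GUI locations above, so check them against your OMSA version. The names `OMSA_CHECKS` and `run_checks` are made up for illustration.

```python
import shutil
import subprocess

# GUI location from the thread -> assumed omreport equivalent (verify on your OMSA version)
OMSA_CHECKS = {
    "System > Logs (hardware log)": ["omreport", "system", "esmlog"],
    "Main System Chassis > Fans": ["omreport", "chassis", "fans"],
    "Main System Chassis > Temperature": ["omreport", "chassis", "temps"],
}


def run_checks(checks=OMSA_CHECKS):
    """Run each check and return {label: command output or an explanatory message}."""
    if shutil.which("omreport") is None:
        # OMSA CLI is not installed (or not on PATH): say so instead of crashing
        return {label: "omreport not found on PATH" for label in checks}
    results = {}
    for label, cmd in checks.items():
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # Fall back to stderr so an error message is not silently lost
        results[label] = proc.stdout.strip() or proc.stderr.strip()
    return results


if __name__ == "__main__":
    for label, output in run_checks().items():
        print(f"== {label} ==\n{output}\n")
```

If a section comes back empty or errors out, that is worth reporting here too, since it may mean OMSA cannot read that data at all.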
Replacing the fans looks like a good solution. Look here to see how it's done:
http://www.brentozar.com/archive/2010/01/how-to-make-a-dell-poweredge-quieter/
The key here is to figure out why the fans are at full RPM.  The only way this can happen is either faulty temperature sensors or faulty fan speed controls.  This is why you need to know what the temperature sensors on the motherboard and CPU are reporting.  OMSA does report these (see attached).

you can also download a temp app of some sort (maybe core temp)

do you know if the cpu fans are at full rpm also or just case fans?

If the reported temps are normal but the fans are still at full speed, then the motherboard evidently does not see them as normal.  I don't think any of these sensors are wired in separately; they are incorporated into the CPU and motherboard.  Re-seat the CPUs.  Check all connectors and maybe re-seat or tighten them.  Check the BIOS settings and see what options you have there.  A lot of the time the BIOS will display fan RPMs and temp settings.  If you're in the BIOS and it shows fan RPM at 800 but the fans are at full blast, then the feedback from the fans is incorrect or not getting to the motherboard (usually the 3rd wire on the fan connector is the fan speed feedback).

post your findings please
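
To make the reasoning above concrete, here is a minimal Python sketch of the triage. It is only an illustration: the function name `triage_fans`, the warning temperature, the full-speed RPM figure and the "less than half of full speed" cutoff are all hypothetical values chosen for the example, not Dell specifications.

```python
def triage_fans(temp_c, reported_rpm, fans_sound_maxed, temp_warning_c, rpm_full_speed):
    """Return a rough hint about where to look next.

    temp_c / reported_rpm: what the BIOS or OMSA displays (None if nothing is shown).
    fans_sound_maxed: what a person standing at the server actually hears.
    temp_warning_c / rpm_full_speed: hypothetical limits supplied by the caller.
    """
    if temp_c is None:
        return "no temperature reading: check the sensors or the management controller"
    if temp_c >= temp_warning_c:
        return "temperature is high: full-speed fans are the expected response"
    if not fans_sound_maxed:
        return "temps normal and fans quiet: nothing to chase"
    # Arbitrary cutoff: a reading under half of full speed contradicts maxed-out fans
    if reported_rpm is not None and reported_rpm < 0.5 * rpm_full_speed:
        return "reported RPM is far below what you hear: suspect the fan speed feedback"
    return "temps normal but fans maxed: suspect the fan speed control"


# The 800 RPM example from the post, with made-up limits
print(triage_fans(30, 800, True, temp_warning_c=45, rpm_full_speed=12000))
```

With the 800 RPM example, the sketch points at the feedback wire, which matches the reasoning in the post.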
nobus ... I'm assuming he is not talking about the "normal" loud fan noise of a 1U server and/or the 1950 specifically.  I'm assuming since he indicated that they started after a power outage that he is talking about a new "loud".  Putting in quieter fans only ignores the real problem, unless of course we are talking about a fan failure (which would cause the other fans to spin faster to compensate), in which case replacing all of them would work but is overkill.  Also, that link is talking about the 1900 - a tower server, not the 1U 1950 rack server.

If a fan has failed, then it should show in OMSA in the Fans section (special software should not be needed above/beyond OMSA), and a note of the failure should be in the Hardware Logs, but the OP has yet to check this (or at least to report what he has (or has not) found).  Also, there has been no report on whether or not he has updated the ESM/BMC (which actually controls the fans) after updating the BIOS (which does not control the fans).
I am using OMSA 7.0.0.0
My apologies for being so slow with all of the information; I have not been able to check anything out until this morning.  When I go into OMSA, the logs are predicting a disk failure (which I was aware of, but I also didn't think it would cause the fans to go crazy), but that is all.  The LCD on the front is amber.  There is nowhere in OMSA for me to check on the fans (at least in the version I am using).  I am working on upgrading the ESM firmware now and will report back after I have done so.  And everything Nobus said is true: it is a new "loud" after the power outage, not a regular loud.
Every time I run the ESM/BMC firmware update I get an error stating "This update package is not compatible with your system configuration". Is there something that I am missing here?
i did not say that, sorry
so you're not looking for a fan replacement as a solution?
I am not sure what I am looking for.  I am trying to upgrade the ESM as recommended and it is not allowing me to (see above), and none of the logs are indicating a fan failure (at least that I can see; I do not see fans in OMSA).  I have reseated all of the fans previously and that did not help.
I tried installing BMC firmware version 2.50 A00 and it did not work; I am now trying to install 1.22 A04 and am getting the same error.
You should be running this file for the ESM update:
http://ftp.dell.com/FOLDER00928797M/1/1950_ESM_Firmware_XCVN0_WN32_2.50_A00.EXE

The update will take up to 10 minutes to run.

If OMSA is not showing you fans or temps, then it may be because the ESM is out of date compared to the version of OMSA you are using.  It being out of date could also be preventing the logs from showing correctly.

If the LCD is amber, it should be scrolling an error code and message for any errors it is logging.

What is your ESM firmware currently at?  Do you have SP2 installed?
That is the ESM update that I am trying to run and it keeps returning the error:  This update package is not compatible with your system configuration
I do have Service Pack 2 installed.  Excuse the ignorance, but how can I check the version of the ESM?
That's ok ... the version can be found in the same list as the Fans and Temps, under Firmware.

There is the possibility that the power outage permanently damaged your ESM/BMC, which is why the fans are now running rampant (the ESM controls them), OMSA is not showing detailed system information managed by the ESM, and the ESM update will not install (it cannot see, recognize, or speak to the ESM chip).
The only place that I have a "Firmware" option is under 'Storage'.  So if the ESM is in fact damaged what recourse/repair options are there?
ASKER CERTIFIED SOLUTION
PowerEdgeTech
I am remoting in so I am not sure what is being displayed.  I will ask onsite person to check and let you know
Are you remoting in as a system or domain admin?
System
Is there a surefire way for me to know if the motherboard is the problem?  I'd hate to spend the money and still be in the same spot
No, but it has been narrowed down pretty close.  The ESM is on the motherboard, and pretty much none of the ESM functions are working:  

logs (system has an amber light - should be reflected in the log)
fans (spinning at max with no indication of warning/error)
incomplete OMSA data (missing pretty much everything that comes from the ESM - fans, temps, firmware, etc.)
the failed update (can't find it, so says the update doesn't match configuration)

Did you ever get the LCD readout?
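
For what it's worth, the checklist above can be written down as a small Python sketch. It proves nothing on its own; it just counts how many of the listed ESM symptoms are present. The class and field names are hypothetical, made up for this example.

```python
from dataclasses import dataclass


@dataclass
class EsmSymptoms:
    # One flag per item in the list above (names are illustrative only)
    amber_light_not_in_log: bool      # amber LCD but nothing matching in the hardware log
    fans_max_without_alert: bool      # fans at full speed with no warning or error
    omsa_missing_esm_data: bool       # no fans, temps or firmware shown in OMSA
    esm_update_rejected: bool         # "not compatible with your system configuration"


def esm_suspicion(symptoms):
    """Return (count, names) of the symptoms that are present."""
    observed = [name for name, present in vars(symptoms).items() if present]
    return len(observed), observed


# This thread so far: all four are present
count, observed = esm_suspicion(EsmSymptoms(True, True, True, True))
print(f"{count} of 4 ESM symptoms present: {', '.join(observed)}")
```

The more of these line up, the harder it is to blame a single fan or sensor, which is why the motherboard (where the ESM lives) is the main suspect.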
LCD is displaying nothing
Just amber, no message?  The LCD's output is also controlled by the ESM.
So yet another sign that it's the motherboard. How can I tell if it's a Gen I, II, or III motherboard?
Look on the PowerEdge 1950 LCD enclosure or the lip of the ears in the rack.
You can also get it from OMSA, but that might not be reliable right now :)
I am changing the motherboard tomorrow night and will let you know how I make out.  Should I have any concerns with this swap (meaning, it's pretty standard, right)?

Thanks
Yeah, pretty standard - just make sure everything gets moved over (riser cards, etc.).
Changed the motherboard last night and everything is working great! Was actually a lot easier than I thought (one of the nice things about Dells).  Thank you again PowerEdgeTech for helping me out on this one.
No problem ... glad you're back up and running :)