Solved

IBM 3650 Component temperature thresholds

Posted on 2014-04-14
Last Modified: 2014-04-28
What are the recommended temperature thresholds for IBM x series servers, specifically for memory and CPU, but also for all other components?
Question by:gurutc
6 Comments
 
LVL 62

Accepted Solution

by:
☠ MASQ ☠ earned 250 total points
IBM publish operating thresholds for all their servers in their online documentation.

(Search for server model number and "operating threshold")

These are generalisations for all components. If you want specific limits for CPUs, refer to the manufacturer's data sheets (Intel has a comprehensive index of these).

You also asked about thresholds for RAM, but you would be better off using the IBM general figures for this, as RAM doesn't use a thermal cut-out.

For the 3650 in your title, for example:

Operating environment

    Temperature — Server on
        10.0° to 35.0°C (50° to 95°F) at 0 to 914.4 m (0 to 3,000 ft)

        (Decrease system temperature by 0.75°C for every 1000 feet increase in altitude.)

    Temperature — Server off
        10° to 43°C (50.0° to 109.4°F)
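
As a rough illustration of the altitude note above, here is a minimal Python sketch of the derating arithmetic. It assumes the 0.75°C reduction applies to the 35°C upper limit and only to altitude above 3,000 ft; the quoted text does not state either point explicitly. The function name and constants are hypothetical, so check the figures for your exact model in IBM's documentation.

```python
# Sketch only: derate the "server on" upper ambient limit for altitude.
# Assumptions (not stated explicitly in the quoted spec): derating starts
# above the 3,000 ft base altitude and is applied linearly to the 35 C limit.

BASE_MAX_C = 35.0            # upper "server on" limit from the quoted spec
BASE_ALTITUDE_FT = 3000.0    # altitude up to which the full range applies
DERATE_C_PER_1000_FT = 0.75  # reduction quoted per 1000 ft of altitude


def max_ambient_c(altitude_ft: float) -> float:
    """Return the assumed maximum ambient temperature (C) at an altitude in feet."""
    excess_ft = max(0.0, altitude_ft - BASE_ALTITUDE_FT)
    return BASE_MAX_C - DERATE_C_PER_1000_FT * (excess_ft / 1000.0)


if __name__ == "__main__":
    for alt in (0, 3000, 5000, 7000):
        print(f"{alt:>5} ft: {max_ambient_c(alt):.2f} C")
```

Under these assumptions a room at 5,000 ft would have an upper ambient limit 1.5°C lower than one at sea level.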
 
LVL 16

Author Comment

by:gurutc
I found the numbers you quote; however, these are 'ambient' system values.  DIMM temps as well as CPU temps on servers that are running fine fluctuate up to 42 C.

My bad server has some DIMM temps ranging up to 60C and 46C CPU temps.  As you would guess, it's having issues.  The Best-Practices values for these components would have to be known by IBM for the Intel CPUs 'as installed'.  My bad server is out of range I'm sure.  But the 'operating environment' temp range is certainly not valid for the CPUs and memory.  Even with these ridiculously high  CPU and memory temps on my bad server it is not throwing a temp warning.  Sooo, somewhere IBM must have the 'known-good' upper temp range values.

- gurutc
 
LVL 62

Expert Comment

by:☠ MASQ ☠
So is what you are asking actually - why does my server run at this temperature?

For the M4 version both the RAM (standard DDR3 or Max5 v2) and CPU (E5-2600 v2) are rated at over 80°C, but the server threshold is meant to be set at 50°C.  Of course, once the CPU gets hotter you're going to see a drop in performance, but you're a long way from the tolerance limits of the Xeon.

There are some old firmware versions that had thermal control problems but if you're running IMM2 v.3.35 or better you shouldn't see that.

I'm assuming the system fans have already kicked in at the temps you're describing.
 
LVL 16

Author Comment

by:gurutc
Thanks for your response.  I'm observing that CPU and memory temps on properly functioning servers are in the IBM operational range.  But actual observation reveals that IBM sets some threshold values for CPU and DIMM temps that cause a shutdown.  This is shown by the fact that there's a firmware update to fix shutdowns caused by a DIMM temperature threshold that was set too low.  This value is below the CPUs' and DIMMs' max temp limit.  Someone at IBM must know what these values are.  On my problem system the 'system' temp is below threshold and the TEMP LED is not currently lit, even though my CPUs and DIMMs are hot enough to cause problems and hurt performance.  My guess is 50C for the CPUs, but that's a guess, and I have no idea for the DIMMs.  What are the numbers?

And another issue:  the Broadcom network chipsets, as well as the chipsets on the installed QLogic HBAs, have a maximum operating temperature of about 35C.  It's all fine and good for CPUs and DIMMs to take a sauna bath, but the other components in the server located right next to them can't handle the heat coming from the CPUs and DIMMs.

I know, just open up the box and fix the airflow baffles.  That's coming during our Sunday maintenance window.  But I plan to check all of our thousands of IBM servers for CPU and DIMM temps even though none show a TEMP indicator issue.  IBM sets the operational temp values to ensure performance and reliability.  I suspect that undetected CPU and DIMM heat issues are causing a huge performance hit in my environment.
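
For a fleet-wide check like this, something along the following lines might be a starting point. It's a minimal Python sketch that parses pipe-delimited temperature sensor lines and flags readings above a limit. The layout is the one `ipmitool sdr type Temperature` typically prints, though sensor names and columns vary by platform and firmware. The sample text, the sensor names and the 50 C default limit are assumptions for illustration (the limit is only the figure guessed at in this thread, not a value from IBM documentation).

```python
import re

# Hypothetical sample in the pipe-delimited style of `ipmitool sdr type Temperature`.
# Sensor names and values are made up for illustration.
SAMPLE = """\
Ambient Temp     | 32h | ok  | 12.1 | 27 degrees C
CPU 1 Temp       | 33h | ok  |  3.1 | 46 degrees C
DIMM 4 Temp      | 41h | ok  | 32.4 | 60 degrees C
DIMM 5 Temp      | 42h | ns  | 32.5 | No Reading
"""

# Matches a numeric Celsius reading such as "46 degrees C".
READING_RE = re.compile(r"(-?\d+(?:\.\d+)?)\s*degrees C")


def parse_temps(text):
    """Return {sensor_name: temp_c} for lines that carry a Celsius reading."""
    temps = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2:
            continue  # not a sensor line
        match = READING_RE.search(fields[-1])
        if match:  # skips sensors reporting "No Reading"
            temps[fields[0]] = float(match.group(1))
    return temps


def over_limit(temps, limit_c=50.0):
    """Return sensors reading above limit_c (an assumed limit, not IBM's)."""
    return {name: t for name, t in temps.items() if t > limit_c}


if __name__ == "__main__":
    readings = parse_temps(SAMPLE)
    for name, temp in sorted(over_limit(readings).items()):
        print(f"{name}: {temp:.0f} C exceeds limit")
```

In practice the sample text would be replaced by output collected from each server, and the limit would be set per component type once the real threshold values are known.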

So simply, what are the Over-Temp threshold values that cause a shutdown in IBM X Series Servers dadgummit?

- gurutc
 
LVL 61

Assisted Solution

by:gheist
gheist earned 250 total points
The highest-end CPUs bring the ambient temperature threshold down to 25C.
 
LVL 61

Expert Comment

by:gheist
It is in the service guide...