
IBM 3650 Component temperature thresholds

What are the recommended temperature thresholds for IBM x series servers, specifically for memory and CPU, but also for all other components?
Asked by: gurutc
2 Solutions
 
☠ MASQ ☠ Commented:
IBM publish operating thresholds for all their servers in their online documentation.

(Search for server model number and "operating threshold")

These are generalisations for all components. If you want specific limits for CPUs, refer to the manufacturer's data sheets (Intel has a comprehensive index of these).

You also asked about thresholds for RAM, but you would be better off using the IBM general figures for this, as RAM doesn't use a thermal cut-out.

For the 3650 in your title, for example:

Operating environment

    Temperature — Server on
        10.0° to 35.0°C (50° to 95°F) at 0 to 914.4 m (0 to 3,000 ft)

        (Decrease system temperature by 0.75°C for every 1000 feet increase in altitude.)

    Temperature — Server off
        10° to 43°C (50.0° to 109.4°F)
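
The altitude rule quoted above can be turned into a quick calculation. The sketch below is a minimal Python illustration, not IBM's method: it assumes the 35°C limit stays flat up to 3,000 ft and that the 0.75°C per 1,000 ft reduction applies only to the height above that (the quoted text does not say where derating starts). The function name and parameters are made up for this example.

```python
def max_ambient_c(altitude_ft, base_limit_c=35.0, flat_up_to_ft=3000.0,
                  derate_c_per_1000ft=0.75):
    """Estimate the maximum 'server on' ambient temperature at an altitude.

    Assumption: the limit is flat up to flat_up_to_ft, then drops linearly
    by derate_c_per_1000ft for every further 1,000 ft.
    """
    if altitude_ft < 0:
        raise ValueError("altitude_ft must be non-negative")
    # Only the height above the flat range is derated.
    excess_ft = max(0.0, altitude_ft - flat_up_to_ft)
    return base_limit_c - derate_c_per_1000ft * (excess_ft / 1000.0)


if __name__ == "__main__":
    for alt in (0, 3000, 5000, 7000):
        print(f"{alt:>5} ft -> {max_ambient_c(alt):.2f} C")
```

If the derating is actually meant to start at sea level, passing `flat_up_to_ft=0` gives that reading instead.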
 
gurutc (Author) Commented:
I found the numbers you quote; however, these are 'ambient' system values. On servers that are running fine, DIMM and CPU temps fluctuate up to 42°C.

My bad server has some DIMM temps ranging up to 60°C and CPU temps of 46°C. As you would guess, it's having issues. IBM would have to know the best-practice values for these components for the Intel CPUs 'as installed'. My bad server is out of range, I'm sure. But the 'operating environment' temp range is certainly not valid for the CPUs and memory. Even with these ridiculously high CPU and memory temps, my bad server is not throwing a temp warning. So somewhere IBM must have the 'known-good' upper temp range values.

- gurutc
 
☠ MASQ ☠ Commented:
So is what you are actually asking: why does my server run at this temperature?

For the M4 version, both the RAM (standard DDR3 or Max5 v2) and CPU (E5-2600 v2) are rated at over 80°C, but the server threshold is meant to be set at 50°C. Of course, once the CPU gets hotter you're going to see a drop in performance, but you're a long way from the tolerance limits of the Xeon.

There are some old firmware versions that had thermal control problems, but if you're running IMM2 v.3.35 or better you shouldn't see that.

I'm assuming the system fans have already kicked in at the temps you're describing.
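
To put the figures above side by side, here is a rough Python sketch that sorts a reading into three bands: below the 50°C server threshold, between that threshold and the component rating, or at or above the rating. It treats 80°C as the rating even though the post only says "over 80°C", so that number is a conservative placeholder, and the function and constant names are invented for illustration.

```python
SERVER_THRESHOLD_C = 50.0   # server threshold quoted in the post above
COMPONENT_RATING_C = 80.0   # placeholder: the post says "over 80", exact value unknown


def classify_temp(temp_c):
    """Place a component temperature into one of three illustrative bands."""
    if temp_c < SERVER_THRESHOLD_C:
        return "below server threshold"
    if temp_c < COMPONENT_RATING_C:
        return "above server threshold, within component rating"
    return "at or above component rating"


if __name__ == "__main__":
    # Readings reported earlier in this thread for the problem server.
    for label, temp in (("DIMM", 60.0), ("CPU", 46.0)):
        print(f"{label} at {temp:.0f} C: {classify_temp(temp)}")
```

With the readings reported in this thread, the 60°C DIMM falls in the middle band and the 46°C CPU stays below the server threshold.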

 
gurutc (Author) Commented:
Thanks for your response. I'm observing that CPU and memory temps on properly functioning servers are within the IBM operational range. But actual observation reveals that IBM sets threshold values for CPU and DIMM temps that cause a shutdown. This is proved by the fact that there's a firmware update to fix shutdowns caused by a DIMM temperature threshold that was set too low. This value is below the CPUs' and DIMMs' max temp limit. Someone at IBM must know what these values are. On my problem system, the 'system' temp is below threshold and the TEMP LED is not currently lit, even though my CPUs and DIMMs are hot enough to cause problems and hurt performance. My guess is 50°C for the CPUs, but that's a guess, and I have no idea for the DIMMs. What are the numbers?

And another issue: the Broadcom network chipsets, as well as the chipsets on the installed QLogic HBAs, have a maximum operating temperature of about 35°C. It's all fine and good for the CPUs and DIMMs to take a sauna bath, but the other components located right next to them in the server can't handle the heat they give off.

I know, just open up the box and fix the airflow baffles. That's coming during our Sunday maintenance window. But I plan to check all of our thousands of IBM servers for CPU and DIMM temps, even though none show a TEMP indicator issue. IBM sets the operational temp values to ensure performance and reliability. I suspect that undetected CPU and DIMM heat issues are causing a huge performance hit in my environment.
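
A fleet-wide check like the one described could start from something like the Python sketch below. It parses text in the pipe-delimited layout that `ipmitool sdr type Temperature` typically prints (name | id | status | entity | reading) and lists the sensors above a caller-chosen limit. The sample lines, the sensor names, and the 42°C limit (the upper end seen on healthy servers in this thread) are illustrative assumptions; real sensor names and output vary by model and firmware.

```python
import re

# Matches readings such as "46 degrees C" in the last column.
_READING = re.compile(r"(-?\d+(?:\.\d+)?)\s*degrees C")


def hot_sensors(sdr_text, limit_c):
    """Return [(sensor_name, temp_c)] for every reading above limit_c."""
    hot = []
    for line in sdr_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5:
            continue  # not a sensor row
        match = _READING.search(fields[-1])
        if match is None:
            continue  # e.g. "No Reading" or "Disabled"
        temp_c = float(match.group(1))
        if temp_c > limit_c:
            hot.append((fields[0], temp_c))
    return hot


# Hypothetical sample output; names and IDs are made up for the example.
SAMPLE = """\
Ambient Temp     | 32h | ok  | 12.1 | 24 degrees C
CPU 1 Temp       | 33h | ok  |  3.1 | 46 degrees C
DIMM 4 Temp      | 41h | ok  |  8.4 | 60 degrees C
DIMM 5 Temp      | 42h | ns  |  8.5 | No Reading
"""

if __name__ == "__main__":
    for name, temp in hot_sensors(SAMPLE, limit_c=42.0):
        print(f"{name}: {temp:.0f} C exceeds limit")
```

In practice the text would come from running the command on each host (for example via `subprocess.run` with `capture_output=True, text=True`), and the limit would be whatever value IBM's documentation or support confirms.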

So simply, what are the over-temp threshold values that cause a shutdown in IBM x series servers, dadgummit?

- gurutc
 
gheist Commented:
The top-end CPUs bring the ambient temperature threshold down to 25°C.
 
gheist Commented:
It is in the service guide...
Question has a verified solution.
