amigan_99 asked:

Cisco Nexus 7k - two modules keep over-heating

I have a Nexus 7k and two modules are overheating. Module 1 got all the way to 119 °C and
powered itself down. When module 1 went down, module 4 decided to heat up and got to
104 °C, as you can see below. The data center people supposedly moved around some insulation
to get better air to it, but I'm not seeing an impact. I should probably open a TAC case, but I've
been up many hours and just hope to work on this again in the morning. Perhaps one of the
experts here might have another thought? Is there any way this could be the fault of the
equipment, given that the PS looks good, the fans look good, etc.?

4        MAC0Sn0(s2)     115             105         47         Ok
4        MAC0Sn1(s3)     115             105         48         Ok
4        MAC0-Buf0(s4)   115             105         55         Ok
4        MAC0-Buf1(s5)   115             105         56         Ok
4        MAC0-Buf2(s6)   115             105         68         Ok
4        MAC0-Buf3(s7)   115             105         70         Ok
4        MAC1Sn0(s8)     115             105         42         Ok
4        MAC1Sn1(s9)     115             105         44         Ok
4        MAC1-Buf0(s10)  115             105         47         Ok
4        MAC1-Buf1(s11)  115             105         67         Ok
4        MAC1-Buf2(s12)  115             105         50         Ok
4        MAC1-Buf3(s13)  115             105         44         Ok
4        Fwd0Sn0(s14)    115             105         87         Ok
4        Fwd0Sn1(s15)    115             105         87         Ok
4        Fwd1Sn0(s16)    115             105         85         Ok
4        Fwd1Sn1(s17)    115             105         85         Ok
4        Fwd2Sn0(s18)    115             105         82         Ok
4        Fwd2Sn1(s19)    115             105         82         Ok
4        Fwd3Sn0(s20)    115             105         61         Ok
4        Fwd3Sn1(s21)    115             105         61         Ok
4        QEng0Sn0(s22)   115             105         88         Ok
4        QEng0Sn1(s23)   115             105         88         Ok
4        QEng1Sn0(s24)   115             105         104        Ok
4        QEng1Sn1(s25)   115             105         104        Ok
4        QEng2Sn0(s26)   115             105         75         Ok
4        QEng2Sn1(s27)   115             105         75         Ok
4        QEng3Sn0(s28)   115             105         67         Ok
4        QEng3Sn1(s29)   115             105         67         Ok
4        Crossbar(s30)   115             105         61         Ok
4        LkU0Sn0(s31)    115             105         99         Ok

Fan1(sys_fan1)  N7K-C7018-FAN        1.0        Ok
Fan2(sys_fan2)  N7K-C7018-FAN        1.0        Ok
Fan_in_PS1      --                   --         Ok
Fan_in_PS2      --                   --         Ok
Fan_in_PS3      --                   --         Ok
Fan_in_PS4      --                   --         Ok
Fan Zone Speed: Zone 1: 0x5f Zone 2: 0x30 Zone 3: 0x9f

Mod  Ports  Module-Type                         Model              Status
---  -----  ----------------------------------- ------------------ ----------
1    6      10/40 Gbps Ethernet Module          N7K-M206FQ-23L     powered-dn

sho env power
Power Supply:
Voltage: 50 Volts
Power                              Actual        Total
Supply    Model                    Output     Capacity    Status
                                 (Watts )     (Watts )
-------  -------------------  -----------  -----------  --------------
1        N7K-AC-6.0KW              1143 W       6000 W     Ok
2        N7K-AC-6.0KW              1156 W       6000 W     Ok
3        N7K-AC-6.0KW               653 W       3000 W     Ok
4        N7K-AC-6.0KW              1155 W       6000 W     Ok


                                  Actual        Power
Module    Model                     Draw    Allocated    Status
                                 (Watts )     (Watts )
-------  -------------------  -----------  -----------  --------------
1        N7K-M206FQ-23L             N/A            0 W    Powered-Dn
2        N7K-M108X2-12L             477 W        650 W    Powered-Up
3        N7K-M108X2-12L             500 W        650 W    Powered-Up
4        N7K-M224XP-23L             654 W        795 W    Powered-Up
5        N7K-M224XP-23L             640 W        795 W    Powered-Up
6        N7K-F248XP-25E             318 W        450 W    Powered-Up
7        N7K-M206FQ-23L             625 W        795 W    Powered-Up
8        N7K-M206FQ-23L             628 W        795 W    Powered-Up
9        N7K-SUP2E                  145 W        265 W    Powered-Up
10       supervisor                 N/A            0 W    Absent
Xb1      N7K-C7018-FAB-2             52 W        150 W    Powered-Up
Xb2      N7K-C7018-FAB-2             57 W        150 W    Powered-Up
Xb3      N7K-C7018-FAB-2             57 W        150 W    Powered-Up
Xb4      N7K-C7018-FAB-2             51 W        150 W    Powered-Up
Xb5      xbar                       N/A          150 W    Absent
fan1     N7K-C7018-FAN              280 W        578 W    Powered-Up
fan2     N7K-C7018-FAN              148 W        422 W    Powered-Up

N/A - Per module power not available


Power Usage Summary:
--------------------
Power Supply redundancy mode (configured)                PS-Redundant
Power Supply redundancy mode (operational)               Non-Redundant

Total Power Capacity (based on configured mode)              18000 W
Total Power of all Inputs (cumulative)                       21000 W
Total Power Output (actual draw)                              4107 W
Total Power Allocated (budget)                                7210 W
Total Power Available for additional modules                 10790 W
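
A quick sanity check of the power summary, as a minimal Python sketch. The figures are copied from the output above; the variable names are only illustrative. It confirms that the totals are consistent with the per-supply rows, and it flags that supply 3 reports half the capacity of the others, which may be worth a separate look given that the operational redundancy mode shows Non-Redundant.

```python
# Figures copied from the "env power" output above.
# supply number -> (actual output in W, total capacity in W)
supplies = {
    1: (1143, 6000),
    2: (1156, 6000),
    3: (653, 3000),
    4: (1155, 6000),
}
configured_capacity = 18000  # "Total Power Capacity (based on configured mode)"
allocated_budget = 7210      # "Total Power Allocated (budget)"

total_output = sum(out for out, _ in supplies.values())
total_inputs = sum(cap for _, cap in supplies.values())
available = configured_capacity - allocated_budget

print(total_output)  # 4107, matches "Total Power Output (actual draw)"
print(total_inputs)  # 21000, matches "Total Power of all Inputs (cumulative)"
print(available)     # 10790, matches "Total Power Available for additional modules"

# Supplies reporting less capacity than the largest one in the chassis.
largest = max(cap for _, cap in supplies.values())
reduced = [n for n, (_, cap) in supplies.items() if cap < largest]
print(reduced)       # [3]
```

With roughly 4.1 kW drawn against 18 kW of configured capacity, the numbers give no sign that power is the limiting factor here.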
ASKER CERTIFIED SOLUTION by atlas_shuddered

This solution is only available to members.
SOLUTION
This solution is only available to members.
amigan_99 (ASKER)

Thanks. For module 4 it looks like two objects out of 29 are hot?

4        QEng1Sn0(s24)   115             105         104        Ok
4        QEng1Sn1(s25)   115             105         104        Ok
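
One way to answer that without eyeballing the table is to compute each sensor's headroom below the minor threshold. The sketch below is only an illustration: the column order (module, sensor, major threshold, minor threshold, current temperature, status) is an assumption, since the header line was not included in the paste, and `hot_sensors` and its `margin` parameter are made-up names. The sample rows are copied from the module 4 output in the question.

```python
# A few module 4 rows copied from the question. Assumed column order:
# module, sensor, major threshold, minor threshold, current temp, status.
rows = """\
4        Fwd0Sn0(s14)    115             105         87         Ok
4        QEng0Sn0(s22)   115             105         88         Ok
4        QEng1Sn0(s24)   115             105         104        Ok
4        QEng1Sn1(s25)   115             105         104        Ok
4        LkU0Sn0(s31)    115             105         99         Ok
"""

def hot_sensors(text, margin=10):
    """Return (sensor, current, headroom) for rows within `margin` degrees
    of the minor threshold, closest to the threshold first."""
    hot = []
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not a six-column row with numeric readings.
        if len(parts) != 6 or not all(p.isdigit() for p in parts[2:5]):
            continue
        _module, sensor, _major, minor, current, _status = parts
        headroom = int(minor) - int(current)
        if headroom <= margin:
            hot.append((sensor, int(current), headroom))
    return sorted(hot, key=lambda r: r[2])

for sensor, current, headroom in hot_sensors(rows):
    print(f"{sensor}: {current} C, {headroom} C below minor threshold")
```

On these rows it lists the two QEng1 sensors at 1 degree of headroom and also LkU0Sn0(s31) at 6 degrees, so a third sensor is not far behind the two quoted above.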

Module 1 is powered down. I'll spin up the data center guys. Looks like they didn't
do anything overnight.
Consultants put in insulation to make things better - and made things sharply worse.
Insulation in the rack?
That's what they tell me! They're in another state. I envisioned their putting a blanket around the 7k.
Wow! Insulation will make things worse. The kit emits heat. Are these consultants proper IT consultants?
It sure did make things worse. You could see in Observium maps the exact moment they installed it. A real production risk.
Not to put too fine a point on it, but....

What kind of idiot puts a blanket over an electrified, high-heat, omni-directional air-flow piece of equipment with the expectation that this will result in an environmental improvement?  Were they concerned your switch was going to get too cold and the electrons would begin to slow down?  Did they think the switch was being cooled with liquid nitrogen?

I'd do three things -

1.  I would tell them they are buying me two new blades and they are on the hook for any other equipment failures in that rack for the next 2-3 years.
2.  I'd tell them that I wasn't paying them a dime and in fact they may want to consider paying me hush money so that I don't drag them into court for negligence.
3.  I'd call the fire marshal and have their business license pulled on the grounds of attempted arson.

Just my thoughts on the matter.