amigan_99
asked on
Cisco Nexus 7k - two modules keep over-heating
I have a Nexus 7k and two modules are overheating. Module 1 got all the way to 119°C and
powered itself down. When module 1 went down, module 4 started heating up and got to
104°C, as you can see below. The data center people supposedly moved some insulation around
to get better airflow to it, but I'm not seeing an impact. I should probably open a TAC case, but I've
been up many hours and just hope to work on this again in the morning. Perhaps one of the
experts here might have another thought? Is there any way this could be the fault of the
equipment itself, given that the PS looks good, the fans look good, etc.?
4 MAC0Sn0(s2) 115 105 47 Ok
4 MAC0Sn1(s3) 115 105 48 Ok
4 MAC0-Buf0(s4) 115 105 55 Ok
4 MAC0-Buf1(s5) 115 105 56 Ok
4 MAC0-Buf2(s6) 115 105 68 Ok
4 MAC0-Buf3(s7) 115 105 70 Ok
4 MAC1Sn0(s8) 115 105 42 Ok
4 MAC1Sn1(s9) 115 105 44 Ok
4 MAC1-Buf0(s10) 115 105 47 Ok
4 MAC1-Buf1(s11) 115 105 67 Ok
4 MAC1-Buf2(s12) 115 105 50 Ok
4 MAC1-Buf3(s13) 115 105 44 Ok
4 Fwd0Sn0(s14) 115 105 87 Ok
4 Fwd0Sn1(s15) 115 105 87 Ok
4 Fwd1Sn0(s16) 115 105 85 Ok
4 Fwd1Sn1(s17) 115 105 85 Ok
4 Fwd2Sn0(s18) 115 105 82 Ok
4 Fwd2Sn1(s19) 115 105 82 Ok
4 Fwd3Sn0(s20) 115 105 61 Ok
4 Fwd3Sn1(s21) 115 105 61 Ok
4 QEng0Sn0(s22) 115 105 88 Ok
4 QEng0Sn1(s23) 115 105 88 Ok
4 QEng1Sn0(s24) 115 105 104 Ok
4 QEng1Sn1(s25) 115 105 104 Ok
4 QEng2Sn0(s26) 115 105 75 Ok
4 QEng2Sn1(s27) 115 105 75 Ok
4 QEng3Sn0(s28) 115 105 67 Ok
4 QEng3Sn1(s29) 115 105 67 Ok
4 Crossbar(s30) 115 105 61 Ok
4 LkU0Sn0(s31) 115 105 99 Ok
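With this many sensors it is easy to miss which ones are actually close to tripping. Below is a rough Python sketch that picks out the sensors running near a threshold. It assumes each row reads as module, sensor, two threshold values, current temperature, and status, and it measures against the lower of the two thresholds so it does not depend on their order. The function name `near_threshold` and the 10-degree margin are made up for illustration; the sample rows are copied from the table above.

```python
import re

# A few rows copied from the module 4 temperature table above.
SAMPLE = """\
4 Fwd0Sn0(s14) 115 105 87 Ok
4 QEng1Sn0(s24) 115 105 104 Ok
4 QEng1Sn1(s25) 115 105 104 Ok
4 Crossbar(s30) 115 105 61 Ok
4 LkU0Sn0(s31) 115 105 99 Ok
"""

# Assumed column layout: module, sensor, threshold, threshold, current, status.
ROW = re.compile(r"^(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)$")


def near_threshold(text, margin=10):
    """Return (module, sensor, current, headroom) for every sensor whose
    current reading is within `margin` degrees of the lower threshold."""
    hits = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # skip headers, blank lines, anything unexpected
        module, sensor, thr_a, thr_b, current, _status = match.groups()
        lower = min(int(thr_a), int(thr_b))
        headroom = lower - int(current)
        if headroom <= margin:
            hits.append((int(module), sensor, int(current), headroom))
    return hits


if __name__ == "__main__":
    for module, sensor, current, headroom in near_threshold(SAMPLE):
        print(f"module {module} {sensor}: {current} C, {headroom} C below threshold")
```

On the sample rows this flags both QEng1 sensors (1 degree of headroom) and LkU0Sn0 (6 degrees), which matches what stands out when reading the table by eye.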
Fan1(sys_fan1) N7K-C7018-FAN 1.0 Ok
Fan2(sys_fan2) N7K-C7018-FAN 1.0 Ok
Fan_in_PS1 -- -- Ok
Fan_in_PS2 -- -- Ok
Fan_in_PS3 -- -- Ok
Fan_in_PS4 -- -- Ok
Fan Zone Speed: Zone 1: 0x5f Zone 2: 0x30 Zone 3: 0x9f
Mod Ports Module-Type Model Status
--- ----- -------------------------- --------- ------------------ ----------
1 6 10/40 Gbps Ethernet Module N7K-M206FQ-23L powered-dn
show env power
Power Supply:
Voltage: 50 Volts
Power Actual Total
Supply Model Output Capacity Status
(Watts ) (Watts )
------- ------------------- ----------- ----------- --------------
1 N7K-AC-6.0KW 1143 W 6000 W Ok
2 N7K-AC-6.0KW 1156 W 6000 W Ok
3 N7K-AC-6.0KW 653 W 3000 W Ok
4 N7K-AC-6.0KW 1155 W 6000 W Ok
Actual Power
Module Model Draw Allocated Status
(Watts ) (Watts )
------- ------------------- ----------- ----------- --------------
1 N7K-M206FQ-23L N/A 0 W Powered-Dn
2 N7K-M108X2-12L 477 W 650 W Powered-Up
3 N7K-M108X2-12L 500 W 650 W Powered-Up
4 N7K-M224XP-23L 654 W 795 W Powered-Up
5 N7K-M224XP-23L 640 W 795 W Powered-Up
6 N7K-F248XP-25E 318 W 450 W Powered-Up
7 N7K-M206FQ-23L 625 W 795 W Powered-Up
8 N7K-M206FQ-23L 628 W 795 W Powered-Up
9 N7K-SUP2E 145 W 265 W Powered-Up
10 supervisor N/A 0 W Absent
Xb1 N7K-C7018-FAB-2 52 W 150 W Powered-Up
Xb2 N7K-C7018-FAB-2 57 W 150 W Powered-Up
Xb3 N7K-C7018-FAB-2 57 W 150 W Powered-Up
Xb4 N7K-C7018-FAB-2 51 W 150 W Powered-Up
Xb5 xbar N/A 150 W Absent
fan1 N7K-C7018-FAN 280 W 578 W Powered-Up
fan2 N7K-C7018-FAN 148 W 422 W Powered-Up
N/A - Per module power not available
Power Usage Summary:
--------------------
Power Supply redundancy mode (configured) PS-Redundant
Power Supply redundancy mode (operational) Non-Redundant
Total Power Capacity (based on configured mode) 18000 W
Total Power of all Inputs (cumulative) 21000 W
Total Power Output (actual draw) 4107 W
Total Power Allocated (budget) 7210 W
Total Power Available for additional modules 10790 W
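As a quick sanity check on the "PS looks good" impression, the power figures above can be cross-checked with a little arithmetic. This is only a sketch using numbers copied from the output; the variable names are made up for illustration, and the utilisation figure is simply actual draw divided by the configured capacity, not something the switch reports.

```python
# Figures copied from the power output above (all in watts).
supply_output_w = {1: 1143, 2: 1156, 3: 653, 4: 1155}
supply_capacity_w = {1: 6000, 2: 6000, 3: 3000, 4: 6000}
configured_capacity_w = 18000  # "Total Power Capacity (based on configured mode)"
allocated_w = 7210             # "Total Power Allocated (budget)"

# Sum of per-supply actual output; should match "Total Power Output".
total_draw_w = sum(supply_output_w.values())

# Sum of per-supply capacity; should match "Total Power of all Inputs".
total_inputs_w = sum(supply_capacity_w.values())

# Capacity minus budget; should match "Total Power Available".
available_w = configured_capacity_w - allocated_w

# Illustrative ratio only: how much of the configured capacity is in use.
utilisation = total_draw_w / configured_capacity_w

if __name__ == "__main__":
    print(f"total draw:   {total_draw_w} W")
    print(f"total inputs: {total_inputs_w} W")
    print(f"available:    {available_w} W")
    print(f"utilisation:  {utilisation:.1%}")
```

The three totals reproduce the 4107 W, 21000 W, and 10790 W lines in the summary, and the chassis is drawing well under a quarter of its configured capacity, so nothing in these numbers points at a power shortfall.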
ASKER
Consultants put in insulation to make things better - and made things sharply worse.
Insulation in the rack?
ASKER
That's what they tell me! They're in another state. I envisioned their putting a blanket around the 7k.
Wow! Insulation will make things worse. The kit emits heat. Are these consultants proper IT consultants?
ASKER
It sure did make things worse. You could see in Observium maps the exact moment they installed it. A real production risk.
Not to put too sharp a point on it, but...
What kind of idiot puts a blanket over an electrified, high-heat, omni-directional air-flow piece of equipment with the expectation that this will result in an environmental improvement? Were they concerned your switch was going to get too cold and the electrons would begin to slow down? Did they think the switch was being cooled with liquid nitrogen?
I'd do three things -
1. I would tell them they are buying me two new blades and they are on the hook for any other equipment failures in that rack for the next 2-3 years.
2. I'd tell them that I wasn't paying them a dime and in fact they may want to consider paying me hush money so that I don't drag them into court for negligence.
3. I'd call the fire marshal and have their business license pulled on the grounds of attempted arson.
Just my thoughts on the matter.
ASKER
4 QEng1Sn0(s24) 115 105 104 Ok
4 QEng1Sn1(s25) 115 105 104 Ok
Module 1 is still powered down. I'll spin up the data center guys. Looks like they didn't
do anything overnight.