No Connection Light on Switch

Hi everyone. We are having a strange VMware problem here and are looking for ideas.

We are running vSAN with two hosts and a witness. For the vSAN portion, each node is connected to a Cisco 3850 via a pair of twinax cables. On one node there are no lights on the switch where the twinax cables connect. This was working before. I have tried the cables elsewhere and they work. I have tried other ports on the 3850 and they don't work either. Dell just replaced the network card and it didn't help. On the switch, "sh int status" shows the twinax is plugged in, but not connected. Same for the server.

VMware shows nothing unusual other than the connection is down.

Again, this was working. Problem is only one node. The other one is fine as is the witness.

Anything else?
Brian BEE (Topic Advisor, Independent Technology Professional) Asked:

Andrew Hancock (VMware vExpert / EE MVE^2, VMware and Virtualization Consultant) Commented:
We've got a switch here that does the same thing, and it has something to do with the network interface bringing up the connection to the physical port.

Are ALL the hardware components on the HCL, and is the firmware updated?

I would try a replacement switch, NICs and cables.
Brian BEE (Topic Advisor, Independent Technology Professional) Author Commented:
"Are ALL the hardware components on the HCL, and firmware updated? I would try replacement switch, nics and cables"

Yes, all on the HCL. Firmware, yes. I actually had to update the firmware in order to get the switch virtual stacking working on these 3850s. NIC was just replaced, cables I confirmed work elsewhere. Replacement switch? We had enough trouble getting the two that we have. I guess I could swap the switches. Since they are virtually stacked and everything plugs into the same ports on both switches, it *should* be plug and play, but I'll have to schedule an outage.

Going to try a couple more cable/port combinations first though.

Any other ideas still welcome!
Andrew Hancock (VMware vExpert / EE MVE^2, VMware and Virtualization Consultant) Commented:
Not a solution, and I'm not sure why, but with ours, if we unplug the cable and plug it back in, the link is sometimes established!
Brian BEE (Topic Advisor, Independent Technology Professional) Author Commented:
So a couple more items we checked...

Plugged both ends of the twinax into the server ports. That produced green connection lights, so the NIC is most likely good.
Plugged the twinax in between an unused port on the 2960 and the 3850. That didn't work, so the problem is most likely the 3850.

Finally, we took the whole ESX host, cables and all, over to the other server room and plugged it into the same switches as our other ESX host. Success! All the lights came on as expected. No geo redundancy of course, but at least we have server redundancy again.

Those facts were enough to convince Cisco to RMA the switch.
Brian BEE (Topic Advisor, Independent Technology Professional) Author Commented:
Thanks for the information Andy. For some reason, I don't seem to be able to get the point slider to work. Hope it works this time.
Brian BEE (Topic Advisor, Independent Technology Professional) Author Commented:
Further information... The replacement switch didn't work either. We tried troubleshooting over several days. The third Cisco tech I spoke to finally noticed a problem in the output of these commands:

Switch01#show redundancy
Redundant System Information :
------------------------------
       Available system uptime = 4 weeks, 2 days, 13 hours, 17 minutes
Switchovers system experienced = 0
              Standby failures = 1
        Last switchover reason = none

                 Hardware Mode = Simplex
    Configured Redundancy Mode = sso
     Operating Redundancy Mode = Non-redundant
              Maintenance Mode = Disabled
                Communications = Down      Reason: Failure

Switch01#show switch
Switch/Stack Mac Address : ******* - Local Mac Address
Mac persistency wait time: Indefinite
                                             H/W   Current
Switch#   Role    Mac Address     Priority Version  State
-------------------------------------------------------------------------------------
1       Standby  *****     1      V02     HA sync in progress
*2       Active   *****     1      V02     Ready

... So the standby switch (Switch#1 in the output above) had been stuck in "HA sync in progress" for over a day and never came fully online. The solution was to change which switch was active:
redundancy force-switchover
... This caused the switches to reboot. Switch#1 came up as the master, and after a short delay everything started working normally. All the connected ports showed connections.
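For anyone who wants to catch this state without eyeballing the console, here is a rough Python sketch that scans pasted "show switch" output for stack members whose state is not "Ready". The function name and regular expression are my own (hypothetical) and the sample text is just the table above with the MAC addresses still masked; real output may need a looser pattern.

```python
import re

# The member table from "show switch", copied from the post (MACs masked).
SHOW_SWITCH = """\
Switch#   Role    Mac Address     Priority Version  State
-------------------------------------------------------------------------------------
1       Standby  *****     1      V02     HA sync in progress
*2       Active   *****     1      V02     Ready
"""

# Optional "*" marker, switch number, role, MAC, priority, H/W version,
# then the state, which may contain spaces.
ROW = re.compile(r"^\s*(\*?)\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\S+)\s+(.+?)\s*$")


def not_ready_members(output):
    """Return (switch number, role, state) for every member not in 'Ready'."""
    stuck = []
    for line in output.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, separator, or blank line
        _marker, number, role, _mac, _priority, _hw, state = match.groups()
        if state != "Ready":
            stuck.append((int(number), role, state))
    return stuck


if __name__ == "__main__":
    for number, role, state in not_ready_members(SHOW_SWITCH):
        print(f"Switch {number} ({role}) is not ready: {state}")
```

A member that stays in a state other than "Ready" over repeated checks would be the thing to raise with Cisco; how long is "too long" is a judgment call, not something this sketch decides.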

So Andy, to your earlier comment, this may be why restarting the problem switch sometimes fixes the problem. Perhaps the master switch in the stack changes?

According to the Cisco tech, she had seen this happen before. So maybe it's documented out there somewhere already, but I never found it. I'm going to try and change this to the solution.
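The "show redundancy" output above carries the same warning in a different form. Below is a small Python sketch that splits the "name = value" lines into a dictionary and flags two things: the operating mode not matching the configured mode, and communications not being up. Treating those two fields as the warning signs is my assumption based on this one incident, and the function names are hypothetical.

```python
# Field lines from "show redundancy", copied from the post.
SHOW_REDUNDANCY = """\
       Available system uptime = 4 weeks, 2 days, 13 hours, 17 minutes
Switchovers system experienced = 0
              Standby failures = 1
        Last switchover reason = none

                 Hardware Mode = Simplex
    Configured Redundancy Mode = sso
     Operating Redundancy Mode = Non-redundant
              Maintenance Mode = Disabled
                Communications = Down      Reason: Failure
"""


def parse_fields(output):
    """Turn 'name = value' lines into a dict, ignoring everything else."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            # Collapse runs of spaces inside the value for readability.
            fields[key.strip()] = " ".join(value.split())
    return fields


def redundancy_warnings(output):
    """Return human-readable warnings (assumed heuristics, see lead-in)."""
    fields = parse_fields(output)
    warnings = []
    configured = fields.get("Configured Redundancy Mode", "")
    operating = fields.get("Operating Redundancy Mode", "")
    if configured.lower() != operating.lower():
        warnings.append(
            f"configured mode {configured!r} but operating mode {operating!r}"
        )
    comms = fields.get("Communications", "")
    if not comms.lower().startswith("up"):
        warnings.append(f"communications: {comms}")
    return warnings


if __name__ == "__main__":
    for warning in redundancy_warnings(SHOW_REDUNDANCY):
        print("WARNING:", warning)
```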
Brian BEE (Topic Advisor, Independent Technology Professional) Author Commented:
Sorry for the confusion. Further information shows the above was the solution.
