Solved

ESX 5.0 host disconnected from vCenter 5.0, troubleshooting and RCA

Posted on 2013-10-24
Last Modified: 2014-01-14
The other night a host disconnected from our vCenter server. I was still able to RDP to the host and the 4 VMs on it. The host was then rebooted via ILO and became completely inaccessible. After pulling out and reseating the blade (BL460c), I was able to reconnect the host from vCenter and the VMs became accessible again. Are there good CLI commands for doing a root cause analysis? And is there anywhere other than ILO and the Tasks & Events tab to get troubleshooting information?

Restarting the CIM service and enabling SSH on the host are being blamed, but I think the disconnection was hardware related. So far I have looked at the performance report (right-click, "report performance...") and can only see when the host was disconnected. Same thing in Tasks & Events: I see when it lost connection ("host is not responding"), but that's it. The other place I looked was the management log in the ILO (HP ILO2), where I only found: POST Error: 1794-Drive Array - Array Accelerator Battery Charge Low. That entry is dated after the issue happened.
Question by:emjay180
LVL 1

Expert Comment

by:Avinash21
ID: 39598446
Hello. You say the host kept the VMs running and you were able to RDP to them? That points to the host itself being fine, with only the management agents having stopped responding or crashed.

This can happen if you have had a storage-related problem. You may have hit an APD (All Paths Down) situation.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684


Thanks,
Avinash
 

Author Comment

by:emjay180
ID: 39598554
Basically, the host and the 4 VMs residing on it showed up as disconnected in the vCenter inventory, but the host and VMs themselves were accessible via remote console/RDP. Then another member of the team opened an ILO console to the host and rebooted it. At that point nothing was accessible via ILO or RDP, and the host still displayed as disconnected in vCenter.

Only reseating the blade and powering it back on brought the host back online from a vCenter perspective.

Also, I suppose I should mention, the other 11 ESX hosts were fine and accessible from vCenter while this was going on.  

I also noticed that the local datastore for that host was inaccessible, but that would probably be expected.

Thanks!
 
LVL 120
ID: 39598652
If this happens again:

1. Restart vCenter Server Service.

2. On the ESXi Host - Restart Network Management Agents

Neither of the above will affect the running VMs on the host.

Author Comment

by:emjay180
ID: 39600791
Thank you for the tips on how to remedy the situation next time. If I restart the vCenter service, would that negatively affect the rest of the hosts that are already connected? I'm thinking no.

My main original question is: outside of the Tasks & Events tab and the ILO log, is there anywhere else on the HP BL460c blade where I can find more granular logs that may tell me how the issue occurred in the first place?

Thank you!
 

Author Comment

by:emjay180
ID: 39612609
Looks like it happened again. I followed the instructions hanccoka suggested and the host didn't reconnect. All servers are functioning and I can ILO into the ESXi box. Any suggestions? We're going to cold boot tonight; that allowed vCenter to reconnect to the host last time. Thanks!

I did these too:

1. Restart vCenter Server Service.
2. On the ESXi Host - Restart Network Management Agents
 
LVL 120

Accepted Solution

by:Andrew Hancock (VMware vExpert / EE MVE^2)
ID: 39612693
It's time to start looking through the logs:

1. vCenter Server logs
2. ESXi logs
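
For the ESXi side, here is a minimal sketch of one way to narrow things down (this is an illustration, not something from the original answer). On ESXi 5.x the host logs normally live under /var/log (for example vmkernel.log, hostd.log and vpxa.log), and their lines usually begin with an ISO 8601 UTC timestamp. Assuming that format, and assuming you have copied a log file off the host, the hypothetical helper `lines_near` below keeps only the entries within a few minutes of the time Tasks & Events reported "host is not responding". The function names, the 10-minute default and the script layout are all my own choices.

```python
import sys
from datetime import datetime, timedelta, timezone

# Assumed timestamp layouts for the first token of each log line
# (with and without fractional seconds, always UTC "Z").
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_ts(line):
    """Return the leading UTC timestamp of a log line, or None if absent."""
    token = line.split(" ", 1)[0]
    for fmt in _FORMATS:
        try:
            return datetime.strptime(token, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    return None


def lines_near(lines, incident, minutes=10):
    """Keep lines stamped within +/- `minutes` of `incident` (tz-aware UTC)."""
    window = timedelta(minutes=minutes)
    kept = []
    for line in lines:
        ts = parse_ts(line)
        # Lines without a parsable timestamp (e.g. continuations) are skipped.
        if ts is not None and abs(ts - incident) <= window:
            kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python lines_near.py <copied-log-file> <incident time, e.g. YYYY-MM-DDTHH:MM:SSZ>
    incident = parse_ts(sys.argv[2])
    if incident is None:
        sys.exit("incident time must look like YYYY-MM-DDTHH:MM:SSZ")
    with open(sys.argv[1], errors="replace") as fh:
        for entry in lines_near(fh, incident):
            print(entry)
```

Running it against each copied log with the disconnect time from Tasks & Events (converted to UTC) gives a short slice to read through, which is easier than paging through whole files. If the timestamps in your logs look different, the formats in `_FORMATS` would need adjusting.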
Question has a verified solution.
