Solved

ESX 5.0 host disconnected from vCenter 5.0, troubleshooting and RCA

Posted on 2013-10-24
Last Modified: 2014-01-14
The other night a host disconnected from our vCenter server. I was able to RDP to the host and its 4 VMs. It was rebooted via iLO and then became completely inaccessible. After pulling out and reseating the blade (BL460c), I was able to reconnect the host from vCenter and the VMs became accessible again. Are there good CLI commands for doing a root cause analysis? And is there anywhere other than iLO and the Tasks & Events tab to get troubleshooting information?

Restarting the CIM service and enabling SSH on the host are being blamed, but I think the disconnection was hardware related. So far I have looked at right-click > "Report Performance..." and can only see when the host was disconnected. Same thing in Tasks & Events: I see when it lost connection ("host is not responding") but that's it. The other place I looked was the management log in the iLO (HP iLO2), where I only found: POST Error: 1794-Drive Array - Array Accelerator Battery Charge Low. The date on that entry was after the issue happened.
Question by:emjay180
 
LVL 1

Expert Comment

by:Avinash21
ID: 39598446
Hello, you say the host kept the VMs running and you were able to RDP to them? That points toward the host itself being fine, with only the management agents having stopped responding or crashed.

This can happen if you have had a storage-related problem. You may have hit an APD (All Paths Down) situation.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684


Thanks,
Avinash
 

Author Comment

by:emjay180
ID: 39598554
Basically, the host and the 4 VMs residing on it showed up as disconnected in vCenter. But the host and VMs themselves were accessible via remote console/RDP. Then another member of the team opened an iLO console to the host and rebooted it. At that point nothing was accessible via iLO/RDP, and the host still displayed as disconnected in vCenter.

Only reseating the blade and powering it back on brought the host back online from a vCenter perspective.

Also, I suppose I should mention, the other 11 ESX hosts were fine and accessible from vCenter while this was going on.  

I did also notice the local datastore for that host was inaccessible, but that would probably be expected.

Thanks!
 
LVL 118
ID: 39598652
If this happens again:

1. Restart vCenter Server Service.

2. On the ESXi Host - Restart Network Management Agents

Neither of the above will affect the running VMs on the host.
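
Before restarting anything, it can help to check from the vCenter server whether the host's management ports still answer, since that separates a network problem from a hung agent. A minimal Python sketch, assuming the default ports (TCP 443 for hostd, TCP 902 for vpxa); the hostname in the comment is a placeholder, not a value from this thread:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False

def check_management_ports(host, ports=(443, 902)):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in ports}

# Example call (hypothetical hostname, replace with the affected ESXi host):
# check_management_ports("esx01.example.local")
```

If both ports answer but vCenter still shows the host as disconnected, the agents are the more likely suspect; if neither answers while the VMs stay reachable, look at the management network first.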

 

Author Comment

by:emjay180
ID: 39600791
Thank you for the tips on how to remedy the situation next time. If I restart the vCenter service, would that negatively affect the rest of the hosts that are already connected? I'm thinking no.

My main original question is: outside of the Tasks & Events tab and the iLO log, is there anywhere else on the BL460c HP blade where I can find more granular logs that may tell me how the issue occurred in the first place?

Thank you!
 

Author Comment

by:emjay180
ID: 39612609
Looks like it happened again. I followed the instructions hanccoka suggested and it didn't reconnect. All servers are functioning and I can iLO into the ESXi box. Any suggestions? We're going to cold boot tonight; that allowed vCenter to reconnect to the host last time. Thanks!

Did these too:  

1. Restart vCenter Server Service.
2. On the ESXi Host - Restart Network Management Agents
 
LVL 118

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE) earned 400 total points
ID: 39612693
It's time to start looking through the logs:

1. vCenter Server logs
2. ESXi logs
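
As a starting point: on ESXi 5.x the host logs live under /var/log (for example vmkernel.log, hostd.log and vpxa.log), and running vm-support on the host produces a bundle you can copy off. On a Windows vCenter 5.0 server the vpxd logs are typically under C:\ProgramData\VMware\VMware VirtualCenter\Logs. A rough Python sketch for narrowing those files down to the minutes around the disconnect follows. It assumes each line starts with an ISO 8601 timestamp, which is how ESXi 5.x writes them; the keyword list, file name and event time are illustrative guesses, not values from this host.

```python
from datetime import datetime, timedelta

# Illustrative keywords only; adjust to what you are hunting for.
KEYWORDS = ("apd", "all paths down", "lost connection", "not responding", "heartbeat")

def parse_timestamp(line):
    """Parse a leading 'YYYY-MM-DDTHH:MM:SS' timestamp, or return None."""
    try:
        return datetime.strptime(line[:19], "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None

def suspicious_lines(lines, event_time, window_minutes=15, keywords=KEYWORDS):
    """Return lines within the window around event_time that contain a keyword."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        stamp = parse_timestamp(line)
        # Skip lines without a timestamp or outside the time window.
        if stamp is None or abs(stamp - event_time) > window:
            continue
        if any(word in line.lower() for word in keywords):
            hits.append(line.rstrip("\n"))
    return hits

# Example (hypothetical file copy and event time):
# with open("vmkernel.log") as f:
#     for hit in suspicious_lines(f, datetime(2013, 10, 23, 2, 0)):
#         print(hit)
```

Take the event time from the "host is not responding" entry in Tasks & Events, and run the same filter over vmkernel.log, hostd.log and vpxa.log so storage, agent and vCenter-side messages can be lined up against each other.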