Solved

ESX 5.0 host disconnected from vCenter 5.0, troubleshooting and RCA

Posted on 2013-10-24
Last Modified: 2014-01-14
The other night a host disconnected from our vCenter server. I was still able to reach the host and its 4 VMs by remote console/RDP. It was rebooted via iLO and then became completely inaccessible. After pulling out and reseating the blade (BL460c), I was able to reconnect the host from vCenter and the VMs became accessible again. Are there good CLI commands for doing a root cause analysis? And is there anywhere other than iLO and the Tasks & Events tab to get troubleshooting information?

Restarting the CIM service and enabling SSH on the host are being blamed, but I think the disconnection was hardware related. So far I have looked at the right-click "Report Performance..." option, which only shows when the host was disconnected. Tasks & Events is the same: I see when it lost connection ("host is not responding") but that's it. The other place I looked was the management log in the iLO (HP iLO2), where I only found: POST Error: 1794-Drive Array - Array Accelerator Battery Charge Low. That entry is dated after the issue happened.
Question by:emjay180
6 Comments
 
LVL 1

Expert Comment

by:Avinash21
ID: 39598446
Hello. When you say the host kept the VMs running and you were able to RDP to them, that points to the host itself being fine, with just the management agents having stopped responding or crashed.

This can happen if you have had a storage-related problem. You may have hit an APD (All Paths Down) situation.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684


Thanks,
Avinash
 

Author Comment

by:emjay180
ID: 39598554
Basically, the host and the 4 VMs residing on it showed up as disconnected in the vCenter inventory. But the host and VMs themselves were accessible via remote console/RDP. Then another member of the team opened an iLO console to the host and rebooted it. At that point nothing was accessible via iLO/RDP, and the host still displayed as disconnected in vCenter.

Only reseating the blade and powering it back on brought the host back online from a vCenter perspective.

Also, I suppose I should mention, the other 11 ESX hosts were fine and accessible from vCenter while this was going on.  

I also noticed that the local datastore for that host was inaccessible, but that would probably be expected.

Thanks!
 
LVL 119
ID: 39598652
If this happens again:

1. Restart vCenter Server Service.

2. On the ESXi Host - Restart Network Management Agents

Neither of the above will affect the running VMs on the host.
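
Before or after restarting the agents, it can help to confirm from the vCenter side whether the host's management ports answer at all. What follows is only a minimal sketch, not a VMware tool: it assumes a default configuration in which the host listens on TCP 443 and TCP 902, and the function names are made up for illustration. It does not test the UDP 902 heartbeat traffic that the host sends to vCenter.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_management_ports(host, ports=(443, 902), timeout=3.0):
    """Map each port to True/False depending on whether it accepted a connection.

    The default ports are an assumption about a default ESXi setup; pass your
    own tuple if the environment uses different ones.
    """
    return {port: port_open(host, port, timeout) for port in ports}
```

Run from the vCenter server, something like `print(check_management_ports("esx07.example.local"))` (hypothetical host name) shows at a glance whether the problem is plain network reachability or the agents themselves. If both ports answer but the host still shows as not responding, the agent logs are the next place to look.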

 

Author Comment

by:emjay180
ID: 39600791
Thank you for the tips on how to remedy the situation next time. If I restart the vCenter service, would that negatively affect the rest of the hosts that are already connected? I'm thinking no.

My main original question is: outside of the Tasks & Events tab and the iLO log, is there anywhere else on the BL460c HP blade where I can find more granular logs that may tell me how the issue occurred in the first place?

Thank you!
 

Author Comment

by:emjay180
ID: 39612609
Looks like it happened again. I followed the instructions hanccoka suggested and it didn't reconnect. All servers are functioning and I can get to the ESXi box via iLO. Any suggestions? We're going to cold boot tonight; that allowed vCenter to reconnect to the host last time. Thanks!

Did these too:  

1. Restart vCenter Server Service.
2. On the ESXi Host - Restart Network Management Agents
 
LVL 119

Accepted Solution

by:
Andrew Hancock (VMware vExpert / EE MVE^2) earned 400 total points
ID: 39612693
It's time to start looking through the logs:

1. vCenter Server logs
2. ESXi logs
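
One way to work through those logs is to copy them off the host (on ESXi 5.x the usual candidates under /var/log are hostd.log, vpxa.log, vmkernel.log, vmkwarning.log and vobd.log) and filter them down to the minutes around the disconnect. The sketch below is only an illustration of that idea. It assumes each log line starts with an ISO 8601 UTC timestamp, and the default keywords are guesses at useful search terms, not known log messages. The function names are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Assumption: lines begin with a timestamp such as 2013-10-23T02:14:07.123Z
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")

# Illustrative search terms only; adjust to whatever you are hunting for.
DEFAULT_KEYWORDS = ("error", "warning", "lost", "timeout", "heartbeat", "apd")


def lines_near(lines, event_time, window_minutes=15, keywords=DEFAULT_KEYWORDS):
    """Yield (timestamp, line) for keyword hits within the window around event_time."""
    window = timedelta(minutes=window_minutes)
    lowered = [k.lower() for k in keywords]
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # lines without a leading timestamp are skipped
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if abs(stamp - event_time) > window:
            continue
        text = line.lower()
        if any(k in text for k in lowered):
            yield stamp, line.rstrip("\n")


def scan_files(paths, event_time, **kwargs):
    """Collect hits from several log files, sorted by timestamp."""
    hits = []
    for path in paths:
        with open(path, errors="replace") as handle:
            for stamp, line in lines_near(handle, event_time, **kwargs):
                hits.append((stamp, path, line))
    return sorted(hits)
```

Note that ESXi writes its log timestamps in UTC, while the Tasks & Events tab shows local time, so convert the "host is not responding" time to UTC before passing it in as `event_time`. Sorting hits from several files into one timeline makes it easier to see whether storage, network, or agent messages came first.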