Solved

ESX 5.0 host disconnected from vCenter 5.0, troubleshooting and RCA

Posted on 2013-10-24
Last Modified: 2014-01-14
The other night a host disconnected from our vCenter server. I could still reach the host and its 4 VMs via remote console/RDP. It was then rebooted via ILO and became completely inaccessible. After pulling out and reseating the blade (BL460c), I was able to reconnect the host from vCenter and the VMs became accessible again. Are there good CLI commands for doing a root cause analysis? And is there anywhere other than ILO and the Tasks & Events tab to get troubleshooting information?

Restarting the CIM service and enabling SSH on the host are being blamed, but I think the disconnection was hardware related. So far I have looked at right click > "report performance..." and can only see when the host was disconnected. Same thing in Tasks & Events: I see when it lost connection ("host is not responding"), but that's it. The other place I looked was the management log in the ILO (HP ILO2), where I only found: POST Error: 1794-Drive Array - Array Accelerator Battery Charge Low. That entry is dated after the issue happened.
Question by:emjay180
6 Comments
 
LVL 1

Expert Comment

by:Avinash21
ID: 39598446
Hello. You say the host kept the VMs running and you were able to RDP to them. That points toward the host itself being fine, with only the management agents having stopped responding or crashed.

This could happen if you have had a storage-related problem. You may have hit an APD (All Paths Down) situation.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2004684


Thanks,
Avinash
 

Author Comment

by:emjay180
ID: 39598554
Basically, the host and the 4 VMs residing on it showed up as disconnected in vCenter, but the host and VMs themselves were still accessible via remote console/RDP. Then another member of the team opened an ILO console to the host and rebooted it. At that point nothing was accessible via ILO or RDP, and the host still displayed as disconnected in vCenter.

Only reseating the blade and powering it back on brought the host back online from a vCenter perspective.

Also, I suppose I should mention, the other 11 ESX hosts were fine and accessible from vCenter while this was going on.  

I also noticed that the local datastore for that host was inaccessible, but that would probably be expected.

Thanks!
 
LVL 123
ID: 39598652
If this happens again:

1. Restart vCenter Server Service.

2. On the ESXi Host - Restart Network Management Agents

Neither of the above will affect the running VMs on the host.
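
Before restarting anything, it can help to confirm whether the host's management ports are reachable at all from the vCenter server. The following is a minimal Python sketch of that check, not something from the original thread. It assumes TCP 443 and 902 are the ports vCenter uses toward the host in your environment; confirm that for your version. The function names are made up for illustration. vCenter also relies on host heartbeats over UDP 902, which this TCP check does not cover.

```python
import socket

# Assumption: ports vCenter is commonly documented to use toward an ESXi host.
# Confirm against the port list for your vSphere version.
PORTS = (443, 902)


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_host(host, ports=PORTS):
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Run from the vCenter server, `check_host("esx-host-name")` (a placeholder name) returns a dict such as `{443: True, 902: False}`. If both ports answer while the host still shows as disconnected, the management agents or the heartbeat path are more likely suspects than basic network reachability.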

 

Author Comment

by:emjay180
ID: 39600791
Thank you for the tips on how to remedy the situation next time. If I restart the vCenter service, would that negatively affect the rest of the hosts that are already connected? I'm thinking no.

My main original question: outside of the Tasks & Events tab and the ILO log, is there anywhere else on the BL460c HP blade where I can find more granular logs that may tell me how the issue occurred in the first place?

Thank you!
 

Author Comment

by:emjay180
ID: 39612609
Looks like it happened again. I followed the instructions hanccoka suggested and the host didn't reconnect. All servers are functioning and I can ILO into the ESXi box. Any suggestions? We're going to cold boot tonight, since that allowed vCenter to reconnect to the host last time. Thanks!

Did these too:  

1. Restart vCenter Server Service.
2. On the ESXi Host - Restart Network Management Agents
 
LVL 123

Accepted Solution

by: Andrew Hancock (VMware vExpert / EE MVE^2)
ID: 39612693
It's time to start looking through the logs:

1. vCenter Server logs
2. ESXi logs
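
As a starting point for the ESXi side, here is a minimal Python sketch (not part of the original answer) that pulls log lines close to the time of the disconnect. It assumes the usual ESXi 5.x log locations (`/var/log/vmkernel.log`, `/var/log/hostd.log`, `/var/log/vpxa.log`) and that lines begin with an ISO 8601 timestamp; check both on your own host. The keyword list and the function name are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumption: typical ESXi 5.x log paths; verify them on your host.
ESXI_LOGS = ["/var/log/vmkernel.log", "/var/log/hostd.log", "/var/log/vpxa.log"]

# Hypothetical search terms; adjust to whatever you are hunting for.
KEYWORDS = ("not responding", "lost", "error", "fail", "timeout", "apd")

# Assumption: each line starts with a timestamp like 2013-10-24T02:10:00.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})")


def lines_near(lines, incident, window_minutes=30, keywords=KEYWORDS):
    """Return lines stamped within the window around incident that contain a keyword."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        in_window = abs(stamp - incident) <= window
        if in_window and any(k in line.lower() for k in keywords):
            hits.append(line.rstrip("\n"))
    return hits
```

To use it, open each file in `ESXI_LOGS` (or copies pulled off the host) and pass the file object as `lines`, with `incident` set to the time vCenter reported "host is not responding". ESXi logs are normally stamped in UTC, so convert the time shown in Tasks & Events before comparing.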