Solved

VMware ESXi 5 Host Not Responding

Posted on 2014-07-30
Last Modified: 2014-08-30
I have had this problem before and it has improved since I replaced the HDDs in the shared storage devices (iSCSI) I am using.  However, today this happened again. I am wondering why I can't ping the host even though I know the host is powered on.  I have to power cycle the host to get it online again.

I am not sure exactly how the heartbeat works, but from what I've read, each host sends a heartbeat to vCenter Server every 10 seconds.  If vCenter receives no heartbeat for 60 seconds, the host is marked as not responding and drops out of the cluster.

Any ideas as to why the host is "not responding" to anything, even a ping?
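
The heartbeat timing described above can be sketched as a toy model. This is only an illustration of the 10-second interval and 60-second timeout quoted in the question, not VMware's actual implementation; the class name `HostHeartbeatMonitor` and its methods are hypothetical.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 10  # interval quoted in the question
HEARTBEAT_TIMEOUT_S = 60   # silence window quoted in the question


@dataclass
class HostHeartbeatMonitor:
    """Toy model of a manager tracking one host's heartbeats (hypothetical)."""

    last_seen_s: float = 0.0

    def record_heartbeat(self, now_s: float) -> None:
        # Remember when the most recent heartbeat arrived.
        self.last_seen_s = now_s

    def status(self, now_s: float) -> str:
        # The host is flagged only after more than 60 s of silence.
        silent_for_s = now_s - self.last_seen_s
        if silent_for_s > HEARTBEAT_TIMEOUT_S:
            return "not responding"
        return "connected"


monitor = HostHeartbeatMonitor()
# The host sends heartbeats at t = 0, 10, 20, 30 s and then hangs.
for t in range(0, 31, HEARTBEAT_INTERVAL_S):
    monitor.record_heartbeat(t)

print(monitor.status(80))   # 50 s of silence
print(monitor.status(100))  # 70 s of silence
```

Note that a timeout like this only says that heartbeats stopped arriving; it does not say why, so it cannot by itself distinguish a hung management agent from a network or storage problem.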
Question by:IKtech
13 Comments
 
LVL 121
ID: 40230646
If it's the iSCSI issue we've seen, the ESXi host OS gets hung up repeatedly polling the iSCSI datastore; this thread takes up all the CPU time, resulting in a non-responsive ESXi OS.

What version of ESXi 5.x are you using?

Also what SAN (iSCSI) are you using, and is it on the HCL?

Non-compatible iSCSI devices and homebrew SANs seem to have issues with this.
 
LVL 3

Author Comment

by:IKtech
ID: 40230663
ESXi 5.0.0 build 768111

QNAP NAS devices.  These devices are sold as VMware compatible; the TS-x96 series is on the HCL.  I have TS-469 QNAPs.

Would the iSCSI issue cause the host to be unresponsive to a ping?  

Maybe I could use NFS datastores instead.
 
LVL 121
ID: 40230702
Would the iSCSI issue cause the host to be unresponsive to a ping?

Yes.

ESXi 5.0.0 build 768111 is the GA release of ESXi 5.0; it has been updated many times since.

I would update to the latest and last version of ESXi 5.0, which is Update 3.

I checked the VMware HCL for the TS-469 QNAPs: iSCSI is only listed for U2 and U3, not U0, which is what you are using.

NAS (NFS) is listed for 5.0.

I would recommend updating your version of ESXi 5.0 from U0 to at least U3, to get ALL the benefits of the bug fixes, and certification for your iSCSI SAN.

 
LVL 3

Author Comment

by:IKtech
ID: 40231874
I am assuming I don't need to upgrade vCenter Server, or maybe it is necessary with U3?

Thanks!!
 
LVL 121
ID: 40231897
Ideally, you would update both together, vCenter Server first, and then ESXi.
 
LVL 3

Author Comment

by:IKtech
ID: 40270428
This has happened again over the weekend.  On July 30 when it happened it was host A that stopped responding; over the weekend it was host B that stopped responding.  It seems like whichever host has been up longer has a higher risk of having this issue.

When it happened on 8/16, I went to our datacenter and I was able to log in to the console with no issues.  The first thing I tried was restarting the management agents.  They stopped but hung on starting.  I had to hold the power button and do a hard shutdown and reboot to get it going again.

Hopefully upgrading the software will help, but I thought this info may provide some clues as to what is going on.  Do you have any thoughts, Andrew?  Thanks!!

Maybe I should think about restarting the management agents on a schedule.  Do you think that may help?
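
When a host shows as "not responding", it can help to record which layers still answer before power cycling it. The sketch below is one possible way to do that from another machine: it tries a plain TCP connection to a few ports. The port list (22 for SSH if enabled, 443 for the host's HTTPS management interface, 902 for the vCenter agent traffic) and the host name in the commented usage line are assumptions for illustration, not details taken from this thread.

```python
import socket


def tcp_port_open(host: str, port: int, timeout_s: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe_host(host: str, ports=(22, 443, 902)) -> dict:
    """Map each port to whether it currently accepts TCP connections."""
    return {port: tcp_port_open(host, port) for port in ports}


# Example usage (hypothetical host name):
# print(probe_host("esxi-host-a.example.com"))
```

If every port fails while the local console still works, as described above, the network path or NIC becomes a more likely suspect than the management agents alone.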
 
LVL 121
ID: 40270505
Is your storage stable?

Did the hang happen before or after the upgrade?

Did you update to a supported version?
 
LVL 3

Author Comment

by:IKtech
ID: 40270512
I haven't upgraded yet...

The storage has no errors or disconnects from the other host.  HA rebooted the VMs on the other host the last time this happened.
 
LVL 121
ID: 40270568
Your environment and hosts are *NOT SUPPORTED* on the current version of ESXi.

It's happening because it's NOT SUPPORTED!!!!!!

Therefore it's an untested environment, and therefore anything could happen in production!

This is why the HCL exists, because that is the tested and certified environment.

Again, I would seriously recommend upgrading to an environment verified by VMware and QNAP.

If you raise a support call with VMware or QNAP, this would be the first item on their list to advise on and check.
 
LVL 3

Author Comment

by:IKtech
ID: 40270597
I will upgrade.  No question about that.  

However I just wanted your opinion regarding a scheduled restart of the management services.
 
LVL 121
ID: 40270724
An underlying storage issue, i.e. storage that is not stable, is not going to be cured by restarting the management services.
 
LVL 3

Accepted Solution

by:
IKtech earned 0 total points
ID: 40283726
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2035701

I am using the network adapters mentioned in this article, and they are running an older driver than the one that the article says "resolves the issue".

This seems to fit my situation and I have updated the drivers.  It's a shame that I haven't found this sooner.  I was focused on storage issues and didn't examine other possibilities.  Shame on me for that...
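
For anyone checking the same thing, the comparison of installed and fixed driver versions can be sketched as follows. It assumes the driver details were captured as text with `ethtool -i <nic>`, which prints `key: value` lines; the sample output, the driver name `examplenic`, and both version numbers are made up for illustration and are not the values from the KB article. The version comparison is deliberately simple: it looks only at the numeric fields and ignores any letter suffixes.

```python
import re

# Hypothetical sample of `ethtool -i <nic>` output (values are invented).
SAMPLE_ETHTOOL_OUTPUT = """\
driver: examplenic
version: 1.2.3
firmware-version: 4.5
bus-info: 0000:02:00.0
"""


def parse_ethtool_info(text: str) -> dict:
    """Parse `key: value` lines into a dict, splitting on the first colon only."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def version_key(version: str) -> tuple:
    """Turn a version string into a tuple of its numeric fields for comparison."""
    return tuple(int(n) for n in re.findall(r"\d+", version))


def needs_update(installed: str, fixed: str) -> bool:
    """True if the installed version sorts before the version with the fix."""
    return version_key(installed) < version_key(fixed)


info = parse_ethtool_info(SAMPLE_ETHTOOL_OUTPUT)
# "1.4.0" stands in for whatever version the vendor article names as fixed.
print(info["driver"], info["version"], needs_update(info["version"], "1.4.0"))
```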

Thanks for your help.  I will plan to upgrade as well in the near future.
 
LVL 3

Author Closing Comment

by:IKtech
ID: 40294244
Updated drivers for network adapters.

Question has a verified solution.
