• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 5710

ESXi 4.1 HA agent in cluster has an error: cannot complete HA configuration

Hello,

I too am having an issue with one of my ESX hosts. I am running 2 physical servers running ESXi 4.1. They were both disconnected yesterday, but today ESX 2 came back up.

My problem still lies with ESX 1. We can't vMotion the servers over since they are disconnected, and we would prefer not to restart the host because that would mean an email and phone outage.

Above the summary, I get the error: HA Agent in cluster has an error: cannot complete the HA configuration.

I've tried restarting the management services through the console. I have also tried unchecking HA on the cluster and re-enabling it. This has been happening for at least 2 months that I know of, but the host has never been down this long. Any more ideas on what I can do before rebooting the server?
Joshua_M
Asked:
2 Solutions
 
Paul Solovyovsky, Senior IT Advisor, Commented:
Sounds like a DNS issue. Make sure you can ping vCenter and the hosts by FQDN from the SSH CLI on each host, and that you can ping the hosts by FQDN from vCenter.
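
A minimal sketch of that check for the ESXi side, assuming a POSIX-style shell with a `ping` that accepts `-c` (on a Windows vCenter server the equivalent flag is `-n`). The function names `check_fqdn` and `check_all` and the example host names are hypothetical, not from the thread.

```shell
#!/bin/sh
# Sketch: confirm that each fully qualified name resolves and answers one ping.
# check_fqdn and check_all are illustrative helper names.

check_fqdn() {
    # $1 = fully qualified name to test
    if ping -c 1 "$1" >/dev/null 2>&1; then
        echo "OK   $1"
    else
        echo "FAIL $1"
        return 1
    fi
}

check_all() {
    # Test every name given; return non-zero if any of them failed.
    failed=0
    for name in "$@"; do
        check_fqdn "$name" || failed=1
    done
    return $failed
}
```

It could be called as `check_all vcenter.example.local esx1.example.local esx2.example.local`, replacing the placeholder names with your own. A name that fails here but answers by IP address points at DNS rather than at the network.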
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Have you tried moving the host out of the cluster?

Disabling HA?

Disconnected host servers are usually cured by restarting the network management agents on the console.

 
Danny McDaniel, Clinical Systems Analyst, Commented:
If one or both of your servers are reporting that they are disconnected, then that should be your immediate focus. HA won't work if the hosts aren't connecting normally.

- Can you connect directly to the disconnected hosts with the vSphere Client?
- Have you noticed any storage problems?
 
quankenyu Commented:
There are 2 main causes of this problem:
1. A DNS name issue
2. The vCenter build version does not match the ESXi build version

Please follow these steps to fix the problem. They are my own notes, and they address the causes above.
Failed to enable HA agent on an ESXi host
1. Try disabling HA at the cluster level: right-click ‘clusterX’, choose “edit setting”, and uncheck HA and DRS.
2. Wait for the disable-HA tasks to complete on all hosts.
3. Go back to the cluster's “edit setting”, check only ‘enable HA’, and click OK. It takes a few minutes for the HA agent to be installed on all ESXi hosts in the cluster.
4. Then enable DRS.
5. If the HA agent has not started on one or more ESXi hosts, right-click the failed host and choose “reconfigure for HA agent”.
6. If the problem is still not fixed, disconnect the ESXi host, connect to it through SSH, and run the following to uninstall the HA agent on the host:
7. /opt/vmware/uninstallers/VMware-vpxa-uninstall.sh and /opt/vmware/uninstallers/VMware-aam-ha-uninstall.sh, then restart the VM management services with /sbin/services.sh restart
8. Then reconnect the ESXi host in vCenter; vCenter will push a new installation of the HA agent.
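
Steps 6 and 7 can be sketched as a small script to run on the ESXi host over SSH, after the host has been disconnected in vCenter. This is a sketch under assumptions: the paths are the ones quoted in the steps, the plain `restart` argument to `services.sh` is assumed, and the `run` and `reset_ha_agent` helpers and the `DRY_RUN` switch are hypothetical additions for illustration.

```shell
#!/bin/sh
# Sketch of steps 6-7: uninstall the vpxa and HA agents, then restart the
# management services. Confirm the paths exist on your build before running.

UNINSTALL_DIR=/opt/vmware/uninstallers

run() {
    # With DRY_RUN=1 only print the command; otherwise execute it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reset_ha_agent() {
    # Stop at the first uninstaller that fails.
    run "$UNINSTALL_DIR/VMware-vpxa-uninstall.sh" || return 1
    run "$UNINSTALL_DIR/VMware-aam-ha-uninstall.sh" || return 1
    run /sbin/services.sh restart
}
```

Setting `DRY_RUN=1` before calling `reset_ha_agent` prints the three commands without touching the host, which is a cheap way to check the paths first. Reconnecting the host in vCenter (step 8) is still done from the client.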
 
quankenyu Commented:
You don't need to reboot it. Just disconnect the ESXi host and run these commands from an SSH session (using an SSH client such as putty.exe):
/opt/vmware/uninstallers/VMware-vpxa-uninstall.sh
/opt/vmware/uninstallers/VMware-aam-ha-uninstall.sh
and
/sbin/services.sh restart
to restart the VM management services.
While the host is disconnected, all your VM guests will show as "disconnected", but they are still running just fine. You can RDP to those disconnected VM guests with no issues.

I believe the cause in my case was that I updated ESXi to 4.1U1 while my vCenter was still at 4.1U0. After I updated vCenter to 4.1U1, I had to run those commands on 5 of my 8 ESXi hosts to fix the problem.
 
Joshua_M, Author, Commented:
Negative. I cannot do anything in vCenter with any of the servers on the problem host, which all show as disconnected. The only way to connect to them is via RDP.

Quankenyu,

I will try your steps and see how that works out.
 
Danny McDaniel, Clinical Systems Analyst, Commented:
If you can't connect directly to the host with the vSphere Client, then quankenyu's steps won't help. They only uninstall the components for the vCenter agent and HA, then restart the management agent. The management agent is what the vSphere Client needs running so it can connect to the host.

Do you see any messages on the console when you go into troubleshooting mode?
 
Paul Solovyovsky, Senior IT Advisor, Commented:
Check the licenses for vCenter and the hosts.
 
quankenyu Commented:
If you can still ping the ESXi host, please verify the build versions of your vCenter and ESXi host. A mismatch can cause this, especially when the ESXi build is newer than vCenter (for example, the ESXi host has been upgraded to U1 but vCenter is still on U0). I saw this happen in my test lab.
You could also install a newer version of the VI Client on another workstation, try connecting directly to the ESXi host as root, and see how that goes.
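
To make the build check easier to record, here is a rough sketch that extracts the version and build number from the line printed by `vmware -v` on the host. It assumes that line has the form "VMware ESXi <version> build-<number>"; the helper names `host_version` and `host_build` are hypothetical, and the sample line uses a made-up build number.

```shell
#!/bin/sh
# Sketch: pull the version and build number out of a `vmware -v` style line.

host_version() {
    # Prints the dotted version that precedes " build-", e.g. 4.1.0
    echo "$1" | sed -n 's/.* \([0-9][0-9.]*\) build-.*/\1/p'
}

host_build() {
    # Prints the digits that follow "build-"
    echo "$1" | sed -n 's/.*build-\([0-9][0-9]*\).*/\1/p'
}

# Made-up sample line; on the host you would pass "$(vmware -v)" instead.
sample="VMware ESXi 4.1.0 build-123456"
echo "version=$(host_version "$sample") build=$(host_build "$sample")"
```

The vCenter and ESXi build numbers are generally not expected to be numerically equal, so the useful comparison is the update level (U0, U1, and so on) that each build maps to in VMware's release notes, not the raw numbers.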
Question has a verified solution.
