Solved

Unable to configure HA

Posted on 2010-09-13
Last Modified: 2012-05-10
Hi, I am unable to configure HA on one particular host.

There are 3 hosts in my environment: esxsvr1, esxsvr2, and esxsvr3. HA configures properly on esxsvr2 and esxsvr3, but it always fails to configure on esxsvr1.

I uninstalled all the agents, disconnected the host, and re-added it to the cluster, but HA still fails to configure.

All the DNS settings are correct.
Question by:Sandu_vmware
11 Comments
 
LVL 28

Expert Comment

by:bgoering
ID: 33666025
I have had that issue before and what has worked for me is to unconfigure, then reconfigure HA on the cluster - then all the hosts came online.
 
LVL 57

Expert Comment

by:Pete Long
ID: 33666823
Also check the hostname entries on esxsvr1; they are case-sensitive: http://www.petenetlive.com/KB/Article/0000276.htm


Pete
www.petenetlive.com
 
LVL 16

Expert Comment

by:danm66
ID: 33669377
to follow on with what the others have said...

on each host run the following commands and make sure the output is correct:
hostname
hostname -s
hostname -i
cat /etc/hosts (make sure that 127.0.0.1 localhost entry is there, as well as the correct hostname/IP)
ping vcname
ping vcname.domain.name
ping esxsvrX
ping esxsvrX.domain.name (replace X with the other two hosts number)
df -h (make sure plenty of disk space for file systems especially / and /var/log)
esxcfg-vswitch -l (make sure that VM portgroup names are all the same from host to host)
esxcfg-vswif -l (make sure that all hosts have same subnet mask and are on same subnet)

If none of that is the cause, let me know at what percentage it fails when you try to reconfigure for HA.
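The /etc/hosts part of the checklist above can be sketched as a small shell function. This is only an illustration: `check_hosts_file` is a hypothetical helper name, not a VMware tool, and it assumes a standard hosts file layout (IP address first, then one or more names).

```shell
#!/bin/sh
# check_hosts_file FILE NAME  (hypothetical helper, not part of ESX)
# Succeeds only if FILE contains a "127.0.0.1 localhost" entry and a
# non-comment line listing NAME with exactly the same spelling and case.
check_hosts_file() {
    file=$1
    name=$2
    awk -v name="$name" '
        $1 ~ /^#/ { next }                      # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break            # ignore trailing comments
                if ($1 == "127.0.0.1" && $i == "localhost") has_local = 1
                if ($i == name) has_name = 1    # case-sensitive comparison
            }
        }
        END { if (has_local && has_name) exit 0; exit 1 }
    ' "$file"
}

# Example use on a host (not run automatically):
#   check_hosts_file /etc/hosts "$(hostname -s)" || echo "check /etc/hosts"
```

Because the comparison is case-sensitive, a name that differs only in case, or a transposed name, makes the function fail.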
 
LVL 25

Expert Comment

by:madunix
ID: 33669426
Are you able to connect to each host via DNS?
Check the /etc/hosts and /etc/resolv.conf files to ensure that ....
 
LVL 28

Expert Comment

by:bgoering
ID: 33672922
Another thing to look at is the amount of memory allocated to the service console. Sometimes when running DRS and HA the default just isn't enough. To check, go to the vSphere client, select your ESX host, go to the Configuration tab, and click the link for Memory. I set mine to the max of 800 MB; if yours isn't at 800, click the Properties link to change it. A system restart is required for any change to take effect.

Good Luck
 

Expert Comment

by:rccg94
ID: 33673519
Hi,

Not sure if this applies to you but one thing that has stung me in the past with setting up HA is the default gateway settings.  Generally, we do not route our ESXi host IPs - they are on a private subnet/network.  If the host cannot PING the default gateway address, HA will not complete.  It does not have to be able to actually get out, it just needs the ping to respond.  So if you have no gateway defined on that particular host (or a dummy one, like 10.10.10.1 , etc), set the gateway to an address that will actually reply and see what you get.
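A quick way to test this from a shell on the host is to read the default gateway out of the routing table and ping it. The sketch below assumes Linux-style `route -n` output, as on the classic ESX service console; on ESXi the command to list routes differs. `default_gateway` is a hypothetical helper name.

```shell
#!/bin/sh
# default_gateway  (hypothetical helper)
# Reads `route -n` style output on stdin and prints the gateway of the
# default route, i.e. the first line whose destination is 0.0.0.0.
default_gateway() {
    awk '$1 == "0.0.0.0" { print $2; exit }'
}

# Example use on a host (not run automatically):
#   gw=$(route -n | default_gateway)
#   [ -n "$gw" ] && ping -c 3 "$gw"
```

If the function prints nothing, no default gateway is defined; if the ping gets no reply, the gateway address is one that HA cannot reach.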

Good luck!
 
LVL 25

Expert Comment

by:madunix
ID: 33674182
I also had the same issue with the gateway. As said above, check the gateway.
 
LVL 3

Expert Comment

by:michelkeus
ID: 33682596
Indeed if you check your gateway and it does not allow for ICMP ECHO requests then you might be helped by setting the das.isolationaddress to an IP that does allow ICMP ECHO requests. If you set das.isolationaddress it allows for an extra check.

In VC inventory section right click on cluster >> edit settings >> VMware HA >> Click the button that says Advanced Options >> select an empty box and manually type das.isolationaddress and put the IP address in as a value.
 

Author Comment

by:Sandu_vmware
ID: 33687868
Hey, thank you all for your quick responses.

Below is the error message that I observed in /var/log/vmware/aam/aam_config_util_addnode.log
FULLTIME_SITES_TID 00000005
+ 1:8042,8042,8043 esxsvr1    vmware #FT_Agent_Port=8045
+ 2:8042,8042,8043 esxsvr2 vmware
+ 3:8042,8042,8043 exssvr3 vmware
09/14/10 14:05:32 [vpxa_respond        ] VMwareerrortext=Internal AAM Error - agent could not start.
09/14/10 14:05:32 [vpxa_respond        ] VMwareerrorcat=internalerror
09/14/10 14:05:32 [myexit              ] copying /etc/opt/vmware/aam/vmware-sites to /var/log/vmware/aam/aam_config_util_addnode.log
FULLTIME_SITES_TID 00000005
+ 1:8042,8042,8043 esxsvr1    vmware #/
+ 2:8042,8042,8043 esxsvr2 vmware
+ 3:8042,8042,8043 exssvr3 vmware
09/14/10 14:05:32 [myexit              ] Failure location:
09/14/10 14:05:32 [myexit              ]        function main::myexit called from line 2306
09/14/10 14:05:32 [myexit              ]        function main::start_agent called from line 1238
09/14/10 14:05:32 [myexit              ]        function main::add_aam_node called from line 210
09/14/10 14:05:32 [myexit              ] VMwareresult=failure
09/14/10 14:05:32 [elapsed_time        ] Total time for script to complete:  6 minute(s) and 17 second(s)



Any update on this, please?
 

Accepted Solution

by:Sandu_vmware
ID: 33722063
Hello Friends,

I got HA configured successfully.

There was actually a wrong entry in the /etc/hosts file for esxsvr3.

It was registered as exssvr3, which I found using Wireshark. Even after I corrected the hosts file entry, the issue persisted.

So I changed the hostname at the kernel level on esxsvr3 so that all the files were corrected:

#sysctl -w kernel.hostname=<NEW FQDN NAME>

and rebooted the host.

Hurray, the HA configuration now completes without errors.
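The misspelled name is also visible in the node list from the log excerpt posted earlier in this thread ("+ 3:8042,8042,8043 exssvr3 vmware"). A minimal sketch for spotting such a typo, assuming the node list keeps the format shown in that excerpt; `unknown_nodes` is a hypothetical helper name:

```shell
#!/bin/sh
# unknown_nodes FILE NAME...  (hypothetical helper)
# Prints every node name found in an AAM-style node list (lines such as
# "+ 3:8042,8042,8043 exssvr3 vmware") that is not among the expected
# host names passed as the remaining arguments.
unknown_nodes() {
    file=$1
    shift
    expected=" $* "
    awk '$1 == "+" { print $3 }' "$file" | sort -u | while read -r node; do
        case "$expected" in
            *" $node "*) ;;          # name is in the expected list
            *) echo "$node" ;;       # unexpected or misspelled name
        esac
    done
}

# Example use on a host (not run automatically):
#   unknown_nodes /etc/opt/vmware/aam/vmware-sites esxsvr1 esxsvr2 esxsvr3
```

Run against the node list from the log above with the three real host names, this prints the single unexpected name, exssvr3.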

 
LVL 16

Expert Comment

by:danm66
ID: 33722392
Had you followed my advice in comment #3, I believe you would have found the cause and saved yourself some time.