Solved

VMware vSphere HA configuration

Posted on 2009-07-15
Last Modified: 2012-05-07
We have two ESX 4.0 servers (identical HP DL380 G5s) connected to a shared SAS datastore, with three VMs configured. I created a cluster and turned on HA. Everything reports as fine (hosts respond to ping, DNS is verified, no errors). VMotion works, and I can migrate a VM from one machine to the other. But when I test HA (unplug the NICs), the VMs do not restart on the other host as expected. I have walked through every HA guide I can find (created the HA-enabled cluster first and then added hosts to it). The only thing I see is that at the point the server goes offline, vCenter records "HA agent has an error: HA agent has failed". This is the point at which I would expect the VMs to fail over. Any ideas?
Question by:TPolk
 
LVL 19

Expert Comment

by:vmwarun - Arun
What setting have you configured for Host Isolation response?
 
LVL 32

Expert Comment

by:nappy_d
Have you configured your guests to start up on another host?

Do you have enough RAM to support all your guests running on one host?

What is the constraint setting for your HA cluster?
 
LVL 24

Expert Comment

by:ryder0707
By the way, this is not a new issue; it has happened since 3.x.

You can try removing all hosts from the cluster and recreating it. Then every ESX host and the vCenter server must have its hosts file updated to include the entries below:

- Loopback, always 127.0.0.1 localhost.localdomain localhost
- Local Server IP, FQDN, shortname
- Local Server console IP and <hostname>-cons
- Local Server VMotion IP Address, <hostname>-vmotion
- VirtualCenter Server IP Address, FQDN, shortname
- IP Address and DNS for all hosts in the same HA/DRS configuration
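
As a rough illustration of the list above, the entries could be rendered into hosts-file text like this. It is only a sketch: every IP address and host name below is a hypothetical placeholder (192.0.2.x is a documentation range), and `render_hosts` is a made-up helper, not a VMware tool.

```python
def render_hosts(entries):
    """Return hosts-file text: the loopback line first, then one line per entry."""
    lines = ["127.0.0.1\tlocalhost.localdomain localhost"]
    for ip, names in entries:
        lines.append(f"{ip}\t{' '.join(names)}")
    return "\n".join(lines) + "\n"


# Hypothetical addresses and names, for illustration only.
EXAMPLE_ENTRIES = [
    ("192.0.2.11", ["esx01.example.com", "esx01"]),     # local server IP, FQDN, shortname
    ("192.0.2.21", ["esx01-cons"]),                     # service console IP
    ("192.0.2.31", ["esx01-vmotion"]),                  # VMotion IP
    ("192.0.2.5", ["vcenter.example.com", "vcenter"]),  # VirtualCenter server
    ("192.0.2.12", ["esx02.example.com", "esx02"]),     # other host in the same cluster
]

if __name__ == "__main__":
    print(render_hosts(EXAMPLE_ENTRIES), end="")
```

The same entry list would be reused on each host, with the local server lines swapped for that host's own addresses.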

Also make sure the HA cluster uses the settings below (this is the standard in the environments I usually support):

Number of host failures the cluster can tolerate: 1
Allow VMs to be powered on even if they violate availability constraints: Enabled
VM Restart Priority: Low
Host Isolation response: Leave VM powered on
Enable Virtual machine monitoring: Not enabled

Good luck!
 

Author Comment

by:TPolk
The machines are set to "leave powered on". I don't see where to configure a VM to start on another host. I will try the hosts file edit and see what the results are.
 
LVL 32

Expert Comment

by:nappy_d
Check your settings in the properties of your HA cluster. They should look like the images below.

Picture-1.png
Picture-2.png
 

Author Comment

by:TPolk
I verified the hosts file settings, created a new cluster, and set up HA on it with:

Number of host failures the cluster can tolerate: 1 <cannot set this with setting below>
Allow VMs to be powered on even if they violate availability constraints: Enabled
VM Restart Priority: Low
Host Isolation response: Leave VM powered on
Enable Virtual machine monitoring: Not enabled

I rebooted a host (without placing it in maintenance mode) and the VM did NOT restart on the other host. Any other ideas? Is there a good place to look (support log, etc.) to determine why it isn't working?
 

Author Comment

by:TPolk
When the host came back up, the VM did restart (but it waited until the host was back online). We have moved VMs around with VMotion and that works fine.
 
LVL 32

Expert Comment

by:nappy_d
What messages are in your logs? Do you have any exclamation marks appearing in your VI client for the ESX hosts?

Look at your event logs.
 
LVL 32

Expert Comment

by:nappy_d
Also try enabling virtual machine monitoring for HA. Set it to low and test again.
 

Author Comment

by:TPolk
Nothing shows up as an error (other than a note that we don't have a redundant management NIC). The only thing that shows up is at the point of failure (host is offline): a message that says "HA agent has error: HA agent has failed". Is there any particular log to look in? We have tried VM monitoring both on and off, but it made no difference.
 
LVL 32

Expert Comment

by:nappy_d
Is there anything more to that error message? Is "HA agent has error: HA agent has failed" the full and complete message?
 
LVL 32

Expert Comment

by:nappy_d
Try these steps: http://www.no-x.org/?p=155
 

Author Comment

by:TPolk
Nothing more than that error message.

The referenced steps didn't work (we have ESXi, so no full service console), but I found a similar link that uses uninstall scripts.

(from the tech support console)

The scripts can be found in /opt/vmware/uninstallers.
To get there:

# cd /opt/vmware/uninstallers

Get a directory listing:
# ls
-rwxr-xr-x 1 root root 857 VMware-aam-ha-uninstall.sh
-rwxr-xr-x 1 root root 434 VMware-vpxa-uninstall.sh

To run these scripts:

./VMware-aam-ha-uninstall.sh
./VMware-vpxa-uninstall.sh

The agents are now removed, so redo the HA config for the cluster.

After these steps I set HA up again and retested, but got the same result.
 
LVL 32

Expert Comment

by:nappy_d
Have you purchased vCenter? If so, it comes with some support from VMware.
 
LVL 24

Expert Comment

by:ryder0707
Now is probably the time to engage VMware support.
 
LVL 32

Expert Comment

by:nappy_d
I concur. As I mentioned previously, you will have purchased HA with some version of vCenter. If you did so in the past 30 days, you are entitled to some technical support.
 

Author Comment

by:TPolk
Yes, we have VMware support and I think it is time to engage them. I'll update after we resolve this (maybe we missed something).
 
LVL 19

Expert Comment

by:vmwarun - Arun
Have you resolved the HA issue?
 
LVL 24

Expert Comment

by:ryder0707
Yeah, I'm curious to know what the actual problem is.
 

Author Comment

by:TPolk
Currently at level 3 VMware support. They think it is something environmental, but no answer yet.
 

Expert Comment

by:shankarvetrivel
The only thing that I see is that at the point the server goes off-line vCenter records "HA agent has an error: HA agent has failed" - this is at the point that I would expect it to migrate.  Any ideas?
When you configure an HA cluster, each ESX host in the cluster sends a heartbeat to the other ESX hosts. If a host's agent heartbeat gets no response for more than 15 seconds, that host is declared failed or isolated from the network.
Please make sure your ESX hosts can reach the service console gateway.
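
The timeout rule described above can be sketched in a few lines. This is only an illustration of the idea, not how the HA agent is implemented: `classify_hosts`, the host names, and the timestamps are all hypothetical, and the 15-second default simply mirrors the figure quoted in this comment.

```python
def classify_hosts(last_heartbeat, now, timeout=15.0):
    """Label each host by the age of its last heartbeat, in seconds.

    A host is flagged only when the heartbeat is MORE than `timeout`
    seconds old, matching the "more than 15 seconds" wording above.
    """
    return {
        host: "failed-or-isolated" if now - seen > timeout else "ok"
        for host, seen in last_heartbeat.items()
    }


if __name__ == "__main__":
    # Hypothetical timestamps (seconds); esx02 was last heard from 21 s ago.
    seen = {"esx01": 100.0, "esx02": 80.0}
    print(classify_hosts(seen, now=101.0))
```

A host that cannot reach the service console gateway may be the one that looks isolated under this rule, which is why the gateway check matters.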
 
Apologies if my answers are silly.

Thanks
 

Accepted Solution

by:TPolk
Okay, here is the official answer from VMware: there is a bug in the software, and HA will not work if the internal network uses public addresses. These devices are on a 9.19.x.x network (sorry, don't ask, I didn't design it).
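
For anyone wanting to check their own hosts, here is a minimal sketch that tests whether an address falls inside the RFC 1918 private ranges. VMware's answer as relayed here does not say exactly which ranges trigger the bug, so treating "public" as "outside RFC 1918" is an assumption, and `is_rfc1918` is a hypothetical helper name. The sample address 9.19.0.1 is just an arbitrary address in the 9.19.x.x range mentioned above.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if `address` lies in one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


if __name__ == "__main__":
    for addr in ("9.19.0.1", "10.1.2.3", "192.168.1.10"):
        kind = "private (RFC 1918)" if is_rfc1918(addr) else "public"
        print(f"{addr}: {kind}")
```

Explicit networks are used instead of the `is_private` attribute because `is_private` also covers loopback, link-local, and other reserved ranges.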