Xeronimo
asked on
vSphere HA virtual machine failover failed?
Hi,
I've upgraded my vCenter appliance to 5.5 and now the VMs of one of my cluster servers indicate this error message ... the VMs on the other two cluster servers are just fine. Any ideas what this is about and how to resolve it?
The actual virtual servers run fine though, they're accessible, etc. It's just that there is this error message in the vCenter client.
Thank you!
What's the HA status on the host housing the VM?
ASKER
hanccocka: all the VMs are started!
dipopo: the HA status is 'running (master)'
Yes, this error message occurs on all versions of VMware HA.
e.g. 4.1, 5.0, 5.1 and 5.5
Reconfigure each host for HA (Reconfigure HA), or disable HA on the cluster and re-enable it (e.g. Edit Settings on the cluster, remove the tick, then tick it again).
ASKER
I've already reconfigured the HA on all the hosts, that didn't remove the error messages. I'll try to disable and reenable the cluster then ...
ASKER
Ok, so now I've disabled and re-enabled the cluster.
The error messages on the VMs from that one host are still there though ... is that normal? Should I simply clear and acknowledge them then?
Yes, it's normal.
Have you checked that HA works for you?
ASKER
Checked? You mean by shutting down one host to see if the others restart the VMs? No, not yet ... I just didn't get any error messages while disabling and re-enabling the cluster ... is that not enough to be sure? ;)
ASKER CERTIFIED SOLUTION
ASKER
Yes, remove power is what I meant ... but I can't test that right now ... those servers would be offline for a couple of minutes ...
If by toggling on/off you mean disabling and re-enabling the cluster option: I've done that, and the servers still have a red flag next to them.
Could you check whether an attempt to vMotion a VM from the host throws up errors?
ASKER
I've migrated one of the VMs to a host where no VMs have this HA failure message ... this VM keeps displaying that error though, even on the new machine! I guess the error message is not a 'live' one, it's simply still displayed because of an earlier error?
I think I'll move all my VMs, except for a test VM, from the problematic host to a different host and then power off the first one and restart it. And then let's see what it says ...
If you're testing VMware HA, just remove the power from the host.
ASKER
I'm obviously new to this cluster thing, but just to be sure: since the VMs are stored on shared storage, will the other servers in the cluster immediately take over the virtual machines from the 'failed' host? Or do the VMs initially crash, and the other servers only then restart them? The former would be preferable and make more sense ...
But what if HA does not work on my problematic server? Then its VMs will surely crash, right? That's why I thought I'd migrate my VMs to a different host and then test HA with one test VM running on it.
Yes, the latter holds true.
The VMs initially crash, and only then do the other servers restart them. Expect 1-5 minutes.
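To make the restart behaviour concrete, here is a toy sketch in Python. It is not VMware code and does not use any vSphere API: the Host class, the fail_host function, the host and VM names, and the "least-loaded host" placement rule are all made up for illustration. It only shows the order of events: the VMs go down with the host, and are then powered on again on a surviving host from shared storage.

```python
# Toy model of an HA restart. NOT the VMware API: every name here is
# hypothetical and the placement rule is a deliberate simplification.
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    alive: bool = True
    vms: list = field(default_factory=list)


def fail_host(cluster, failed_name):
    """Mark one host as failed and restart its VMs on surviving hosts.

    The VMs are not handed over live: they go down with the host and
    are started again elsewhere.
    """
    failed = next(h for h in cluster if h.name == failed_name)
    failed.alive = False
    crashed, failed.vms = failed.vms, []  # VMs go down with the host

    survivors = [h for h in cluster if h.alive]
    if not survivors:
        raise RuntimeError("no surviving host to restart VMs on")

    restarted = {}
    for vm in crashed:
        # Assumption for the sketch: choose the survivor with fewest VMs.
        target = min(survivors, key=lambda h: len(h.vms))
        target.vms.append(vm)
        restarted[vm] = target.name
    return restarted


cluster = [
    Host("esx1", vms=["test-vm"]),
    Host("esx2", vms=["web", "db"]),
    Host("esx3", vms=["mail"]),
]
placement = fail_host(cluster, "esx1")
print(placement)  # {'test-vm': 'esx3'}
```

The sketch ignores the restart delay entirely; on a real cluster that is the 1-5 minute window during which the VMs are down.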
ASKER
Ok, so I've tested it and the HA works! The test VM got restarted on a different host.
Also, all the VMs indicating that HA failure message have "vSphere HA Protection: Protected".
So I guess I can move those VMs back to that server and simply clear the alarms?
Correct. Well done, most people do not test!
ASKER
Ok, and thank you for your help and also your compliment! :)
No problems!
Can you check if all VMs are started on all your hosts?