
Expected downtime of a VM in a VMware cluster after simulating a H/W failure.

tp-it-team asked
Hi guys - what the topic says.
I'm preparing my VM env for production and that's one of the things I'm trying.

Without using Fault Tolerance, it takes 15-20 pings for the VM to automatically start on a different host.

I have a pretty beefy setup with 4 powerful hosts and Compellent, so if the above is not right, I would suspect a wrong config rather than a H/W bottleneck.

I'm still new to VMs, so please try to keep it simple.

Thanks

Dirk Mare, Systems Engineer (Acting IT Manager)
Commented:
Remember, if the host "fails" and the VM fails over to a different host, the VM still has to power on on that host:
POST, boot the operating system, etc.

Having a delay is normal, as there is no active state for the server to resume from.

DirkMare

Author

Commented:
Sure, but... I believe it was something like 4-5 pings when it was demo'ed to me... But I could be wrong, I don't remember exactly. Shutting down that VM and powering it back on on the same host actually takes much less time than the failover.
Dirk Mare, Systems Engineer (Acting IT Manager)

Commented:
Open up the console of the VM and simulate the failover. If it's Windows, it could be showing a startup selection that counts down from 30 seconds before booting into Recovery because of the dirty shutdown.

DirkMare
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017
Commented:
Waiting 1 to 2 minutes for the restart is a good metric.
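
To compare a count of lost pings with that 1 to 2 minute figure, the pings have to be converted to seconds. The following is only a rough sketch: it assumes every lost ping costs a fixed amount of time, roughly the ping timeout (commonly 4 seconds per lost reply with the default Windows `ping`, about 1 second between requests with Linux `ping`). The function name and its parameters are made up for illustration.

```python
def estimate_downtime_seconds(lost_pings, seconds_per_lost_ping):
    """Rough outage estimate: lost pings multiplied by an assumed cost per lost ping."""
    return lost_pings * seconds_per_lost_ping


if __name__ == "__main__":
    for lost in (15, 20):
        # Assumption: 4 s per lost ping (Windows-style timeout), 1 s (Linux-style interval)
        windows_style = estimate_downtime_seconds(lost, 4.0)
        linux_style = estimate_downtime_seconds(lost, 1.0)
        print(f"{lost} lost pings: ~{windows_style:.0f} s at 4 s each, ~{linux_style:.0f} s at 1 s each")
```

Under the 4-second assumption, 15-20 lost pings works out to roughly 60-80 seconds, which sits inside the 1 to 2 minute range.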
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant

Commented:
4-5 pings is vMotion, and that's slow!

vMotion and VMware HA are different. If you require faster failover or HA, look at FT, a failover cluster, or other in-VM HA and replication.
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant

Commented:
Also remember it takes time for HA to notice that the host is down.

So, how long after you have killed your host does it take BEFORE it attempts to start the VM protected by HA?
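
One way to answer that is to note three wall-clock times during the test: when the host was killed, when the power-on of the VM is first attempted on another host, and when the VM answers pings again. The sketch below just does the subtraction; the function name, the `HH:MM:SS` input format and the sample times are assumptions for illustration, not measured values.

```python
from datetime import datetime


def failover_breakdown(host_killed, restart_attempted, vm_responding):
    """Split a failover test into detection time and boot time.

    All arguments are "HH:MM:SS" strings taken on the same day
    (a test that crosses midnight is not handled here).
    """
    fmt = "%H:%M:%S"
    t_kill, t_restart, t_up = (
        datetime.strptime(t, fmt)
        for t in (host_killed, restart_attempted, vm_responding)
    )
    return {
        # Time until HA reacts and tries to power the VM on elsewhere
        "detection_s": (t_restart - t_kill).total_seconds(),
        # Time for the VM to power on, boot and answer pings again
        "boot_s": (t_up - t_restart).total_seconds(),
        "total_s": (t_up - t_kill).total_seconds(),
    }


if __name__ == "__main__":
    # Hypothetical timestamps, replace with the ones from your own test
    print(failover_breakdown("10:00:00", "10:00:40", "10:01:25"))
```

If most of the total sits in `detection_s`, the delay comes from HA noticing the failure; if it sits in `boot_s`, look at the guest's startup instead.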
specialist Mohamed, Support Engineer
Commented:
Agree with Andrew on this.
If we are talking about vMotion, we should lose no more than about half a dozen pings in a well-functioning setup.

There are conditions that HA has to check before it restarts the VMs from a specific host.
By the time it has checked the datastore heartbeats, checked the Network Isolation response and decided to restart the VMs, a few pings will already have been lost (by your way of counting).
Then the VM has to be registered on a different host. If there are stale locks held by the host that went down, it takes a bit longer for HA to trigger the re-register process successfully.
The VM's boot time should also be taken into account in the case of HA.

Note: HA "reboots" the VM on a different host.
Andrew Hancock (VMware vExpert / EE Fellow), VMware and Virtualization Consultant

Commented:
HA does not reboot the VM on another host; it's a COLD START-UP.

i.e. a power-on.
Dirk Mare, Systems Engineer (Acting IT Manager)

Commented:
More than enough information has been given to answer the author's question.

DirkMare