• Status: Solved
  • Priority: Medium

High Availability and failover

I would like to know the time interval allowed between heartbeats sent by an ESX host before failover kicks in.

In other words, if the ESX hosts have not heard from one of the other ESX hosts in the cluster for a certain period of time, they can declare it down and start restarting the VMs that resided on the defunct host from the other hosts.

I also want to know whether a network outage at the location of one or two of the ESX hosts would initiate a reboot of the VMs on the other hosts.
I know this is very rare because there is switch redundancy, but it can happen.

Thanks
Asked by: jskfan
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
I would like to know the time interval allowed between heartbeats sent by an ESX host before failover kicks in.

In other words, if the ESX hosts have not heard from one of the other ESX hosts in the cluster for a certain period of time, they can declare it down and start restarting the VMs that resided on the defunct host from the other hosts.

The time interval is 10 seconds. These values can be changed, but VMware recommends the defaults.

I also want to know whether a network outage at the location of one or two of the ESX hosts would initiate a reboot of the VMs on the other hosts.
I know this is very rare because there is switch redundancy, but it can happen.

Yes, this can happen, because VMware HA and the ESXi servers are checking each other and checking that they can reach the default gateway.
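To make the timing idea concrete, here is a minimal sketch of timeout-based heartbeat monitoring. It is not VMware's implementation: the `HeartbeatMonitor` class, its method names, and the host names are hypothetical, and the 10-second default is simply the figure quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Toy heartbeat tracker; not the real vSphere HA (FDM) agent."""

    # Assumption: 10 seconds, taken from the answer above, not from any VMware API.
    timeout_s: float = 10.0
    last_seen: dict = field(default_factory=dict)

    def record(self, host: str, now: float) -> None:
        """Remember the time (in seconds) at which a heartbeat arrived from host."""
        self.last_seen[host] = now

    def silent_hosts(self, now: float) -> list:
        """Return hosts whose last heartbeat is older than the timeout."""
        return sorted(
            host
            for host, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor()
monitor.record("esx1", 0.0)
monitor.record("esx2", 5.0)
print(monitor.silent_hosts(11.0))
```

A host that shows up in `silent_hosts` is only a candidate for failover here; further checks still have to decide whether it has actually failed.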
 
jskfan (Author) commented:
10 seconds sounds too short to me. I believe this could cause unnecessary reboots of VMs.
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Only if your networking is that poor.

Are your network and physical switches likely to be unavailable for 10 seconds?
 
jskfan (Author) commented:
I got the following paragraph from VMware:

=========
Network Isolation Addresses
A network isolation address is an IP address that is pinged to determine whether a host is isolated from the network. This address is pinged only when a host has stopped receiving heartbeats from all other hosts in the cluster. If a host can ping its network isolation address, the host is not network isolated, and the other hosts in the cluster have failed. However, if the host cannot ping its isolation address, it is likely that the host has become isolated from the network and no failover action is taken.
By default, the network isolation address is the default gateway for the host. Only one default gateway is specified, regardless of how many management networks have been defined. You should use the das.isolationaddress[...] advanced attribute to add isolation addresses for additional networks. See vSphere HA Advanced Attributes.
=================

What I do not understand is what happens when the host can or cannot ping its default gateway.
I know most environments do not dedicate a separate network isolation address, since the default gateway address can be used.
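The decision described in the quoted VMware paragraph can be sketched roughly as follows. This is only an illustration of the logic as worded there; `HostState`, `classify_host`, and the `ping_isolation_address` callable are hypothetical names, not part of any VMware API.

```python
from enum import Enum
from typing import Callable


class HostState(Enum):
    CONNECTED = "still receiving heartbeats from other hosts"
    OTHERS_FAILED = "not isolated; the other hosts have failed"
    ISOLATED = "likely isolated from the network"


def classify_host(
    receiving_heartbeats: bool,
    ping_isolation_address: Callable[[], bool],
) -> HostState:
    """Classify a host from its own point of view (simplified sketch)."""
    if receiving_heartbeats:
        # The isolation address is pinged only once heartbeats have stopped.
        return HostState.CONNECTED
    if ping_isolation_address():
        # The address (by default the gateway) answers, so this host still
        # has network connectivity and the silent hosts are the problem.
        return HostState.OTHERS_FAILED
    # Neither heartbeats nor the isolation address: this host is likely cut off.
    return HostState.ISOLATED


print(classify_host(False, lambda: True).name)
print(classify_host(False, lambda: False).name)
```

So, in the wording of the excerpt, a reachable gateway means "I am fine, the others are down", while an unreachable gateway means "I am probably the one that is cut off".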
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
If the gateway is not reachable, a worker process is started to determine whether to start an HA failover.

In other words, VMware HA starts the process of deciding whether to initiate a failover.

For example, a workflow is started: it starts writing to datastores and checks whether all hosts in the cluster are contactable. It does not just say, "Oh, I cannot ping the gateway, therefore I must fail over now!"

Different isolation addresses can be used. Normally the default gateway is used, because it should always be available in your network.
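As a rough illustration of that "several checks before failover" idea, the sketch below only treats a host as dead when every source of evidence agrees. The `HostEvidence` fields, the `assess` function, and the verdict strings are all hypothetical and heavily simplified; the real HA agent uses more states and timers than shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostEvidence:
    """Hypothetical signals gathered about one silent host."""

    network_heartbeat: bool    # heartbeats still arriving over the network
    datastore_heartbeat: bool  # host is still updating a shared datastore
    ping_response: bool        # host answers a direct ping


def assess(evidence: HostEvidence) -> str:
    """Return a verdict; only 'dead' would justify restarting VMs elsewhere."""
    if evidence.network_heartbeat:
        return "alive"
    if evidence.datastore_heartbeat:
        # Still writing to storage, so it is running but cut off from the network.
        return "isolated or partitioned"
    if evidence.ping_response:
        return "unreachable agent"
    return "dead"


print(assess(HostEvidence(False, True, False)))
print(assess(HostEvidence(False, False, False)))
```

The point of the ordering is that a single failed check, such as one missed ping, never produces the "dead" verdict on its own.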
 
jskfan (Author) commented:
<<However, if the host cannot ping its isolation address, it is likely that the host has become isolated from the network and no failover action is taken.>>
If you read the above excerpt from VMware, the way they state it is that no failover action is taken when the host cannot ping the isolation address.
However, the way I understand it, a failover will indeed be initiated when the host cannot ping the isolation address (assuming we are using only the default gateway).
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
Many intervals and timings are used to determine if and when to initiate a VMware HA failover.

(You also have both host failure and VM failure and restart to consider.)

The HA agents on the servers also check that they can reach the master and slave HA agents (FDM agents).
 
jskfan (Author) commented:
Thank you