Beacon probing is a configurable network failure detection mechanism used by ESX to identify downstream network failures. The purpose of this article is to dispel some of the mystery and clarify a commonly misunderstood subject. The information in this article was gathered through direct observation and discussions with VMware.
As opposed to “Link Status Only,” beacon probing can identify a downstream failure. In the event that a failure does not cause a link-down event or if the link-down state is not forwarded to the ESX host, beacon probing can identify and compensate for network failure. Beacon probing works in conjunction with link status, and link-state-down will still trigger a failover if beacon probing is enabled.
Beacon probing identifies failures by sending out and listening for broadcast packets of a specific type. ESX 4.1 uses ethertype 0x8922. ESX 4.0 uses an 802.3 frame, which Wireshark displays as an LLC frame with “BCN (0xFF)” in the control field. ESX 3.5 uses an ethertype of 0x05ff. The probes contain the virtual MAC associated with the physical NIC and the name of the interface.
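To spot these probes in a packet capture, the three signatures above can be checked directly against the raw Ethernet header. What follows is a minimal Python sketch, assuming an untagged frame (no 802.1Q header) and using only the values quoted in this article. The function name and the returned labels are illustrative and not something VMware provides.

```python
import struct
from typing import Optional

# Values quoted in the article; the labels are illustrative only.
ETHERTYPE_ESX_41 = 0x8922
ETHERTYPE_ESX_35 = 0x05FF
LLC_CONTROL_BCN = 0xFF  # shown by Wireshark as "BCN (0xFF)"


def classify_beacon(frame: bytes) -> Optional[str]:
    """Return a guess at the ESX version that sent a beacon, or None.

    Assumes an untagged Ethernet frame: 6-byte destination MAC,
    6-byte source MAC, then the 2-byte type/length field.
    """
    if len(frame) < 14:
        return None
    (type_or_length,) = struct.unpack("!H", frame[12:14])
    if type_or_length == ETHERTYPE_ESX_41:
        return "ESX 4.1"
    if type_or_length == ETHERTYPE_ESX_35:
        return "ESX 3.5"
    # An 802.3 frame carries a length (at most 1500) in this field,
    # followed by the LLC header: DSAP, SSAP, control.
    if type_or_length <= 1500 and len(frame) >= 17 and frame[16] == LLC_CONTROL_BCN:
        return "ESX 4.0"
    return None
```

A match here is only a hint, since other traffic could in principle carry the same values. The probe payload (virtual MAC and interface name) is the better confirmation.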
When virtual switch tagging is used, beacon probing sends one packet per host, per second, per in-use VLAN. In other words, the probes are sent down every VLAN on which there are virtual machines (this includes the ESX service console). In the case of two interfaces, each interface will send one probe every other second. A failure is identified if 3 consecutive packets are not received on an uplink. With probes expected every 2 seconds in that two-interface case, a total of 6 seconds will pass before an uplink is identified as down. As a side note, the maximum number of probes and VLANs to probe can be configured via vCenter under Software > Advanced Settings > Net on the host configuration tab.
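The 6-second figure can be reproduced with a small counter. This Python sketch only models the rule described above (3 consecutive missed probes, one probe expected every 2 seconds in the two-interface case). It is not ESX code, and the function and parameter names are made up for illustration.

```python
from typing import Iterable, Optional


def time_to_detect(
    received: Iterable[bool],
    interval_s: float = 2.0,  # two-interface case: one probe every other second
    threshold: int = 3,       # consecutive misses before the uplink is marked down
) -> Optional[float]:
    """Return the seconds elapsed when the uplink is declared down, or None.

    `received` holds one entry per expected probe: True if the probe
    arrived, False if it was missed.
    """
    missed = 0
    for slot, ok in enumerate(received, start=1):
        # Any received probe resets the count of consecutive misses.
        missed = 0 if ok else missed + 1
        if missed >= threshold:
            return slot * interval_s
    return None
```

For example, `time_to_detect([False, False, False])` gives 6.0, while a single probe arriving in between resets the count and delays detection.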
The number of packets can be alarming, especially in a large domain with many ESX hosts and multiple VLANs per host. For example, a single host with 2 physical interfaces and 3 VLANs will send 3 packets per second, 180 per minute, and 10,800 per hour. This adds up quickly in a domain with several hundred hosts.
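The arithmetic is easy to scale to a whole environment. The Python sketch below assumes the rate described above (one probe per host, per second, per in-use VLAN) and, as a simplification, that every host carries the same number of VLANs. The function name is illustrative.

```python
def beacon_volume(vlans_per_host: int, hosts: int = 1) -> dict:
    """Estimate beacon probe counts for a group of identical hosts.

    Assumes one probe per host, per second, per in-use VLAN.
    """
    per_second = vlans_per_host * hosts
    return {
        "per_second": per_second,
        "per_minute": per_second * 60,
        "per_hour": per_second * 3600,
    }
```

`beacon_volume(3)` reproduces the single-host figures of 3, 180, and 10,800. With a hypothetical 300 such hosts, `beacon_volume(3, hosts=300)` comes to 900 probes per second, or 3,240,000 per hour.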
VMware recommends using beacon probing for configurations with three or more network interfaces. Three or more interfaces allow ESX to identify the leg that is down. When using only 2 interfaces, beacon probing cannot pinpoint the outage. When a failure occurs in this scenario, ESX will enter “shotgun” mode and send all traffic down both legs.
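The reason a third interface helps can be shown with a toy model. The Python sketch below is only an illustration of the reasoning and not VMware's actual algorithm. It treats an uplink as suspect when it hears no beacons from any peer. The `vmnic` names and both function names are examples.

```python
from typing import Dict, Set


def suspect_uplinks(heard: Dict[str, Set[str]]) -> Set[str]:
    """Return the uplinks that received no beacons from any peer.

    heard[nic] is the set of peer uplinks whose beacons `nic` received.
    """
    return {nic for nic, peers in heard.items() if not (peers - {nic})}


def can_pinpoint(heard: Dict[str, Set[str]]) -> bool:
    """True if some, but not all, uplinks are suspect."""
    suspects = suspect_uplinks(heard)
    return 0 < len(suspects) < len(heard)


# Three uplinks, vmnic2 isolated: the other two still hear each other.
three = {"vmnic0": {"vmnic1"}, "vmnic1": {"vmnic0"}, "vmnic2": set()}

# Two uplinks, one path broken: neither hears the other, so both look bad.
two = {"vmnic0": set(), "vmnic1": set()}
```

In this model `can_pinpoint(three)` is true and the suspect is `vmnic2`, while `can_pinpoint(two)` is false because both uplinks are equally suspect. That is the ambiguity behind the “shotgun” behavior.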
I hope that this will help dispense with some of the conjecture and mystery surrounding beacon probing. I have not had the opportunity to test its behavior with VGT configured, but I expect similar results.