John Myers
asked on
If anyone has experience with VMware HA, the textbook answer to this question does not work.
Problem: Our customer has a vCenter and ESX hosts in an HA cluster. Twice, the physical switches dropped their links to the rest of the network: the switches stayed online (link status positive) but had no routes, and the ESX hosts entered host isolation mode even though all ESX servers were online. The VMs were likewise unable to communicate on the network for 15 minutes because of the network isolation (split brain). The customer asked us to validate their new configuration and recommend best practice.
Scenario 1: The faulty configuration
vSwitch1
Service Console: VLAN 1 - 10.1.1.1
Uplinks: 2 active uplinks on VLAN 1, connected to separate physical switches
Switches run Spanning Tree
Scenario 2: Alternative
vSwitch1
Service Console: VLAN 1 - 10.1.1.1
Service Console 2: VLAN 2 - 10.1.2.1
Uplinks: 2 active on VLAN 1 and 2 standby on VLAN 2
Switches run Spanning Tree and PortFast has been enabled
Is Scenario 2 the best config?
Most isolation issues are caused by DNS. How were the hosts added to vCenter: by IP address or by FQDN? If by FQDN, do the A records exist in the DNS zone?
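To make the DNS question above concrete, here is a rough Python sketch that checks whether each host's FQDN has an A record and whether the reverse lookup points back to the same name. The function name `check_host_dns` and the host names are made up for illustration; this is not a VMware tool, just a quick sanity check you could run from a machine that uses the same DNS servers as vCenter.

```python
import socket


def check_host_dns(fqdns, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a short status string per FQDN (forward and reverse lookup).

    `forward` and `reverse` default to the system resolver; they are
    parameters only so the check can be exercised without a real DNS server.
    """
    results = {}
    for name in fqdns:
        try:
            ip = forward(name)  # A record lookup
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results[name] = f"no A record ({exc})"
            continue
        try:
            ptr_name = reverse(ip)[0]  # PTR lookup, first element is the host name
        except OSError as exc:  # socket.herror is a subclass of OSError
            results[name] = f"{ip}, no PTR record ({exc})"
            continue
        # Compare case-insensitively and ignore a trailing dot
        if ptr_name.rstrip(".").lower() == name.rstrip(".").lower():
            results[name] = f"{ip}, forward and reverse match"
        else:
            results[name] = f"{ip}, reverse resolves to {ptr_name}"
    return results
```

Calling `check_host_dns(["esx01.example.local", "esx02.example.local"])` with your real host names (the ones shown are placeholders) returns one status line per host; anything other than "forward and reverse match" is worth a look before blaming the network.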
Why did your switches drop the links? If this is the main issue, don't you think you should fix it first?
During this period, if you connect a PC to the switch or to a port on a specific VLAN, can you ping the default gateway for each VLAN?
Btw, both scenarios are fine depending on the network design, and you won't need STP on the ports connected to the ESX hosts, so enable PortFast.
Shouldn't STP be disabled for all ports going to ESX servers?
Also what was the reason for the links being dropped?
If you use Scenario 2, you would need to configure das.isolationaddress, etc.
This is explained here: http://www.yellow-bricks.com/vmware-high-availability-deepdiv/
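As a rough illustration of the advanced options mentioned above, the Python sketch below builds the key/value pairs you would enter under the cluster's HA advanced options, one isolation address per Service Console network. The gateway addresses 10.1.1.254 and 10.1.2.254 and the helper name `build_isolation_options` are assumptions for the example only, and the exact key numbering (das.isolationaddress1, das.isolationaddress2, ...) can differ between VMware versions, so check it against the documentation for your release.

```python
import ipaddress


def build_isolation_options(addresses, use_default_gateway=False):
    """Build HA advanced-option pairs for the given isolation addresses.

    Key numbering starting at 1 is an assumption; verify it for your version.
    """
    if not addresses:
        raise ValueError("at least one isolation address is required")
    options = {
        # Whether HA should also ping the Service Console default gateway
        "das.usedefaultisolationaddress": "true" if use_default_gateway else "false",
    }
    for index, address in enumerate(addresses, start=1):
        ipaddress.ip_address(address)  # raises ValueError on a malformed address
        options[f"das.isolationaddress{index}"] = address
    return options


# Hypothetical gateways for VLAN 1 and VLAN 2 from Scenario 2
for key, value in build_isolation_options(["10.1.1.254", "10.1.2.254"]).items():
    print(f"{key} = {value}")
```

The idea is that each address should be something pingable on its own Service Console network, so that losing one path does not by itself make a host declare itself isolated.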
ASKER
Looks like DNS is the issue. Thank you very much for the quick reply and, most importantly, the resolution.
ASKER
Great work
What kind of switches are you using? Are you sharing the SC pipe with any heavy-traffic connection such as VMotion or IP storage?
Lars