John Myers

If anyone has experience with VMware HA, the textbook answer to this question does not work.

Problem: Our customer has a vCenter server and ESX hosts in an HA cluster. Twice, the physical switches dropped their links to the network. The switches stayed online (link status positive) but had no routes, and the ESX hosts entered host isolation mode even though all ESX servers were online. The VMs were also unable to communicate on the network for 15 minutes because of the resulting network isolation (split brain). The customer asked us to validate their new configuration and recommend best practice.

Scenario 1 - The faulty configuration
vSwitch1
                Service Console - VLAN 1 - 10.1.1.1
                Uplinks - 2 Active uplinks on VLAN 1 connected to separate physical switches
                Switches run Spanning Tree

Scenario 2 - Alternative
vSwitch1
                Service Console - VLAN 1 - 10.1.1.1
                Service Console 2 - VLAN 2 - 10.1.2.1
                Uplinks - 2 Active on VLAN 1 and 2 Standby on VLAN 2
                Switches run Spanning Tree and PortFast has been enabled

Is Scenario 2 the best config?
larstr

Scenario 1 should be fine, but you should also enable PortFast for reliable performance.

What kind of switches are you using? Are you sharing the SC pipe with any heavy-traffic connection such as VMotion or IP storage?

Lars
Paul Solovyovsky
Most of the time, isolation issues are due to DNS. How were the hosts added to vCenter? Are you using IP addresses or FQDNs for the hosts? If you are using FQDNs, are the A records in the DNS zone?
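The DNS check suggested above can be sketched in a few lines of Python. This is only an illustration, not a VMware tool: the host name `esx01.example.local` is hypothetical, and the expected address is borrowed from the Service Console IP in the question. It verifies that each host's FQDN resolves to the address you expect.

```python
import socket


def check_host_dns(fqdn, expected_ip, resolve=socket.gethostbyname):
    """Return (ok, detail) for a forward lookup of fqdn against expected_ip.

    `resolve` is injectable so the check can be exercised without a live DNS
    server; by default it uses the system resolver.
    """
    try:
        resolved = resolve(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return False, f"{fqdn}: lookup failed ({exc})"
    if resolved != expected_ip:
        return False, f"{fqdn}: resolves to {resolved}, expected {expected_ip}"
    return True, f"{fqdn}: OK ({resolved})"


if __name__ == "__main__":
    # Hypothetical inventory: replace with the names the hosts were added under.
    hosts = {"esx01.example.local": "10.1.1.1"}
    for name, ip in hosts.items():
        print(check_host_dns(name, ip)[1])
```

Running it from the vCenter server and from each host's management network would show whether every side resolves the same names to the same addresses.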
Why did your switches drop the links? If this is the main issue, don't you think you should fix it first?
During this period, if you connect a PC to the switch or to a port on a specific VLAN, can you ping the default gateway for each VLAN?
Btw, both scenarios are fine depending on the network design, and you won't need STP on the ports connected to the ESX hosts, so enable PortFast.
Shouldn't STP be disabled for all ports going to ESX servers? 
Also what was the reason for the links being dropped?
If you use Scenario 2 ... you would need to configure das.isolationaddress, etc.
This is explained here: http://www.yellow-bricks.com/vmware-high-availability-deepdiv/
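As a rough sketch of what that configuration could look like for Scenario 2, the snippet below builds the key/value pairs that would be entered as HA advanced options on the cluster. The gateway addresses `10.1.1.254` and `10.1.2.254` are assumptions (the thread does not give them), and the helper function is purely illustrative. The option names `das.isolationaddressN` and `das.usedefaultisolationaddress` are the classic HA settings as I understand them; check them against the article linked above for your version.

```python
import ipaddress


def ha_isolation_options(isolation_addresses, use_default_gateway=True):
    """Build HA advanced-option pairs, one isolation address per SC network."""
    options = {}
    for index, address in enumerate(isolation_addresses, start=1):
        ipaddress.ip_address(address)  # raises ValueError if malformed
        options[f"das.isolationaddress{index}"] = address
    # Controls whether the Service Console default gateway is also pinged.
    options["das.usedefaultisolationaddress"] = (
        "true" if use_default_gateway else "false"
    )
    return options


if __name__ == "__main__":
    # Assumed gateways for VLAN 1 and VLAN 2; substitute pingable addresses.
    for key, value in ha_isolation_options(["10.1.1.254", "10.1.2.254"]).items():
        print(f"{key} = {value}")
```

The idea is that a host only declares itself isolated when none of the listed addresses respond, so each Service Console network should have an address that stays reachable when the other network fails.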


ASKER CERTIFIED SOLUTION
larstr

John Myers

ASKER

Looks like DNS is the issue. Thank you very much for the quick reply and, most importantly, the resolution.
Great work