• Status: Solved

ESXi VM moved to a different clustered host in powered-off state, why?

This is the same 2-node ESXi 5.5 host setup for HA from my previously posted questions. 2 volumes were made available from iSCSI SAN storage. Both ESXi hosts can access both volumes simultaneously. 2 VMs are stored in volume 1, while 3 VMs are stored in volume 2.

Using vCenter, a cluster was formed to manage the above 2 hosts and 5 VMs. I also configured the 2 VMs stored in volume 1 to be hosted by ESXi host 1, whereas the other 3 VMs are hosted by ESXi host 2.

Now, in order to test the cluster, I shut down ESXi host 1. I can see that the 2 VMs from volume 1 are now hosted by ESXi host 2, but both VMs are in the powered-off state. Can I configure it so that these VMs are in the powered-on state after migrating?

thanks in advance.
Asked:
MichaelBalack
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
VMware HA - restarts VMs on a Host Failure.

otherwise, you will need to migrate the VMs off the host before shutting down!

Shutting down a Host is not a host failure, because it's controlled.

Just pull out the power cable to simulate a Host Failure.
 
MichaelBalackAuthor Commented:
Okay, will try it tomorrow while onsite
 
Steve MCommented:
In case you don't like pulling the power cord on your server, you can also pull the Network cables, or disable the switch ports they are connected to initiate an HA failover (as long as host monitoring is enabled in HA).
 
MichaelBalackAuthor Commented:
Hi isk-ck,

I tried pulling out the power cable, and all VMs failed over with a system reboot. Is this behaviour normal?

There are 3 NICs in a NIC team configured on the same vSwitch for all VMs. I pulled out all 3 cables, and there wasn't a failover. Why? The VMs weren't powered off.
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
CORRECT - VMware HA, RESTARTS the VMs on other HOSTS!

Completely Normal for VMware HA

They are not rebooted, they have failed......because the host has failed, so they are restarted on new hosts!
 
MichaelBalackAuthor Commented:
Hi Hanccocka,

That means pulling out the cables doesn't trigger a failover?
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
We normally test by pulling out the power.

VMs should then restart on other hosts.
 
MichaelBalackAuthor Commented:
Hi Hanccocka,

How about if all the network ports/cables have  to take as well?
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Not quite sure I understand "How about if all the network ports/cables have  to take as well? "
 
MichaelBalackAuthor Commented:
Hi Hanccocka,

Does that mean that if all the NICs for the VMs are detected as offline, HA can take care of it and trigger a failover?
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
see here for testing

VMware KB: Simulating VMware High Availability failover

There are several tests:

Host Failure and Network Isolation. What you simulated was a Network Isolation type of failure (e.g. a network fault), and if HA is working, the VMs should have been restarted on another server.
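
To make the distinction between the test types concrete, here is a minimal Python sketch of the decision as a mental model. It is not VMware's implementation: the `HostState` fields, the `ha_outcome` function, and the outcome labels are hypothetical names, and the model assumes a host only declares itself isolated when it has lost HA heartbeats and also cannot ping its isolation address. The isolation response is passed in explicitly because it is a cluster setting.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    powered_on: bool                   # False once the power cable is pulled
    controlled_shutdown: bool          # True if an admin shut the host down cleanly
    management_heartbeat: bool         # True if HA heartbeats still reach the other host
    isolation_address_reachable: bool  # True if the host can still ping its isolation address


def ha_outcome(host: HostState, isolation_response: str) -> str:
    """Return a simplified label for what HA is expected to do with the host's VMs."""
    if host.controlled_shutdown:
        # A controlled shutdown is not treated as a host failure.
        return "no_restart"
    if not host.powered_on:
        # Host failure: the VMs failed with the host and are restarted elsewhere.
        return "restart_on_other_host"
    if host.management_heartbeat or host.isolation_address_reachable:
        # The host is not considered isolated, so nothing is failed over.
        return "no_failover"
    if isolation_response == "leave_powered_on":
        # Isolated, but the cluster is set to leave the VMs running in place.
        return "isolated_vms_keep_running"
    # Isolated and the response powers the VMs off, so they restart elsewhere.
    return "restart_on_other_host"
```

Under this model, pulling only the VM network cables while a heartbeat path survives gives `no_failover`, which is one way the cable test could behave differently from the power test.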
 
Steve MCommented:
HA should absolutely take care of a network failure (wouldn't be very highly available if it didn't) - if you have redundant nics it should fail over internally on the nics, but if all the nics went offline then it should bring the guests on that host up on another host.

Do you run your vcenter server as a guest on one of the hosts or is it a separate physical server?
 
MichaelBalackAuthor Commented:
Hi isk-ck,

vCenter runs as a VM on one of the hosts. I have tested that it was able to fail over to another host on power failure.
 
Steve MCommented:
If I understand correctly, when vcenter is a vm on a host, if it is on the same host that you pulled the nic cables from, then it would likely still be able to communicate with the one host and vm's on that same vSwitch, so likely a failover would not be initiated. I've never actually tested that, but it would make sense.

Is your vcenter guest on the same host that you unplugged the nics?
 
MichaelBalackAuthor Commented:
Hi isk-ck,

I actually pulled the cables on the host where vCenter wasn't located.
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
VMware HA is conducted by the HA Agents on the Host Servers. vCenter Server is only used to configure VMware HA.

e.g. if the host fails, and vCenter Server is a VM, HA Agents control restarting....
 
Steve MCommented:
Ah thanks Hanccocka, I hoped it would be that way.

MichaelBalack; Using the vSphere WebClient, have you looked at the vSphere HA runtime information page?
(located by selecting your cluster, then monitor tab, then vSphere HA tab)

This page should show you if everything is configured - how many hosts are connected to the master, who is the master host, and what datastores are used for heartbeat, etc.

Maybe that will show something.
 
MichaelBalackAuthor Commented:
Hi both,

Thanks for showing all the guidelines.

Hanccocka made a very good point: vCenter is only used for configuring HA; it isn't needed to make HA work.

I suspect the problem lies in the vSwitches. vSwitch0 is configured for VMs and Management; vSwitch1 is configured for IP Storage (iSCSI), Storage Heartbeat, and vMotion. I think I should configure vMotion on vSwitch0 instead of the current vSwitch1.

Please see the corrective work I plan to do:

        1. Configure the correct IP in Software > DNS and Routing, as both hosts were added by
            IP address. Put the correct FQDNs in the internal DNS server
        2. On vSwitch0, no default gateway is configured. I will configure it to point to the switch
        3. Move the vMotion VMkernel port group to vSwitch0
        4. Review the cluster settings for Host Monitoring and VM Monitoring
   
As Isk-ck pointed out, there is no reason the HA failover shouldn't have occurred for the following tests:

        a. pull the power cable of the host
        b. disconnect the network cables
        c. disable all the NICs

I checked through the cluster summary; no abnormality was found.
 
MichaelBalackAuthor Commented:
Maybe I should share the current networking setup. There are 2 network segments. One is for VMs (production) and Management (host), using 172.16.100.0/24. The second one is for Storage, using 172.16.0.0/24. 2 vSwitches are created, each targeted at one segment.

    vSwitch0:     VMkernel port group for Management, no default gateway defined
                  VM port group for production
                  * 3 NICs bound

    vSwitch1:     VMkernel port group for storage (iSCSI 1)
                  VMkernel port group for storage (iSCSI 2)
                  VMkernel port group (Storage Heartbeat)
                  VMkernel port group for vMotion and Management, has default gateway
                  * 2 NICs bound
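
One possible reading of this layout, not confirmed anywhere in the thread, is that unplugging the 3 NICs on vSwitch0 still leaves a management-enabled VMkernel port on vSwitch1, so the host may never look isolated. The Python sketch below just encodes the layout as data and lists the management paths that survive. The `management` flags are an assumption inferred from the port group descriptions, and `surviving_management_paths` is a hypothetical helper.

```python
# Layout transcribed from the post above; the "management" flags are an
# assumption based on the port group descriptions.
LAYOUT = {
    "vSwitch0": [
        {"port_group": "Management", "management": True},
    ],
    "vSwitch1": [
        {"port_group": "iSCSI 1", "management": False},
        {"port_group": "iSCSI 2", "management": False},
        {"port_group": "Storage Heartbeat", "management": False},
        {"port_group": "vMotion and Management", "management": True},
    ],
}


def surviving_management_paths(layout, unplugged):
    """List (vSwitch, port group) pairs that still carry management traffic."""
    return [
        (vswitch, vmk["port_group"])
        for vswitch, vmks in layout.items()
        if vswitch not in unplugged
        for vmk in vmks
        if vmk["management"]
    ]


# Simulate pulling all cables on vSwitch0 only.
print(surviving_management_paths(LAYOUT, {"vSwitch0"}))
# -> [('vSwitch1', 'vMotion and Management')]
```

If that list is non-empty after the cables are pulled, the test is probably not simulating a full network isolation.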
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Can you upload screenshots of networking?

Are your default gateways, or the isolation address, reachable?

Networking, DNS and default gateways all have to be correct, e.g. DNS resolution and reverse DNS.
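
A quick way to check forward and reverse DNS for each host from a management workstation is sketched below in Python. It is only an illustration: `check_dns` is a hypothetical helper, and the host name and IP in the usage note are placeholders, not values from this setup. The resolvers are passed in as parameters so the check can be exercised without a live DNS server.

```python
import socket


def _system_reverse(ip):
    # gethostbyaddr returns (hostname, aliases, addresses); keep the hostname.
    return socket.gethostbyaddr(ip)[0]


def check_dns(fqdn, expected_ip, forward=socket.gethostbyname, reverse=_system_reverse):
    """Check that fqdn resolves to expected_ip and that expected_ip resolves back to fqdn."""
    try:
        forward_ok = forward(fqdn) == expected_ip
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        forward_ok = False
    try:
        name = reverse(expected_ip)
        reverse_ok = name.lower().rstrip(".") == fqdn.lower().rstrip(".")
    except OSError:
        reverse_ok = False
    return {"forward": forward_ok, "reverse": reverse_ok}
```

For example, `check_dns("esxi1.example.local", "172.16.100.11")` (placeholder values) returns a dict with both keys `True` only when the A record and the PTR record agree.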
 
MichaelBalackAuthor Commented:
Hi hanccocka,

Please see the networking screenshots as attached.

On vSwitch0, the default gateway/isolation address is not reachable, and hence not defined.
On vSwitch1, the default gateway is defined and pingable.

DNS resolution and reverse DNS for the 2 hosts? Not defined.

Would these be the root cause?
Networkings.docx
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
The default gateway is that which is defined on your management interface

Eg ip address and hostname of host.
Can this be pinged from all hosts?
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
The isolation address does not need to be the default gateway, but it usually is. It can be any management interface which is reachable 24/7, but then it must be specified. Also, not having working DNS will not help.
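
When a non-default isolation address is specified, it goes into the cluster's vSphere HA advanced options as `das.isolationaddress0` (up to `das.isolationaddress9`), usually together with `das.usedefaultisolationaddress`. The Python sketch below only builds that key/value mapping so it can be reviewed before being typed into the cluster settings. `isolation_options` is a hypothetical helper, and the address in the usage note is a placeholder on the management segment, not a value taken from this setup.

```python
import ipaddress


def isolation_options(addresses, use_default_gateway=False):
    """Build the HA advanced-option key/value pairs for custom isolation addresses."""
    if not 1 <= len(addresses) <= 10:
        raise ValueError("expected between 1 and 10 isolation addresses")
    options = {
        "das.usedefaultisolationaddress": "true" if use_default_gateway else "false",
    }
    for index, address in enumerate(addresses):
        # Raises ValueError if the string is not a valid IP address.
        ipaddress.ip_address(address)
        options[f"das.isolationaddress{index}"] = address
    return options
```

For example, `isolation_options(["172.16.100.1"])` yields `das.usedefaultisolationaddress = false` and `das.isolationaddress0 = 172.16.100.1`. Whichever address is used should answer pings around the clock.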
 
MichaelBalackAuthor Commented:
Hi hanccocka,

Okay, I will apply those settings when I am onsite tomorrow. I will update you guys on the progress...
 
MichaelBalackAuthor Commented:
Hi hanccocka,

I made 3 main changes: first, created a new VMkernel port for vMotion on vSwitch0; secondly, configured the DNS hosts and related IPs; and thirdly, changed the default gateway.

Now, the test of unplugging all the network cables triggers a host isolation, and subsequently a failover occurs.
 
Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
Very good, it's often network configuration which causes HA to fail, and not failover!

Glad it's fixed.
 
MichaelBalackAuthor Commented:
Thanks a lot to both experts, who provided detailed info/leads that eventually got the problem resolved.
Question has a verified solution.
