Solved

Hyper-V Failover Cluster doesn't fail over when NIC is unplugged

Posted on 2014-02-25
Last Modified: 2014-11-12
Hi,
I have a Hyper-V failover cluster running with the following config:

SERVERS
NAS - Windows Server 2012 with a bunch of drives acting as a NAS
VMHOST1 - Windows Server 2012 with a few VMs
VMHOST2 - Windows Server 2012 with a few VMs

NETWORK
10.0.0.X Corp network
192.168.1.X storage network
192.168.150.X heartbeat network

I'm using iSCSI targets on the NAS for VM storage. Failover cluster validation passes, and the VMs are all up and running fine. If I simulate a failover by doing a live migration, it works perfectly. If I simulate a failover by stopping the cluster service on one of the VMHOSTs, it works perfectly. Today another tech was in the server room and dislodged the corp network cable to VMHOST2. I would have expected failover to kick in and the VMs to move to VMHOST1. This did not happen. Instead my phone blew up with alerts that all the VMs on VMHOST2 were down.

I tried to access VMHOST2 over the corp network and saw it was down, but I was able to RDP in over the storage network. There I saw that the corp NIC showed as unplugged. I had the tech reseat the cable and all connectivity was restored.

My question is: why didn't it fail over, treating the NIC as failed, when the cable was unplugged? How can I test this further, and what am I missing in my setup that it passes validation and simulated tests but not a real-world failure?

Thanks in advance.
Question by:compcreate

Expert Comment

by:Adam Brown
ID: 39886970
Are you using a file share witness that both servers can access as the third vote in the cluster? If one of the servers can't access the file share witness, its vote will always count as a shutdown for the cluster.

Expert Comment

by:Mahesh
ID: 39886971
In the properties of all storage, VM, name, and IP resources, you must define the possible owners so that if connectivity is lost at any time, the resources can migrate to another host.

When you do a live migration or stop the cluster service, you are manually forcing the VMs to switch over, which is why that works.

Mahesh

Author Comment

by:compcreate
ID: 39887590
Not sure how to reply to your questions. I have a quorum disk set up. I set things up based on an MS doc on how to build a failover cluster with two hosts and one shared iSCSI storage.

The directions included creating a quorum disk that would be the deciding factor since there are only 2 hosts.

I think I may have found the problem. According to a doc I found online, in versions PRIOR to 2012 R2 (I am only running 2012, not R2) the VM NIC works independently of the physical NIC, even though you tell it to piggyback on the physical one, so you had to use NIC teaming to mitigate this issue. 2012 R2 added a NEW feature in the virtual NIC config called "Protected Network", which essentially binds the state of the virtual NIC to the state of the physical one, allowing a failover to be initiated when the physical NIC goes down.

Can anyone confirm this thought process?

Accepted Solution

by:Adam Brown
ID: 39887707
Well, if you have 2 hosts, you have to configure a witness, such as what is called a File Share Witness. In any situation where you have an even number of hosts in a cluster, you have to have something that is able to provide an extra vote. All systems in the cluster communicate with one another periodically (this is the Heartbeat). The voting members of the cluster together form the quorum, and each one has a "vote" on whether the cluster remains operating. In order for the cluster to remain operating, more than half of the votes must be "yes" (any system that has a vote assigned and is able to send a heartbeat is considered as voting yes). This means that a cluster needs an odd number of votes. With two servers, you have to be able to add a third vote. That's what the file share witness is for.
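
To make the vote arithmetic concrete, here is a minimal Python sketch of the majority rule described above. It is only an illustration of the counting, not how the cluster service is actually implemented; the function name `has_quorum` and its parameters are made up for the example.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Majority rule: strictly more than half of all configured votes must be online."""
    return 2 * votes_online > total_votes


# Two nodes, no witness: losing one node leaves 1 of 2 votes, which is not a majority.
print(has_quorum(total_votes=2, votes_online=1))  # False

# Two nodes plus a witness: losing one node leaves 2 of 3 votes, which is a majority.
print(has_quorum(total_votes=3, votes_online=2))  # True
```

With an even number of votes, a clean split leaves neither half with a majority, which is why the extra witness vote matters.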

The file share witness is basically just a network shared folder on a computer that is not one of the members of the cluster. If you have the file share witness pointing to a file share that is on one of the hosts in the cluster, as soon as that system goes down it counts as two no votes. So in a situation where the cluster has two systems and a file share witness, if the witness share is pointing to one of the nodes, the cluster will fail if that node shuts down entirely. I suspect that you may have configured your file share witness to point to a share on one of the cluster nodes, rather than on a computer somewhere else. I say that because you can shut down the cluster service and a lot of other stuff on a cluster node that has the FSW on it, and the cluster will still be operable because the FSW is still accessible from the other node. But as soon as that server loses network connectivity or powers off, the cluster fails because both it and the FSW are no longer accessible and their votes switch to no.
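
The effect of where the witness lives can be sketched in a few lines of Python. This is a simplified model under stated assumptions: each vote is tied to the machine that hosts it, a vote is lost when that machine goes down, and any vote adjustment the cluster might do at runtime is ignored. The helper names are hypothetical; the machine names are taken from the question.

```python
from typing import Dict


def surviving_votes(vote_hosts: Dict[str, str], failed_machine: str) -> int:
    """Count the votes whose hosting machine is still up."""
    return sum(1 for host in vote_hosts.values() if host != failed_machine)


def cluster_survives(vote_hosts: Dict[str, str], failed_machine: str) -> bool:
    """Majority rule: strictly more than half of all votes must remain."""
    return 2 * surviving_votes(vote_hosts, failed_machine) > len(vote_hosts)


# Each vote is mapped to the machine it lives on (illustrative layouts only).
witness_on_node = {"VMHOST1": "VMHOST1", "VMHOST2": "VMHOST2", "witness": "VMHOST2"}
witness_on_nas = {"VMHOST1": "VMHOST1", "VMHOST2": "VMHOST2", "witness": "NAS"}

print(cluster_survives(witness_on_node, "VMHOST2"))  # False: only 1 of 3 votes left
print(cluster_survives(witness_on_node, "VMHOST1"))  # True: 2 of 3 votes left
print(cluster_survives(witness_on_nas, "VMHOST2"))   # True: 2 of 3 votes left
```

In this model, a witness on a third machine lets the cluster survive the loss of either node, while a witness hosted on a node makes that node a single point of failure.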

Author Comment

by:compcreate
ID: 39887834
Very interesting... but I would have to say I should be good then. I have the two hosts in the cluster, and I set up another iSCSI target called quorum (following the docs) that resides on the NAS (a third system that is not part of the cluster). When I look in Failover Cluster Manager under Storage, I see my 1GB quorum disk, and next to it, under "assigned to", it states disk witness in quorum.

So that sounds exactly like what you are describing, and I feel better about that part.

Back to the issue of the unplugged NIC not causing a failover. Can anyone confirm that this is by design (or lack thereof) in 2012 and earlier, and that the only solutions are NIC teaming or upgrading to R2 and using "Protected Network"?

Thanks