Solved

Hyper-V Cluster - Failover Corrupts VM Config File

Posted on 2013-05-16
Last Modified: 2014-11-12
I have a 2-node Hyper-V cluster with a quorum disk, on which I have been testing various failover scenarios. Everything works perfectly except when I perform the following failure simulation:

1. Shut down the host that is not hosting the VM (HOST02).
2. Wait approximately 2 minutes for that host to be "really down".
3. Shut down the host that is hosting the VM (HOST01).
4. Wait for the first downed host (HOST02) to come back up.
5. Connect to the cluster and find that the first downed host is not assuming any roles. I do have preferences set for each role to prefer HOST01 when it is available, but I didn't think that would keep HOST02 from hosting a role if HOST01 was down.
6. Notice that the VM is in a failed state, with event IDs 1069 and 21502 filling the log. What I have deduced is that the cluster cannot locate the VM's .xml config file at its expected location on the CSV that is shared between the hosts.

The .xml file does, in fact, still exist in the same location it was in before both hosts went down, yet the VM will not recover on its own. Is the XML file permanently corrupt? Is this the expected behavior with Hyper-V clusters? If so, what a horrible design! I end up having to restore the XML file from a backup before the VM will start.
Question by:marrj
2 Comments
 

Accepted Solution

by: marrj
ID: 39175825
After more testing, it appears that what is really going on is that the CSV the VM resides on does not reconnect to the last host remaining in the cluster after that host fails. This holds for both hosts, no matter what order I purposely fail them in. The fix is to manually take the CSV offline in the Failover Clustering MMC snap-in and then manually bring it back online. The VM then resumes successfully. So it looks like my cluster is going to require manual intervention if both nodes go down or restart.

The reason a Commvault restore of the VM's config file brought the VM back online is that the restore process automatically brings the volume online. I didn't know that would happen.

So, is there any way to automate this so that my nodes don't require manual intervention after an event such as a datacenter power outage? I do have DR plans to survive such an outage at another site, but I will still ultimately have to turn the cluster back on when operations resume.
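One possible way to script the offline/online cycle described above is sketched below. It is only a sketch under assumptions, not a tested fix: it assumes the FailoverClusters PowerShell module is installed on the node, that its Stop-ClusterResource and Start-ClusterResource cmdlets can cycle the CSV's disk resource the same way the MMC snap-in does (worth verifying in a lab first), and that the resource is named "Cluster Disk 1", which is a made-up name. The helper names are hypothetical as well. By default it only prints the commands it would run.

```python
import subprocess
from typing import List


def ps_quote(value: str) -> str:
    """Quote a string as a PowerShell single-quoted literal."""
    return "'" + value.replace("'", "''") + "'"


def build_cycle_commands(resource_name: str) -> List[str]:
    """Return the PowerShell commands that take a cluster resource
    offline and then bring it back online."""
    name = ps_quote(resource_name)
    return [
        f"Stop-ClusterResource -Name {name}",
        f"Start-ClusterResource -Name {name}",
    ]


def cycle_csv(resource_name: str, dry_run: bool = True) -> List[str]:
    """Build the offline/online commands and, unless dry_run is set,
    run them through powershell.exe on the local cluster node."""
    commands = build_cycle_commands(resource_name)
    if not dry_run:
        script = "Import-Module FailoverClusters; " + "; ".join(commands)
        # check=True raises CalledProcessError if PowerShell exits non-zero.
        subprocess.run(
            ["powershell.exe", "-NoProfile", "-Command", script],
            check=True,
        )
    return commands


if __name__ == "__main__":
    # "Cluster Disk 1" is a placeholder; substitute the real CSV resource name.
    for line in cycle_csv("Cluster Disk 1", dry_run=True):
        print(line)
```

Calling cycle_csv with dry_run=False from a scheduled task that runs at node startup might remove the manual step, but whether that is safe depends on why the CSV fails to come online in the first place.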

Author Closing Comment

by:marrj
ID: 39192044
No one else answered in a timely manner.