Solved

Hyper-V Cluster - Failover Corrupts VM Config File

Posted on 2013-05-16
Last Modified: 2014-11-12
I have a 2-node Hyper-V cluster with a quorum disk, on which I have been testing various failover scenarios.  Everything works perfectly, except when I perform the following failure simulation.

1. Shut down the host that is not hosting the VM (HOST02).
2. Wait approximately 2 minutes for that host to be "really down".
3. Shut down the host that is hosting the VM (HOST01).
4. Wait for the first downed host (HOST02) to come back up.
5. Connect to the cluster and find that the first downed host is not assuming any roles.  I do have preferences set for each role to prefer HOST01 when it is available, but I didn't think that would keep HOST02 from hosting the role if HOST01 was down.
6. Notice that the hosted VM is in a failed state, with event IDs 1069 and 21502 filling the log.  What I have deduced is that the cluster cannot locate the VM's .xml config file at its expected location on the CSV that is shared between the hosts.

The .xml file does, in fact, exist in the same location it was in before both hosts went down.  The VM is not recoverable.  Is the XML file permanently corrupt?  Is this the expected behavior with Hyper-V clusters?  If so, what a horrible design!  I end up having to restore the XML file from a backup before the VM will start.
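
One way to tell a corrupt config file apart from one that simply cannot be reached is to try parsing it. The following is a minimal Python sketch, not a Hyper-V tool: the function name `check_config` and its three return values are made up for illustration, and a file that parses as well-formed XML is not guaranteed to be a valid VM configuration.

```python
import os
import xml.etree.ElementTree as ET


def check_config(path):
    """Classify a config file as 'missing', 'malformed', or 'ok'.

    'missing' also covers a path that cannot be reached, for example
    because the volume holding it is not online on this host.
    """
    if not os.path.isfile(path):
        return "missing"
    try:
        ET.parse(path)  # raises ParseError if the XML is not well-formed
    except ET.ParseError:
        return "malformed"
    return "ok"
```

Run on the host that should own the VM, against the config file's path on the CSV, a result of "missing" would point at the volume rather than at the file's contents.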
Question by:marrj
2 Comments
 
LVL 1

Accepted Solution

by:marrj
ID: 39175825
After more testing, it appears that what is really going on is that the CSV the VM resides on does not reconnect to the last remaining host in the cluster after that host fails.  This is true for both hosts, no matter what order I deliberately fail them in.  The fix is to manually take the CSV offline in the Failover Clustering MMC snap-in and then bring it back online.  The VM then resumes successfully.  So, it looks like my cluster is going to require manual intervention if both nodes go down or restart.

The reason a Commvault restore of the VM's config file brought the VM back online is that the restore process automatically brings the volume online. I didn't know that would happen.

So, is there any way to automate this so that my nodes don't require manual intervention after events such as a datacenter power outage?  I do have DR plans to survive such an outage at another site, but I will still ultimately have to turn the cluster back on when operations resume.
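
For what it's worth, the manual offline/online cycle could probably be scripted. Below is a rough Python sketch that shells out to PowerShell. `Stop-ClusterResource` and `Start-ClusterResource` are real cmdlets in the FailoverClusters module, but whether they accept a CSV's resource name on a given Windows Server version is an assumption to verify first. The helper names (`ps_quote`, `cycle_commands`, `cycle_csv`) and the resource name used below are hypothetical.

```python
import subprocess


def ps_quote(value):
    """Quote a string as a PowerShell single-quoted literal."""
    return "'" + value.replace("'", "''") + "'"


def cycle_commands(csv_name):
    """Build the offline/online command pair for one cluster resource."""
    name = ps_quote(csv_name)
    prefix = "Import-Module FailoverClusters; "
    return [
        prefix + "Stop-ClusterResource -Name " + name,   # take offline
        prefix + "Start-ClusterResource -Name " + name,  # bring back online
    ]


def run_powershell(command):
    """Run one command through powershell.exe; raise if it exits non-zero."""
    subprocess.run(
        ["powershell.exe", "-NoProfile", "-NonInteractive", "-Command", command],
        check=True,
    )


def cycle_csv(csv_name, runner=run_powershell):
    """Take the resource offline, then bring it back online, in that order."""
    for command in cycle_commands(csv_name):
        runner(command)
```

Calling `cycle_csv("Cluster Disk 1")` on a cluster node would run the two commands in order. Running it from a startup task once the Cluster service is up might remove the manual step, but that is untested here. The `runner` parameter exists only so the command sequence can be checked without touching a real cluster.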
 
LVL 1

Author Closing Comment

by:marrj
ID: 39192044
No one else answered in a timely manner.