Solved

Hyper-V Cluster - Failover Corrupts VM Config File

Posted on 2013-05-16
Last Modified: 2014-11-12
I have a 2-node Hyper-V cluster with a quorum disk, on which I have been testing various failover scenarios. Everything works perfectly, except when I perform the following failure simulation.

1. shut down host that is not hosting VM (HOST02)
2. wait approximately 2 minutes for host to be "really down"
3. shut down host that is hosting VM (HOST01)
4. wait for first downed host to come back up
5. connect to cluster to find that the first downed host is not assuming any of the roles.  I do have preferences set for each role to prefer HOST01 when it is available, but I didn't think that would keep HOST02 from hosting the role if HOST01 was down.
6. notice that VM being hosted is in a failed state with event IDs 1069 and 21502 filling the log.  What I have deduced is that it cannot locate the .xml config file for the VM at the location it is supposed to be on the CSV that is shared between the hosts.

The .xml file does, in fact, exist in the same location it was in before both hosts went down, yet the VM will not start.  Is the XML file permanently corrupt?  Is this the expected behavior with Hyper-V clusters?  If so, what a horrible design!  I end up having to restore the XML file from a backup before the VM will start.
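One way to tell "the file is corrupt" apart from "the file cannot be reached" is to check it directly from a host. The following is only a rough sketch, not a Hyper-V tool: the function name `check_vm_config` and the idea of classifying the result into three strings are my own assumptions, and a file that parses as well-formed XML is not guaranteed to be a configuration Hyper-V will accept.

```python
import os
import xml.etree.ElementTree as ET


def check_vm_config(path):
    """Classify a VM config file as unreachable, malformed, or well-formed XML.

    Hypothetical helper for illustration; 'path' would be the .xml file on the CSV.
    """
    # If the CSV is not mounted on this host, the path will not resolve at all.
    if not os.path.isfile(path):
        return "missing-or-unreachable"
    try:
        # Parsing only proves the XML is well-formed, not that Hyper-V accepts it.
        ET.parse(path)
    except ET.ParseError:
        return "malformed"
    except OSError:
        # The file was listed but could not be read (for example, storage went away).
        return "missing-or-unreachable"
    return "well-formed"
```

If this reports "missing-or-unreachable" while the file is visible from another machine or in a backup, the storage path is a more likely suspect than the file contents.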
Question by:marrj
LVL 1

Accepted Solution

by:
marrj earned 0 total points
ID: 39175825
After doing more testing, it appears that what is really going on is that the CSV the VM resides on does not reconnect to the last remaining host in the cluster after that host fails.  This is true for both hosts, no matter what order I purposefully fail them in.  The fix is to manually take the CSV offline in the Failover Clustering MMC snap-in and then bring it back online.  The VM will then successfully resume.  So, it looks like my cluster is going to require manual intervention if both nodes go down or restart.
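For anyone who wants to script the offline/online step rather than click through the snap-in, here is a minimal sketch. It assumes the FailoverClusters PowerShell module is installed on the node, that the CSV can be cycled with `Stop-ClusterResource` and `Start-ClusterResource` by its resource name (I have not confirmed this works for every CSV configuration), and that the resource is called something like "Cluster Disk 2", which is a made-up name. The Python function names are hypothetical too.

```python
import subprocess


def csv_cycle_commands(resource_name):
    """Build PowerShell commands that take a cluster resource offline, then online."""
    # Single quotes inside a PowerShell single-quoted string are escaped by doubling.
    quoted = "'" + resource_name.replace("'", "''") + "'"
    return [
        "Import-Module FailoverClusters",
        f"Stop-ClusterResource -Name {quoted}",
        f"Start-ClusterResource -Name {quoted}",
    ]


def cycle_csv(resource_name, run=subprocess.run):
    """Run the commands in one PowerShell session on the local cluster node.

    'run' is injectable so the function can be exercised without a real cluster.
    """
    script = "; ".join(csv_cycle_commands(resource_name))
    # check=True raises CalledProcessError if PowerShell exits with a non-zero code.
    return run(["powershell.exe", "-NoProfile", "-Command", script], check=True)
```

Calling `cycle_csv("Cluster Disk 2")` from an elevated session on a cluster node would attempt the same offline/online cycle described above. Try it on a test cluster first, since taking a CSV offline interrupts every VM stored on it.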

The reason that a Commvault backup restore of the VM's config file brought the VM back online is that it would automatically bring the volume online as part of the restore process. I didn't know that would happen.

So, is there any way to automate this so that my nodes don't require manual intervention after events such as a datacenter power outage?  I do have DR plans to survive such an outage at another site, but I will still ultimately have to turn the cluster back on when operations resume.
LVL 1

Author Closing Comment

by:marrj
ID: 39192044
No one else answered in a timely manner.
