Solved

Hyper-V Cluster - Failover Corrupts VM Config File

Posted on 2013-05-16
Last Modified: 2014-11-12
I have a 2-node Hyper-V cluster with a quorum disk, and I have been testing various failover scenarios on it. Everything works perfectly, except when I perform the following failure simulation.

1. Shut down the host that is not hosting the VM (HOST02).
2. Wait approximately 2 minutes for the host to be "really down".
3. Shut down the host that is hosting the VM (HOST01).
4. Wait for the first downed host (HOST02) to come back up.
5. Connect to the cluster and find that the first downed host is not assuming any roles. I do have preferences set for each role to prefer HOST01 when it is available, but I didn't think that would keep HOST02 from hosting the role if HOST01 was down.
6. Notice that the VM is in a failed state, with event IDs 1069 and 21502 filling the log. What I have deduced is that the cluster cannot locate the VM's .xml config file at its expected location on the CSV that is shared between the hosts.

The .xml file does, in fact, still exist in the same location it was in before both hosts went down, yet the VM is not recoverable. Is the XML file permanently corrupted? Is this the expected behavior with Hyper-V clusters? If so, what a horrible design! I end up having to restore the XML file from a backup before the VM will start.
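
To confirm that the failed state lines up with the two event IDs mentioned in step 6, the log can be exported and filtered. Below is a minimal Python sketch, assuming the events have already been exported to a list of records that each carry an `Id` field. The record layout, the `WATCHED_IDS` name, and the `count_failover_events` helper are assumptions made for illustration, not part of any Hyper-V tooling.

```python
from collections import Counter
from typing import Dict, Iterable

# Event IDs named in the question; everything else in the log is ignored.
WATCHED_IDS = {1069, 21502}


def count_failover_events(events: Iterable[dict]) -> Dict[int, int]:
    """Count how often each watched event ID appears in exported log records.

    Each record is assumed to be a dict with an "Id" key (hypothetical layout).
    Records without a usable "Id" are skipped.
    """
    counts: Counter = Counter()
    for event in events:
        try:
            event_id = int(event["Id"])
        except (KeyError, TypeError, ValueError):
            continue  # malformed record, not counted
        if event_id in WATCHED_IDS:
            counts[event_id] += 1
    return dict(counts)


# Made-up sample records, only to show the call shape.
sample = [{"Id": 1069}, {"Id": 21502}, {"Id": 1069}, {"Id": 7036}, {}]
summary = count_failover_events(sample)
```

A non-empty result for both IDs only shows that the role and the VM failed to come online; it does not by itself say whether the config file is damaged.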
Question by:marrj
Accepted Solution

by: marrj
ID: 39175825
After more testing, it appears that the real problem is that the CSV the VM resides on does not reconnect to the last host remaining in the cluster after that host fails. This holds for both hosts, no matter what order I deliberately fail them in. The fix is to manually take the CSV offline in the Failover Clustering MMC snap-in and then manually bring it back online. The VM will then resume successfully. So it looks like my cluster is going to require manual intervention if both nodes go down or restart.

The reason that a Commvault backup restore of the VM's config file brought the VM back online is that it would automatically bring the volume online as part of the restore process. I didn't know that would happen.

So, is there any way to automate this so that my nodes don't require manual intervention after events such as a datacenter power outage? I do have DR plans to survive such an outage at another site, but I will still ultimately have to turn the cluster back on when operations resume.
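
One possible shape for such automation is a startup task that repeats the manual fix: find every CSV that is not online, take it offline, and bring it back online. The Python sketch below shows only that control flow, not a tested solution for this cluster. The three callables (`list_states`, `take_offline`, `bring_online`) are hypothetical hooks that would have to wrap whatever cluster tooling is available on the node, and the assumption that an affected CSV reports a state other than "Online" has not been verified against the behavior described above.

```python
import time
from typing import Callable, Dict, List


def recover_csvs(
    list_states: Callable[[], Dict[str, str]],
    take_offline: Callable[[str], None],
    bring_online: Callable[[str], None],
    retries: int = 3,
    delay_s: float = 0.0,
) -> List[str]:
    """Cycle every CSV that is not "Online"; return names still not online.

    list_states  -> maps CSV name to a state string (hypothetical hook)
    take_offline -> takes one CSV offline by name (hypothetical hook)
    bring_online -> brings one CSV online by name (hypothetical hook)
    """
    for _ in range(retries):
        stuck = [name for name, state in list_states().items() if state != "Online"]
        if not stuck:
            return []
        for name in stuck:
            # Same order as the manual fix: offline first, then online.
            take_offline(name)
            bring_online(name)
        time.sleep(delay_s)  # give the cluster time to settle before re-checking
    return [name for name, state in list_states().items() if state != "Online"]


# In-memory stand-in for a cluster, used only to exercise the logic.
fake_states = {"CSV1": "Failed", "CSV2": "Online"}
calls = []


def fake_list() -> Dict[str, str]:
    return dict(fake_states)


def fake_offline(name: str) -> None:
    calls.append(("offline", name))
    fake_states[name] = "Offline"


def fake_online(name: str) -> None:
    calls.append(("online", name))
    fake_states[name] = "Online"


still_stuck = recover_csvs(fake_list, fake_offline, fake_online)
```

Keeping the cluster calls behind injected functions means the retry logic can be tested without a cluster, and a non-empty return value is the signal to alert a person instead of retrying forever.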
Author Closing Comment

by:marrj
ID: 39192044
No one else answered in a timely manner.