I have a 2-node Hyper-V cluster with a quorum disk, and I have been testing various failover scenarios on it. Everything works perfectly except when I perform the following failure simulation:
1. Shut down the host that is not hosting the VM (HOST02).
2. Wait approximately 2 minutes for that host to be "really down".
3. Shut down the host that is hosting the VM (HOST01).
4. Wait for the first downed host (HOST02) to come back up.
5. Connect to the cluster and find that the first downed host (HOST02) is not assuming any of the roles. I do have preferences set for each role to prefer HOST01 when it is available, but I didn't think that would keep HOST02 from hosting a role while HOST01 is down.
6. Notice that the hosted VM is in a failed state, with event IDs 1069 and 21502 filling the log. From what I can deduce, the cluster cannot locate the VM's .xml config file at the location where it is supposed to be on the CSV shared between the hosts.
The .xml file does, in fact, exist in the same location it was in before both hosts went down, yet the VM will not start until I restore the XML file from a backup. Is the XML file permanently corrupt? Is this the expected behavior with Hyper-V clusters? If so, what a horrible design!