Hyper-V clustering - VMs rebooting after node failure

winshuttle
I have a 2-node Hyper-V R2 cluster.
Today, I had all the VMs running on node1, so I could do maintenance on node2.  As I was doing some upgrades on node 2, the server BSOD'd and rebooted.
Right after this happened, all the clustered VMs that were running on node 1 rebooted themselves.
After checking the logs, it appears that they lost their storage momentarily, and I think that's what caused them to reboot.

Afterwards, I noticed that the quorum disk was shown as failed.  It was being hosted on Node2.  I had to manually reconnect the iSCSI LUN before I was able to bring it online.

My question is: why did the failure of Node 2 cause all clustered VMs on node 1 to reboot?
Here's one of the event log entries.  There was one for each VM:
The Virtual Machines configuration ED2F9165-5316-4856-8C50-F9E93B6912E0 at 'C:\ClusterStorage\Volume1\Server1' is no longer accessible: Invalid handle (0x80070006)

Then each machine has an entry like this:
'Virtual Machine Configuration Server1' successfully registered the configuration for the virtual machine.
'Virtual Machine Server1' successfully started the virtual machine.

I also noticed that the quorum disk for the cluster was/is hosted by Node2.

Commented:
My guess is that your node 2 is hosting all the storage.  You need to cluster your storage.  If node 2 goes down, it takes the storage with it.

I am not very experienced with Hyper-V, as I mostly work with VMware, but I have seen exactly that happen in VMware when the storage link dropped.

Author

Commented:
As far as I know, all my storage is clustered.  Both servers connect to my SAN, and they use the same named mount points or Cluster Shared Volume.

I've found the moment I stop cluster services on node 2, I'm unable to manage the cluster and all running VMs fail.
I believe there was a problem with the cluster itself.  I have destroyed and rebuilt the cluster.
I am now able to reboot each node separately, and the running VMs are not affected.
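
One way to reason about the earlier symptom (stopping cluster services on node 2 takes everything down) is the quorum vote arithmetic. Below is a minimal sketch in Python, assuming a Node and Disk Majority configuration with one vote per node plus one vote for the quorum disk. The function name and the vote model are illustrative assumptions, not something taken from the cluster logs, and this may not be what actually went wrong here.

```python
def has_quorum(nodes_up: int, witness_online: bool, total_nodes: int = 2) -> bool:
    """Return True if the surviving members hold a strict majority of votes.

    Assumed model: every node has one vote and the quorum (witness) disk
    has one vote, so a 2-node cluster has 3 votes in total.
    """
    total_votes = total_nodes + 1
    votes_held = nodes_up + (1 if witness_online else 0)
    return votes_held > total_votes // 2


# Node 2 fails, but the quorum disk comes online on node 1: 2 of 3 votes.
print(has_quorum(nodes_up=1, witness_online=True))

# Node 2 fails and the quorum disk stays offline: 1 of 3 votes.
print(has_quorum(nodes_up=1, witness_online=False))
```

Under that assumption, a surviving node that cannot bring the quorum disk online holds only one of three votes, which would be consistent with the cluster going down when node 2 did.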

Author

Commented:
This was the only solution.
