winshuttle asked:
Hyper-V clustering - VMs rebooting after node failure
I have a 2-node Hyper-V R2 cluster.
Today I had all the VMs running on node 1 so I could do maintenance on node 2. While I was doing some upgrades on node 2, that server BSOD'd and rebooted.
Right after this happened, all the clustered VMs running on node 1 rebooted themselves.
Checking the logs, it appears they momentarily lost their storage, and I think that's what caused them to reboot.
Afterwards, I noticed the quorum disk was shown as failed. It was being hosted on node 2. I had to manually reconnect the iSCSI LUN before I could bring it back online.
My question is why did a node failure of Node 2 cause all clustered VMs on node 1 to reboot?
Here's one of the event log entries. There was one for each VM:
The Virtual Machines configuration ED2F9165-5316-4856-8C50-F9E93B6912E0 at 'C:\ClusterStorage\Volume1\Server1' is no longer accessible: Invalid handle (0x80070006)
Then each machine has an entry like this:
'Virtual Machine Configuration Server1' successfully registered the configuration for the virtual machine.
'Virtual Machine Server1' successfully started the virtual machine.
I also noticed that the quorum disk for the cluster was/is hosted by Node2.
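The witness (quorum) disk being owned by the node that failed is worth checking before maintenance. A hedged sketch of inspecting quorum configuration and disk ownership with the FailoverClusters PowerShell module (Windows Server 2008 R2 and later); the resource and node names below are illustrative, not from this thread:

```powershell
# Assumes the FailoverClusters module is available (2008 R2+ / RSAT).
Import-Module FailoverClusters

# Show the quorum model and which disk (if any) acts as the witness.
Get-ClusterQuorum

# List physical disk resources with their current owner nodes;
# the witness disk appears here.
Get-ClusterResource | Where-Object { $_.ResourceType -like "Physical Disk" } |
    Format-Table Name, State, OwnerGroup, OwnerNode

# Move the core cluster group (which carries the witness disk) to node 1
# before taking node 2 down for maintenance. "Node1" is illustrative.
Move-ClusterGroup "Cluster Group" -Node Node1
```

Draining ownership of the witness disk off a node before rebooting it avoids the failed-quorum-disk state described above.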
ASKER
As far as I know, all my storage is clustered. Both servers connect to my SAN, and they use the same cluster shared volume mount points.
I've found that the moment I stop the Cluster service on node 2, I can no longer manage the cluster and all running VMs fail.
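That symptom is consistent with node 2 owning the Cluster Shared Volume: in 2008 R2, one node coordinates I/O for each CSV, and losing that node can briefly interrupt storage for VMs running elsewhere. A hedged sketch for checking and moving CSV ownership, again assuming the FailoverClusters module; the resource name is illustrative:

```powershell
Import-Module FailoverClusters

# Show each CSV and the node that currently owns (coordinates) it.
Get-ClusterSharedVolume | Format-Table Name, State, OwnerNode

# Move ownership of a CSV to node 1 ahead of node 2 maintenance.
# "Cluster Disk 2" is an illustrative resource name.
Move-ClusterSharedVolume "Cluster Disk 2" -Node Node1
```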
ASKER CERTIFIED SOLUTION
ASKER
This was the only solution.
I am not very experienced with Hyper-V, as I mostly work with VMware, but I have seen exactly this happen in VMware when the link to storage dropped.