Hyper-V R2 Live Migration fails after network interface configurations became 'scrambled'

Posted on 2010-09-16
Last Modified: 2013-11-06
I am running a two-server failover cluster on Windows Server 2008 R2 Datacenter edition to support HA VMs. The hosts are two Dell R710s connected to a Dell MD3000i iSCSI chassis. iSCSI is configured for multipath, and the iSCSI console shows the connectivity is good.

Each host server has 8 network interfaces: 3 are dedicated to Hyper-V, 1 is shared between Hyper-V and host management (as a backup), 1 is dedicated to host management, 2 are configured for iSCSI connectivity, and the last is for the failover cluster.

Here is a logical breakdown of the network adapters; they are the same on both servers (OB = Onboard, Broadcom / AI = Add-In card, Intel):

OBE1 - ISCSI01 - ISCSI Traffic only - IPV4 only - dedicated (unique) subnet 01
OBE2 - failover Cluster - cluster traffic only - IPV4 - MS Client and File & Print services  - Cluster communications - No client communications - dedicated (unique) subnet 02
OBE3 - Main network lan - dedicated interface - IPV4 - IPV6 - MS Client - F&P
OBE4 - HVNet01 - dedicated to Hyper-V virtual guest networking - External no OS management
AI01 - ISCSI02 - ISCSI traffic only - IPV4 only - dedicated (unique) subnet 03
AI02 - HVNet02 - dedicated to Hyper-V virtual guest networking - External no OS management
AI03 - HVNet03 - dedicated to Hyper-V virtual guest networking - External no OS management
AI04 - HVNet02 / backup host management - Hyper-V virtual guest networking - with Host management - IPV4 - MS Client - F&P

I was able to Live Migrate a week ago, and all was working. Sometime in the past week the network cards on each host became 'scrambled'. What I mean is that the configuration for OBE2 ended up on AI03, the configuration for AI03 ended up on AI02, the interface name of OBE1 (a Broadcom interface) was now linked to AI04 (an Intel add-in card), and so on. Each host server was scrambled differently.

I spent a few hours painstakingly correcting the network configurations (I had to pull the network cable from each port in turn so I could track them down), then went back and corrected the Hyper-V networking links in Hyper-V Manager. That is where things stand now.
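For anyone facing the same cleanup, the port-by-port check described above can be sketched as a simple comparison between the intended layout and what each host currently reports. This is only an illustration: the `EXPECTED` table is copied from the adapter list in this post, while `find_scrambled` and the sample `observed` data are hypothetical names made up for the example. How the observed mapping is collected from the host (for instance by matching MAC addresses) is left out and would need to be filled in.

```python
# Hypothetical sketch: compare the intended role of each physical port with
# the role currently observed on it, to spot 'scrambled' NIC configurations.

# Intended layout, taken from the adapter list in the question.
EXPECTED = {
    "OBE1": "ISCSI01",
    "OBE2": "Failover Cluster",
    "OBE3": "Main network LAN",
    "OBE4": "HVNet01",
    "AI01": "ISCSI02",
    "AI02": "HVNet02",
    "AI03": "HVNet03",
    "AI04": "HVNet02 / backup host management",
}


def find_scrambled(expected, observed):
    """Return {port: (expected_role, observed_role)} for every mismatch.

    A port missing from `observed` is reported with an observed role of None.
    """
    mismatches = {}
    for port, role in expected.items():
        actual = observed.get(port)
        if actual != role:
            mismatches[port] = (role, actual)
    return mismatches


if __name__ == "__main__":
    # Made-up observation mimicking part of the scramble described above:
    # the cluster configuration landed on AI03, and AI03's landed on AI02.
    observed = dict(EXPECTED)
    observed["AI03"] = "Failover Cluster"
    observed["AI02"] = "HVNet03"
    for port, (want, got) in sorted(find_scrambled(EXPECTED, observed).items()):
        print(f"{port}: expected {want!r}, found {got!r}")
```

Running the same check on both hosts would also show that each one was scrambled differently, since each produces its own mismatch list.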

When I attempt to live migrate a server it fails.  It appears the configuration files move but the CSV doesn’t.  I can migrate the CSV without issue, but the Hyper-V guest does not migrate.

The validation says the configuration is suitable for clustering on both servers.

 I am getting a barrage of event IDs 1196, 1579, 5120, and 5142.

I cannot seem to track this down; any help would be appreciated. And once this is fixed, how do we prevent it from happening again? It seems odd that the network interface cards could get scrambled in such a way.
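As a rough triage aid, the four event IDs above can be grouped by the area of the cluster they usually point to. The sketch below is an assumption-laden illustration: the short descriptions are paraphrased from memory of the Windows Server 2008 R2 failover clustering events and should be checked against the actual text in the event log, and `EVENT_AREAS` and `group_by_area` are hypothetical names invented for this example.

```python
# Hypothetical triage sketch: group failover clustering event IDs by the
# subsystem they usually relate to. Descriptions are paraphrased from memory
# and should be verified against the real event text on the host.

EVENT_AREAS = {
    1196: ("network name / DNS", "network name resource failed DNS registration"),
    1579: ("network name / DNS", "network name resource failed to update a DNS record"),
    5120: ("Cluster Shared Volume", "CSV no longer available on this node"),
    5142: ("Cluster Shared Volume", "CSV no longer accessible from this node"),
}


def group_by_area(event_ids):
    """Return {area: sorted list of event IDs}; unknown IDs go under 'unknown'."""
    grouped = {}
    for event_id in event_ids:
        area, _description = EVENT_AREAS.get(event_id, ("unknown", ""))
        grouped.setdefault(area, []).append(event_id)
    return {area: sorted(ids) for area, ids in grouped.items()}


if __name__ == "__main__":
    for area, ids in group_by_area([1196, 1579, 5120, 5142]).items():
        print(f"{area}: {ids}")
```

If that grouping holds, the errors fall into two buckets, name registration and CSV access, and both depend on the cluster networks being assigned to the right adapters.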


 The error received when trying to Migrate using VMM is:

Error (10698)

Virtual machine SP01 could not be live migrated to virtual machine host VHOST01 using this cluster configuration.

(Unspecified error (0x80004005))

Recommended Action

Check the cluster configuration and then try the operation again.

The error received when trying to migrate using Failover Cluster Manager is:

Migration Attempt failed

event 21502

'SCVMM CPC-SP01' live migration did not succeed at the destination.

Live migration did not succeed: The operation was canceled.

As I said, I can see it try to move the VM configuration, but it never takes the CSV offline or tries to move it.
Question by:Curtis McCallister

Accepted Solution

Curtis McCallister earned 0 total points
ID: 33702216
I have corrected the problem by going through the cluster networks, changing the cluster settings on each of the NICs, and then changing them back. My theory is that when the NICs became 'scrambled' (I am still looking for someone to say they have seen that before), something in the cluster configuration became corrupt. Don't ask me where, but it is the only thing that makes sense. Changing the configuration and saving it, then changing it back and saving it again, must have corrected the problem. My next step was going to be taking all of the virtual guests down and rebuilding the cluster from scratch.

Expert Comment

ID: 33702738
Just out of curiosity, did you refresh the configuration of the VM? When you change the settings of a highly available VM (host-clustered VM), they are not automatically updated in Failover Cluster Manager (FCM). That is because Hyper-V Manager is not cluster-aware. As a result, the HA VM configuration cached in FCM can fall out of sync with the configuration held in the XML file used by Hyper-V Manager. Refreshing the VM configuration solves this issue.
I believe that is what happened to you. When you corrected the Hyper-V network configuration, you also changed the VM configuration of the HA VM, and that change was not reflected in FCM. You might have restarted the VM, which might have renewed its configuration from the XML, thus correcting the issue. Hopefully this helps someone in the future.


Author Comment

by:Curtis McCallister
ID: 33703517
I refreshed my configuration using SCVMM. I also restarted both virtual host servers and all the virtual guests during my troubleshooting (after correcting the basic network configurations).
Do you have any thoughts on how the network configuration might have gotten damaged in the first place? This environment had been running for a few months before this event.
