Solved

Hyper-V vNIC not responding

Posted on 2015-01-08
Last Modified: 2015-01-13
Hello guys, I have a problem that I've been pulling my hair out over for quite some time.

What we have is a Windows Failover Cluster set up across a number of blade servers. Each server runs Windows Server 2012 R2 with the Hyper-V and Failover Clustering roles installed. Each server has 4 physical network interfaces: two HP NC373i Integrated and two HP NC373m Mezzanine. Only 2 of the 4 physical NICs are connected to our switch at this time; they are teamed through Windows in switch-independent mode. On top of this we have eight virtual NICs connected to a Hyper-V virtual switch:

Access/Management, Cluster network, Migration network, Replica network, and four SMB data transfer networks (for accessing VHDs on a storage server)

We have each virtual NIC on a separate VLAN and they all have statically assigned IP addresses. Occasionally, one of the vNICs will stop working and we will lose Live Migration on that blade, or it may lose communication with the cluster depending on which virtual interfaces have failed.

I've tried updating the drivers on both sets of physical NICs, reflashing the firmware, and turning certain subsystems such as VMQ or RSC on and off, but nothing has solved this. The interesting thing to note is that toggling VMQ on or off sometimes causes the affected NICs to start responding again, but only for a limited time. I should mention that nowhere in Network Connections does it state these NICs are malfunctioning or disconnected; it does, however, show up in Event Viewer as a clustering failure.

Edit: when I say the NIC is not responding, I mean the other hosts cannot ping it even though they should be able to. Yes, the firewall is off.
Question by:Lumenix
10 Comments
 
LVL 57

Assisted Solution

by:Cliff Galiher
Cliff Galiher earned 250 total points
ID: 40538613
1)  Disable VMQ. There is no use for it on gigabit adapters.

2) Grab updated Broadcom drivers. HP has sadly been very slow at updating drivers, and Broadcom is (also sadly) regularly subpar on driver quality. Combine the two, and you have a bad Broadcom driver that they have (probably) fixed but that HP hasn't rebranded and re-released yet. Personally, I'd go with Intel, but depending on the blades you chose, that may not be an option.

3) Make sure you've implemented reasonable QoS settings on your various vNICs. Otherwise the virtual switch won't prioritize packets, and you can eventually end up with a vNIC being starved, even after load has returned to "normal" low levels. That's the nature of dynamic teaming and running a converged network. QoS is mandatory in such a setup to ensure no single vNIC can crash the others.
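
As a rough illustration of point 3, minimum bandwidth weights can be assigned to the host vNICs with the Hyper-V PowerShell module. This is only a sketch: the vNIC names and weight values below are assumptions, not values from this cluster, and it assumes the virtual switch was created with New-VMSwitch -MinimumBandwidthMode Weight, since the mode is chosen when the switch is created.

```powershell
# Assumption: the vSwitch was created with -MinimumBandwidthMode Weight.
# vNIC names and weights are hypothetical; adjust them to your own layout.
Set-VMNetworkAdapter -ManagementOS -Name "Management"    -MinimumBandwidthWeight 10
Set-VMNetworkAdapter -ManagementOS -Name "Cluster"       -MinimumBandwidthWeight 10
Set-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -MinimumBandwidthWeight 30

# Review the weight currently assigned to each host vNIC
Get-VMNetworkAdapter -ManagementOS |
    Select-Object Name, @{ Name = 'Weight'; Expression = { $_.BandwidthSetting.MinimumBandwidthWeight } }
```

The weights are relative shares that only matter under contention, so an idle vNIC does not hold bandwidth back from the others.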

-Cliff
 

Author Comment

by:Lumenix
ID: 40538907
Thanks for the suggestion, Cliff. I've tried turning off VMQ on my NICs using this command:

Get-NetAdapterVmq | Disable-NetAdapterVmq

This hasn't provided a permanent fix, unfortunately. I've also removed all vNICs and recreated them; the same ones still do not reply to pings afterwards. I also checked and updated the Broadcom drivers and reflashed the firmware on all four NICs, followed by a reboot. Still the same problem.
 
LVL 38

Expert Comment

by:Philip Elder
ID: 40538968
With Cliff. Broadcom requires VMQ to be disabled on _all_ physical NIC ports that run at Gigabit speeds.

Check to see if there is a firmware update for the NICs as well.

In this scenario we would:
 Team 1: Port 0 on each: Management (VLAN for services if required)
 Team 2: Port 1 on each: vSwitch (not shared with OS) (VLAN for VMs via Hyper-V vNIC Properties)

Philip
 

Author Comment

by:Lumenix
ID: 40544359
Thanks for the suggestions. I have checked, and it seems VMQ is not enabled on the NIC team, but the problem persists. Is there some way to disable it other than through PowerShell?

I've managed to work around the problem by using a single NIC instead of a teamed one, but then we lose fault-tolerant networking to the blade. I'm trying to see if I can use the Broadcom utility (BASC) to configure a NIC team and whether the problem persists there. If any of you have further suggestions, please let me know!
 
LVL 38

Expert Comment

by:Philip Elder
ID: 40544707
Not the team. The ports.

Click Start --> ncpa.cpl --> pNIC Properties --> Advanced --> Virtual Machine Queues (VMQ) --> Set DISABLED.

Do that for all physical NICs.
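
For hosts without a GUI, roughly the same thing can be done from PowerShell. This is a sketch, assuming the NetAdapter module that ships with Windows Server 2012 R2. Adapters whose driver does not support VMQ may simply not appear in the first listing, and disabling VMQ may briefly restart the adapter.

```powershell
# Show the VMQ state of every adapter that reports VMQ support
Get-NetAdapterVmq | Format-Table Name, InterfaceDescription, Enabled

# Disable VMQ on each physical port (the ports, not the team interface)
Get-NetAdapter -Physical | ForEach-Object {
    Disable-NetAdapterVmq -Name $_.Name
}

# Check whether the driver exposes the VMQ advanced property at all;
# no output suggests the option is not offered by that driver
Get-NetAdapterAdvancedProperty -Name "*" -RegistryKeyword "*VMQ" -ErrorAction SilentlyContinue
```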
 

Author Comment

by:Lumenix
ID: 40545346
Interestingly enough, that option isn't there. The NIC models are HP NC373i and NC373m. On our HP G6 blades, which have Broadcom BCM57711e 10GbE NICs, the option is there, but I have not had the VMQ issue on those blades yet.
 
LVL 38

Expert Comment

by:Philip Elder
ID: 40545359
10GbE works fine with VMQ. It is on 1Gb connections that things get munged.
 

Author Comment

by:Lumenix
ID: 40547004
Alright, I've done a fresh OS install and configured the vNICs on top of a BASC team instead of Windows software teaming. Everything works fine for now; I may update later on for those who stumble across this post in the future. I do have one more thing to ask, however. It looks like the physical NICs I'm using do not support VMQ anyway, since there is no option to turn it on or off. However, when I run Get-NetAdapterVMQ, it shows the NIC team (BASC) as using VMQ...

When I try to disable it in PowerShell, it tells me it cannot set the property to disabled. Any ideas?
 
LVL 38

Accepted Solution

by:Philip Elder
Philip Elder earned 250 total points
ID: 40547065
The Broadcom management software may expose those settings.

If the actual physical NIC port does not show them then perhaps they are not supported at all as you say. If that is the case then the OS setting should be meaningless anyway.
 

Author Closing Comment

by:Lumenix
ID: 40547388
Wasn't the actual solution, but it was very valuable info for this issue. Thanks, guys.