Poor network throughput in Hyper-V guests

Posted on 2015-01-06
Medium Priority
Last Modified: 2015-03-12
Here's the situation...

3 x Windows 2012 R2 hyper-v host machines.

Each host has 4 x Gb NICs connected as follows:

Host A
nic1 - host management subnet
nic2 - vm subnet1
nic3 & 4 - vm subnet2 (teamed)

Host B
nic1 - host management subnet
nic2, 3, 4 - vm subnet2 (teamed)

Host C
nic1 - host management subnet
nic2, 3, 4 - vm subnet3 (teamed)

The teamed NICs are aggregated on a layer 3 switch using LACP and set up in Windows using LACP with Dynamic load balancing.

The hosts and switch are not in production - so there is no/negligible background data transfer taking place.

If I copy data between the POSE on any two hosts, transfer speed is around 1Gbps - which is what I would expect as the single NICs connected to the management subnet would be used.

If I copy data between a VM on host B and a VM on host C I would expect speeds of >1Gbps given that LACP and dynamic load balancing are being used across 3 x Gb teamed nics.  However, the transfer speed is very erratic and jumps up and down from 0 to 24Mbps.  Pinging between the same VMs produces equally erratic results - 1ms, 180ms, 5ms etc.

Similar results are gained when copying between VMs on host A and host B that are on the same subnet.

If I copy data between VMs on the same host using the same subnet (which as far as I'm aware should never actually reach the physical switch and so be very fast) - speed is around 150Mbps.

Initially this appeared to me to be something to do with the NIC teaming.  However, if I copy data from a VM on host A connected to subnet 1 using a single NIC I still only get the slow/erratic speeds.  That said, this particular VM is multi-homed, with subnet 2 using the teamed NICs - so it could still be related to teaming.

All VMs are gen1 using synthetic NICs.

Any idea what is going on?
Question by:devon-lad
LVL 59

Accepted Solution

Cliff Galiher earned 2000 total points
ID: 40534531
Regarding copying data between two VMs on the same host, one thing none of your description covers is disk configuration or VM VHD placement. A solid 150MB/s may be perfectly normal if the server has reached disk I/O saturation because of the copy from and to the same physical spindles, number of disks, etc.
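Note on units: the question quotes around 150Mbps (megabits per second) for the same-host copy, while the figure above is written as 150MB/s (megabytes per second). The two differ by a factor of eight, so it is worth confirming which one the copy dialog is actually showing. A minimal Python sketch of the conversion (the function names are made up for illustration):

```python
def mbps_to_megabytes_per_s(mbps: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbps / 8


def megabytes_per_s_to_mbps(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


if __name__ == "__main__":
    # 150 Mbps expressed in MB/s
    print(mbps_to_megabytes_per_s(150))   # 18.75
    # 150 MB/s expressed in Mbps (i.e. more than a single 1Gb link)
    print(megabytes_per_s_to_mbps(150))   # 1200
```

If the figure really is 150MB/s, that is roughly 1.2Gbps; if it is 150Mbps, it is under 19MB/s.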

As far as the rest, I would be inclined to think it isn't teaming, but something closer to hardware. *MOST* switches' LACP implementations leave a lot to be desired and more often than not kill performance. I only recommend LACP with high end Cisco and Procurve switches and, even at that, only in specific implementations. The cost of a high end core switch tends to be so high that you can get more bang for your buck just by going to 10Gb NICs and switches for a smaller network.

If these are Broadcom NICs, go in and disable VMQ. Windows doesn't use VMQ at 1Gb speeds anyways, but a persistent driver bug still kills performance on Broadcoms. Although usually that'd surface in your POSE copy tests too. Still worth pointing out.
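For anyone wanting to script the VMQ change, here is a rough Python sketch that builds (and optionally runs) the PowerShell command for each adapter. It assumes the hosts have the standard NetAdapter cmdlets (`Disable-NetAdapterVmq`), and the adapter names "NIC2", "NIC3", "NIC4" are placeholders, not the names on the hosts in the question. It defaults to a dry run that only prints the commands.

```python
import subprocess


def vmq_disable_command(adapter_name: str) -> list[str]:
    """Build a PowerShell invocation that disables VMQ on one adapter."""
    # Double any single quotes so the name is safe inside a
    # PowerShell single-quoted string.
    safe = adapter_name.replace("'", "''")
    return [
        "powershell.exe",
        "-NoProfile",
        "-Command",
        f"Disable-NetAdapterVmq -Name '{safe}'",
    ]


def disable_vmq(adapter_names, dry_run=True):
    """Return the commands; run them only when dry_run is False."""
    commands = [vmq_disable_command(name) for name in adapter_names]
    if not dry_run:
        for cmd in commands:
            # Needs an elevated session on the Hyper-V host.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder adapter names; substitute the real ones from the host.
    for cmd in disable_vmq(["NIC2", "NIC3", "NIC4"]):
        print(" ".join(cmd))
```

Running `Get-NetAdapterVmq` on the host before and after should show whether the setting took effect.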
LVL 88

Expert Comment

ID: 40534599
You don't tell us anything about the guests. What OS are they? Are the Integration Services installed? Newer Microsoft OSes have those included, but older ones don't. Besides, it is always a good idea to install them manually, even if they are included with the OS. OS updates can also help. With non-Microsoft OSes they won't be installed by default, so there you must do that manually anyway.

Author Comment

ID: 40535108
Disk configuration -  it's all on a fibre channel SAN - slowest component is 6Gbps.  If I do a copy from one host to itself there are no speed issues - only when using the VMs.
Switch - it's an HP 1910 switch - would you see these as falling under the category of "LACP implementations leave a lot to be desired" ?
NICs - yes these are Broadcom ones, I had wondered about VMQ.  Surely a VMQ issue is related to VMs only and wouldn't affect copies between physical hosts?  I will try disabling to see if there's any effect.

rindi - guests are all Win 2012 R2 with integration services installed and all updates.

Author Comment

ID: 40535201
Cliff - after disabling VMQ on all NICs there is a marked improvement in transfer speeds.  Still slightly erratic - but getting near 1Gbps most of the time.  The LACP teamed NICs don't appear to be getting anything over this though.

Author Comment

ID: 40535272
Ah, hang on. I seem to remember that no single transfer process will ever get more than the maximum speed of a single NIC.  The only way to get more is to have more than one process transferring data.  Is that right?
LVL 59

Expert Comment

by:Cliff Galiher
ID: 40535860
That depends on the LACP implementation (again). Most LACP switches load balance based on a hash of packet data, which makes the balancing per-stream/flow. So getting above the speed of a single NIC requires multiple flows. Higher end switches have smarter algorithms though, so getting better throughput even with a single stream is possible. But if you are seeing an upper limit of 1Gb then you are probably dealing with a basic LACP balancer.
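To illustrate the per-flow behaviour, here is a toy Python simulation. It is not the hash any particular switch or the Windows team actually uses (CRC32 over the address/port tuple is just a stand-in), and the IP addresses and ports are made up. It only shows why one flow stays on one team member while many flows spread out.

```python
import zlib
from collections import Counter


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links):
    """Map a flow to one link in the team via a hash of its addresses/ports."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_links


def links_used(flows, n_links):
    """Count how many packets land on each link, one packet per flow entry."""
    return Counter(pick_link(*flow, n_links) for flow in flows)


# One file copy = one flow: every packet carries the same tuple,
# so every packet hashes to the same link.
single_flow = [("10.0.2.10", "10.0.3.10", 50000, 445)] * 1000

# Many parallel transfers = many source ports = many different hashes.
many_flows = [("10.0.2.10", "10.0.3.10", 50000 + i, 445) for i in range(1000)]

if __name__ == "__main__":
    print("single flow:", dict(links_used(single_flow, 3)))
    print("many flows: ", dict(links_used(many_flows, 3)))
```

The single flow ends up entirely on one of the three links, which matches the roughly 1Gbps ceiling seen in the test copy; only the multi-flow case has a chance of using the rest of the team.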

Author Comment

ID: 40660619
Cliff - I have a follow-on question if you're able to take a look.


Question has a verified solution.
