msidnam asked:

Troubleshooting bandwidth issues on vSwitch and physical switch (10GbE)

It's been a while since I've had to troubleshoot vSwitch and physical switch issues, so I'm a bit rusty. Our VMware environment has two stacked 10GbE switches, and each ESXi host has two 10GbE NIC ports, one to each switch. Both NICs are active in the Teaming and Failover settings under the port group.

What I'm noticing is that if I copy a file from one server to another, I am not getting the full 10GbE. I know I won't use all of it, but during the first few seconds of the file transfer it goes to about 600 MBps, then it quickly drops to less than gigabit speed (maybe around 50-60 MBps).
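
For reference, the copy speeds quoted above are in megabytes per second while link speeds are in gigabits per second, so it helps to put both on the same scale. A minimal sketch of the conversion, assuming decimal units (1 Gb = 1000 Mb) and ignoring protocol overhead; the function names are made up for illustration:

```python
# Convert between link speed (gigabits per second) and file-copy speed
# (megabytes per second). Decimal units are assumed and protocol overhead
# is ignored, so these are upper bounds rather than expected copy speeds.

def gbps_to_mbytes_per_s(gbps: float) -> float:
    """Gigabits per second -> megabytes per second."""
    return gbps * 1000 / 8


def mbytes_per_s_to_gbps(mb_per_s: float) -> float:
    """Megabytes per second -> gigabits per second."""
    return mb_per_s * 8 / 1000


print("10GbE line rate:", gbps_to_mbytes_per_s(10), "MB/s")
print("1GbE line rate:", gbps_to_mbytes_per_s(1), "MB/s")
for observed in (600, 60, 50):
    print(observed, "MB/s is", mbytes_per_s_to_gbps(observed), "Gbps")
```

On these numbers the initial 600 MBps burst is a little under half of the 10GbE line rate, and the sustained 50-60 MBps is under half of what even a 1GbE link could carry.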

VMware Tools is up to date on all the servers. I'm not seeing any errors on the physical switches. I'm just not sure what other settings I need to check, whether on the physical switch side, the vSwitch side, or the VM itself.

I'm running Windows Server 2012 R2 on both servers sharing the files and transferring back and forth.
atlas_shuddered

Couple things:

1.  Have you checked the physical switchports themselves for errors?
2.  Have you pulled a capture from the ports and examined what's going on inside the TCP stream?

What network interface are you using in the VM?
msidnam (ASKER)

atlas,
I do not see any errors, though some of the ports do report oversized packets. I have not done a Wireshark capture, but I can.

Andrew,
I'm using VMXNET3 for the servers.
I can't speak to the VM side, but from the network side I'd be grabbing a capture. You still may not see a magic bullet, but you may find something that gets you one step closer.

I'd check for retransmissions, window size transitions, etc.
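
As a rough illustration of what those two checks look for in a capture, here is a minimal sketch that works on values already exported from a capture tool. The input layouts (a list of `(sequence_number, payload_length)` pairs for one direction of one connection, and a list of advertised window sizes from the receiving side) are hypothetical, and the retransmission check is a simple heuristic: it ignores sequence number wraparound and will also flag segments that merely arrived out of order.

```python
from typing import Iterable, List, Tuple


def count_suspected_retransmissions(segments: Iterable[Tuple[int, int]]) -> int:
    """Count data segments that end at or before the highest byte already seen.

    segments: (sequence_number, payload_length) pairs in capture order,
    all from the same sender on the same TCP connection (assumed layout).
    """
    highest_end = None
    suspected = 0
    for seq, length in segments:
        if length == 0:
            continue  # pure ACKs carry no data, so they cannot be retransmitted data
        end = seq + length
        if highest_end is not None and end <= highest_end:
            suspected += 1  # covers only bytes that were already sent once
        else:
            highest_end = end
    return suspected


def window_changes(windows: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (previous, new) pairs each time the advertised window changes."""
    changes = []
    previous = None
    for window in windows:
        if previous is not None and window != previous:
            changes.append((previous, window))
        previous = window
    return changes


# Made-up sample values, not taken from a real capture.
sample_segments = [(1000, 1460), (2460, 1460), (1000, 1460), (3920, 1460)]
sample_windows = [65535, 65535, 32768, 0, 65535]
print(count_suspected_retransmissions(sample_segments))
print(window_changes(sample_windows))
```

Many suspected retransmissions point toward loss on the path, while a receive window that shrinks toward zero suggests the receiving host cannot drain data fast enough, which is a host-side rather than a switch-side problem.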
What size MTU do you have configured on your switch ports connected to the vmware hosts?
msidnam (ASKER)

The physical switch ports and the vSwitch are configured for an MTU of 1500. I was considering changing them to 9000, but I don't know if that will impact anything else.
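
To put the 1500 versus 9000 question in numbers, here is a back-of-the-envelope sketch of how much of each frame is TCP payload at a given MTU. It assumes IPv4 and TCP headers without options, no VLAN tag, and standard Ethernet framing overhead; it only models wire efficiency and says nothing about CPU load or per-packet processing cost.

```python
# Per-frame overhead on the wire, in bytes (assumes no 802.1Q VLAN tag).
ETHERNET_OVERHEAD = 14 + 4 + 8 + 12  # header + FCS + preamble/SFD + inter-frame gap
IP_TCP_HEADERS = 20 + 20             # IPv4 + TCP headers, no options


def tcp_payload_efficiency(mtu: int) -> float:
    """Fraction of on-wire bytes that carry TCP payload for full-size frames."""
    payload = mtu - IP_TCP_HEADERS
    on_wire = mtu + ETHERNET_OVERHEAD
    return payload / on_wire


LINE_RATE_MB_PER_S = 10 * 1000 / 8  # 10GbE in decimal megabytes per second

for mtu in (1500, 9000):
    efficiency = tcp_payload_efficiency(mtu)
    print(f"MTU {mtu}: {efficiency:.1%} payload, "
          f"about {LINE_RATE_MB_PER_S * efficiency:.0f} MB/s theoretical maximum")
```

Under these assumptions the difference is only a few percent of line rate, so an MTU of 1500 on its own is unlikely to explain a drop to 50-60 MBps.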
ASKER CERTIFIED SOLUTION
Soulja

This solution is only available to members.
msidnam (ASKER)

Disk performance is something I considered but forgot to check. We use a hyperconverged solution called Maxta. It doesn't go over iSCSI, but I will do some digging to see if it's the HDDs.
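
One way to take the network out of the picture is to time a plain sequential read of a large file on the source server. The sketch below is a rough measurement, not a proper benchmark: the function name is made up, and a file that was recently read or written may be served from the operating system's cache, so the test file should be large and not recently touched.

```python
import time


def sequential_read_mb_per_s(path: str, block_size: int = 1024 * 1024) -> float:
    """Read the whole file in fixed-size blocks and return decimal MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as handle:
        while True:
            chunk = handle.read(block_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return total_bytes / 1_000_000 / elapsed


# Example call (the path is hypothetical):
# print(sequential_read_mb_per_s(r"D:\share\large_test_file.bin"))
```

If the local read rate on the source VM settles near the same 50-60 MBps seen during the copy, the storage rather than the vSwitch or physical switch is the more likely limit.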

My concern with jumbo frames would be if something doesn't handle the packets correctly. I've heard in the past that some internet sites wouldn't work. I would imagine that they should nowadays, but I didn't want to make things worse.

I do have a DR environment that's pretty much an exact duplicate of my prod site that I can probably test on. The only issue is that it's in another state, so if I mess that one up, it's a plane ride. It's always something with IT, haha.
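
If jumbo frames do get tested in the DR environment, a common check is a ping with the Don't Fragment bit set and a payload sized to fill the MTU exactly; if any hop in the path has a smaller MTU, the ping fails instead of being silently fragmented. A small sketch of the sizing, assuming IPv4; the target address is a placeholder and the helper names are made up:

```python
IPV4_HEADER = 20  # bytes, no options
ICMP_HEADER = 8   # bytes


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def windows_ping_command(target: str, mtu: int) -> str:
    # Windows ping: -f sets Don't Fragment, -l sets the payload size in bytes.
    return f"ping -f -l {max_ping_payload(mtu)} {target}"


def esxi_vmkping_command(target: str, mtu: int) -> str:
    # ESXi vmkping: -d sets Don't Fragment, -s sets the payload size in bytes.
    return f"vmkping -d -s {max_ping_payload(mtu)} {target}"


placeholder_target = "192.0.2.10"  # documentation address, replace with a real host
print(windows_ping_command(placeholder_target, 9000))
print(esxi_vmkping_command(placeholder_target, 9000))
```

Note that `vmkping` exercises the host's VMkernel interfaces, so VM-to-VM traffic would be checked with the Windows form from inside the guests.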
SOLUTION
This solution is only available to members.
msidnam (ASKER)

After speaking with Maxta, it seems I am hitting a bottleneck when reading from the HDDs. Unfortunately, I do not have any other physical servers with 10GbE that aren't ESXi hosts to do any more testing, so I will have to go with what they are saying.

Thank you.