Slower performance with Jumbo Frames

I'm having a strange issue! The network setup for this is pretty simple: a computer, a switch, a server, and a storage array.
Computer #1 is set up with jumbo frames enabled.
The switch is set up with jumbo frames enabled.
Server #1 is set up with jumbo frames enabled.

Running ping from both ends with an 8096-byte packet size, the packets are sent just as they should be.

I'm using software called Trilead to back up the VMs on Server #1 to a local drive on Computer #1. With jumbo frames enabled on Computer #1 the transfer speed is around 12 MB/sec; with jumbo frames disabled I get about 45 MB/sec. I'd have thought it would be the exact opposite effect!

Any help would be greatly appreciated!
jerrodtracyAsked:

gheistCommented:
Since you mention VMware - are jumbo frames enabled on all the virtual switches involved?

Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
It does happen; jumbo frames may or may not improve performance.

Have jumbo frames been correctly enabled on the management port and the vSwitch?

gheistCommented:
iperf is a good tool for this...

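For example, a quick TCP throughput test between the two ends might look like this (a sketch, assuming iperf is installed on both machines; the IPs are the ones from this thread):

   iperf -s                     (on the server end, e.g. the backup target)
   iperf -c 10.0.0.200 -t 30    (on the client end: a 30-second TCP test)

Compare the reported bandwidth with jumbo frames on and off; that isolates the network path from Trilead itself.
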
actnvCommented:
All devices need to support jumbo frames and have them enabled. I'd try vmkping to see where specifically it's failing. Then reset the MTU wherever it's failing to 8096.
vmkping info:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1003728

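On the ESXi side, the MTU at each layer can be checked from the shell; a minimal sketch (the vSwitch and vmk names will vary per setup):

   esxcli network vswitch standard list    (shows the MTU on each standard vSwitch)
   esxcli network ip interface list        (shows the MTU on each vmk interface)
   vmkping -d -s 8972 <target-ip>          (don't fragment, 8972-byte payload)

8972 bytes is the largest ICMP payload that fits a 9000-byte MTU once the 20-byte IP header and 8-byte ICMP header are added.
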
giltjrCommented:
When you ran ping, did you use the -f option to disable fragmentation?

On Linux, try -M do.

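For example (the addresses and payload size here are illustrative):

   ping -f -l 8972 10.0.0.200     (Windows: don't fragment, 8972-byte payload)
   ping -M do -s 8972 10.0.0.200  (Linux: prohibit fragmentation)

If anything in the path can't carry the jumbo frame, these fail instead of silently fragmenting.
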
jerrodtracyAuthor Commented:
Using PuTTY, SSH'd into my ESXi 5.5 server, I use the following command and get the following results:
~ # vmkping -d -s 8972 10.0.0.95
PING 10.0.0.95 (10.0.0.95): 8972 data bytes
8980 bytes from 10.0.0.95: icmp_seq=0 ttl=128 time=0.924 ms
8980 bytes from 10.0.0.95: icmp_seq=1 ttl=128 time=1.353 ms
8980 bytes from 10.0.0.95: icmp_seq=2 ttl=128 time=1.000 ms

From my Windows machine I use the following command and get the following results:
ping -f -l 8972 10.0.0.200

Pinging 10.0.0.200 with 8972 bytes of data:
Reply from 10.0.0.200: bytes=8972 time<1ms TTL=64
Reply from 10.0.0.200: bytes=8972 time<1ms TTL=64
Reply from 10.0.0.200: bytes=8972 time<1ms TTL=64

With these results I'm assuming that jumbo frames are passing through all devices correctly. Am I correct in thinking that? I originally had an issue with them not passing and found that one of my vmk interfaces was not set to allow jumbo frames; once I fixed it, the ICMP packets passed, but performance is horrible. Any additional thoughts would be very helpful.

Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
This is not a new discovery: sometimes jumbo frames help performance, and sometimes they make it worse.

giltjrCommented:
The results do show that jumbo frames are allowed between your ESX host and whatever 10.0.0.95 is, and between your Windows box and whatever 10.0.0.200 is.

Is 10.0.0.200 your ESX host, or is it a VM running on the ESX host?

jerrodtracyAuthor Commented:
10.0.0.200 is an ESXi host.
10.0.0.95 is a physical Windows workstation.

10.0.0.95 (physical Windows machine) > 10 ft Cat6 > Cisco SG500X switch > 10 ft Cat6 > 10.0.0.200 (ESXi host)

gheistCommented:
And if you repeat the same experiment against the backup server, i.e. the same chain where you see the slowdown?

giltjrCommented:
Do you want to use jumbo frames to the virtual hosts, or just to the ESX hypervisor?

If you want to use jumbo frames all the way through to a virtual host, you need to run the ping commands against a virtual machine's IP address, not just the ESX hypervisor's address.

Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
I'm assuming this is to speed up the backups with Trilead...

which run from the Windows 7 workstation to the ESXi host's management network.

jerrodtracyAuthor Commented:
Yes, Andrew is correct: it's to speed up backups with Trilead to the ESXi host network, not to a virtual machine. I currently have 3 VMs sitting at around 2.5 TB of data, and the backup is taking longer than I'd like. Once the initial backup is complete the incrementals won't take as long, but I would like to improve performance on the large data sets.

giltjrCommented:
O.K. You may want to try using a frame size of 4000 bytes. I don't know if it is still true, but at one time most performance tests showed that the biggest gain was going from 1500 bytes to 4000 bytes, and that once you got above 4000 there was very little gain, if any, compared to 4000 bytes.

As Andrew Hancock has stated, sometimes jumbo frames make performance worse, especially above the 8K range. Although a lot of devices support jumbo frames, when you start actually pumping a lot of data through, they don't have the buffers to keep up.

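Setting a 4000-byte MTU to test this would look roughly like the following (a sketch; the vSwitch, vmk, and interface names are assumptions, adjust for your setup):

   esxcli network vswitch standard set -v vSwitch0 -m 4000    (ESXi: vSwitch MTU)
   esxcli network ip interface set -i vmk0 -m 4000            (ESXi: vmk interface MTU)
   netsh interface ipv4 set subinterface "Local Area Connection" mtu=4000 store=persistent    (Windows)

On Windows, the NIC driver's own "Jumbo Packet" setting in the adapter's advanced properties usually has to match as well.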

Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
@giltjr raises an interesting point: buffers in the switches make a big difference here. Cheaper switches and devices, although they may support jumbo frames (defined as any frame size greater than 1500 bytes), may not have sufficient buffers or RAM to deal with the throughput.

We use dedicated storage switches from Brocade and Dell for our storage networks; if we use our HP edge switches (desktop connections), performance is poor.

jerrodtracyAuthor Commented:
Thanks giltjr & Andrew Hancock, I will look into that. It does make sense that the buffer may not be able to keep up. Andrew, do you have a recommendation for an affordable fiber switch? I currently have 3 hosts and 2 storage arrays, so I would only need something with 10-12 ports. The issue I've run into before is the cost! We are just a small non-profit, so cost is always our biggest hurdle.

Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
The issue is finding affordable storage switches; the Dell N Series are good.

gheistCommented:
A D-Link unmanaged 5-port has a 6-frame latency...
which just means I am lucky:
http://darkwing.uoregon.edu/~joe/jumbo-clean-gear.html

jerrodtracyAuthor Commented:
Thanks for all your help! I think I've found the problem. @giltjr had mentioned buffer size might be the issue, and I know my switch's buffer is big enough, so I got to thinking it could just be a cheap gigabit NIC in my PC. I ran the same test on my Mac booted into Windows and had significantly better results, hitting around 125 MB/sec. I've purchased a new network card for my PC and will give an update to confirm that it fixed it on that machine.

jerrodtracyAuthor Commented:
It wasn't the exact solution, but it got me going in the right direction!

Andrew Hancock (VMware vExpert / EE MVE^2)VMware and Virtualization ConsultantCommented:
So what was your solution, for the Experts?

Or, until we see the network card...

it's the physical switch buffers! What switches are you using?

What is the NIC in your current workstation?

gheistCommented:
Actually, ping is not a valid measurement here.
It does not detect whether the tail of a packet was chopped off and lost.
I would suggest running iperf and checking the checksum error counters (also with RX checksum offload turned off).

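On a Linux endpoint that might look like the following (a sketch; eth0 is an assumed interface name):

   ethtool -K eth0 rx off          (turn off RX checksum offload so the kernel verifies checksums itself)
   ethtool -S eth0 | grep -i err   (driver statistics, including CRC/checksum error counters)
   netstat -s | grep -i error      (protocol-level error counters)

With offload disabled, corrupted jumbo frames show up as checksum errors rather than disappearing silently in the NIC.
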
giltjrCommented:
As long as all devices in the path honor the DF bit, ping will show whether everything in the path supports jumbo frames.

Now, if some device along the path does not support the larger frames and does not honor the DF bit, then problems could arise.

gheistCommented:
Switches are not expected to look at the Layer 3 header with the DF bit, and badly broken ones may zero out, say, everything past byte 5500 of a packet. As ICMP has no checksums, and ping does not compare the sent data against the received data, a huge problem may just go unnoticed. That's why I suggest testing with TCP, which knows checksums...

giltjrCommented:
Really? ICMP has no checksum? That's funny, because according to the RFC it does:

https://tools.ietf.org/html/rfc792 - search on checksum.

Also, if you look at the header layout, it's there:

http://en.wikipedia.org/wiki/Internet_Control_Message_Protocol#Header

gheistCommented:
There is no data checksum.

giltjrCommented:
From the RFC:

"Checksum

      The checksum is the 16-bit ones's complement of the one's
      complement sum of the ICMP message starting with the ICMP Type.
      For computing the checksum , the checksum field should be zero.
      This checksum may be replaced in the future."

The ICMP message includes the data.
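
To see concretely that the data is covered, here is a minimal Python sketch of the RFC 1071 Internet checksum computed over an ICMP echo message, payload included (the function and values are my own illustration, not from the RFC):

   import struct

   def internet_checksum(data: bytes) -> int:
       # 16-bit one's complement sum over 16-bit words (RFC 1071)
       if len(data) % 2:
           data += b"\x00"  # pad odd-length messages with a zero byte
       total = 0
       for i in range(0, len(data), 2):
           total += (data[i] << 8) | data[i + 1]
           total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
       return ~total & 0xFFFF

   # ICMP echo request: type=8, code=0, checksum=0 while computing, id=1, seq=1
   header = struct.pack("!BBHHH", 8, 0, 0, 1, 1)
   payload = b"jumbo frame test data"
   print(hex(internet_checksum(header + payload)))

Change one byte of the payload and the printed checksum changes, i.e. for echo/echo-reply the checksum covers the whole ICMP message, not just the header.
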
gheistCommented:
Header Checksum

      The 16 bit one's complement of the one's complement sum of all 16
      bit words in the header.  For computing the checksum, the checksum
      field should be zero.  This checksum may be replaced in the
      future.
giltjrCommented:
You are misunderstanding what that means. It does NOT mean that the value in the header will be zero. It means that when you run the value that is in the header through the checksum formula, the result should be zero.

IP has a checksum also, but it only covers the header, not the full packet. If you read:

http://en.wikipedia.org/wiki/IPv4#Header_Checksum

you will see it has the same language about the checksum being zero, but it also walks you through the process.

Run a packet trace. Look at the IP header and the ICMP header and check the checksum fields. You will see that they are not equal to zero.

gheistCommented:
"sum of all 16 bit words in the header"

I don't read DATA anywhere.

giltjrCommented:
You need to read the whole RFC - well, not all of it, just specific parts. Go back to the RFC and search for "Message Formats." Read that part and you will find the text "Unless otherwise noted under the individual format descriptions, the values of the internet header fields are as follows:"

You will see that the name of the field is "Header Checksum" and the description says it is the checksum of all of the 16-bit words in the header. This does NOT include the data, just as you are saying.

HOWEVER, now search for "Echo or Echo Reply Message" and start reading. The first thing you should notice is that the name of the field is now just plain "Checksum" - the word "header" has been removed, guess why. Now read the description of the Checksum field for an ICMP Echo/Echo Reply message. It includes the words "ICMP message". The data is part of the message.

For some types of ICMP messages the checksum covers the data; for other types it is a header-only checksum. I would assume those types of messages have no data.

gheistCommented:
So - shout at the switch and it will upgrade from 7500-byte datagrams to 9000...