Transfer rate between ESXi hosts very slow

Dear experts,
I have a strange speed problem between two ESXi 5.1 hosts managed by a vCenter.
The machines involved are two Dell PowerEdge 2950 servers (local datastores; one has 15k disks, the other 7.2k), connected by a redundant gigabit network.

Each of them has a local datastore that runs some VMs.

If I do a file transfer between these hosts with the Datastore Browser or with backup software (VMX Explorer), the speed I reach is about 6-10 MB/sec.

You might suspect a network problem, but when I transfer a file between VMs running on different hosts, I get good performance (60-100 MB/sec). I also tried disabling the redundant NIC as a test, but the problem still remains.

You might also suspect a datastore I/O problem on one of the two hosts, but that is not the case: benchmarking each datastore gives good results.

Any suggestions?

Thanks a lot,

Andrea
Andrea_Corbo Asked:
 
giltjr Commented:
I believe that tcpdump should exist on ESXi.

You can capture the traffic and write it to a file, then transfer the file to your local computer and use Wireshark to look at the capture.

Something like:

tcpdump -s 0 -i xxxxx -w file01.cap

The character that follows -s is the number zero, and xxxxx is the name of the interface you want to capture on.
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
It's likely to be caused by the speed of reading and writing to VMFS partitions.
 
Andrea_Corbo (Author) Commented:
Maybe...
Let me add one more detail: I also have NFS shared storage (very cheap, a basic QNAP model).
To and from this NFS storage, each ESXi host reaches the maximum speed the storage can deliver (30 MB/sec).

So I remain confused...
 
Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Are jumbo frames enabled on your VMkernel port?

Are the RAID and disk types the same?
 
Andrea_Corbo (Author) Commented:
I just set the MTU on the vSwitch to 9000 (also enabled on the physical HP switch).

Now the file transfer speed from server B to server A is back to over 100 MB/sec, but if I do the inverse operation, from server A to server B (copying the same file), the speed is slow again, at 10 MB/sec...

Both servers use RAID 6.

Thanks a lot for your support.
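
One way to check that jumbo frames really pass end to end in each direction is a ping with the don't-fragment bit set and a payload sized to fill the MTU exactly. The payload is the MTU minus the IPv4 header (20 bytes) and the ICMP header (8 bytes). A minimal sketch of that arithmetic in Python, assuming IPv4 without options (the function name is only illustrative):

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


for mtu in (1500, 9000):
    print(f"MTU {mtu}: largest unfragmented ping payload = {max_ping_payload(mtu)} bytes")
```

On ESXi this would translate to something like `vmkping -d -s 8972 <other host's VMkernel IP>`, run from each host toward the other. If 8972 fails in one direction while 1472 works, something on that path is probably not passing jumbo frames.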
 
giltjr Commented:
How are you doing the file transfer?

Do a packet capture for just a few seconds in each direction.  Verify that jumbo frames are being used in both directions.  If the file transfer method uses TCP, verify the window size is the same, or close, in both directions.
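
If Wireshark is not at hand, the two checks above (largest frame and TCP window per direction) can be roughly approximated with a short script. This is only a sketch under several assumptions: the capture is in the classic pcap format (not pcapng), the frames are untagged Ethernet II carrying IPv4, and the function name `summarize_pcap` is made up for illustration. The window it reports is the raw header field, so any window scaling negotiated in the handshake is not applied.

```python
import struct
import sys
from collections import defaultdict


def summarize_pcap(data: bytes) -> dict:
    """Per-direction stats for IPv4/TCP packets in a classic pcap capture."""
    magic = data[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # little-endian capture file
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # big-endian capture file
    else:
        raise ValueError("not a classic pcap file")

    stats = defaultdict(lambda: {"packets": 0, "max_frame": 0, "max_window": 0})
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        # Record header: ts_sec, ts_frac, captured length, original length
        _sec, _frac, incl_len, orig_len = struct.unpack_from(endian + "IIII", data, offset)
        frame = data[offset + 16 : offset + 16 + incl_len]
        offset += 16 + incl_len

        # Need Ethernet (14) + minimal IPv4 (20) + minimal TCP (20); EtherType 0x0800 = IPv4
        if len(frame) < 54 or frame[12:14] != b"\x08\x00":
            continue
        if frame[23] != 6:  # IPv4 protocol field, 6 = TCP
            continue
        ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
        tcp = 14 + ihl                # start of the TCP header
        if len(frame) < tcp + 16:
            continue

        src = ".".join(str(b) for b in frame[26:30])
        dst = ".".join(str(b) for b in frame[30:34])
        (window,) = struct.unpack_from("!H", frame, tcp + 14)  # raw, unscaled window

        entry = stats[(src, dst)]
        entry["packets"] += 1
        entry["max_frame"] = max(entry["max_frame"], orig_len)
        entry["max_window"] = max(entry["max_window"], window)
    return dict(stats)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        for (src, dst), s in summarize_pcap(fh.read()).items():
            print(f"{src} -> {dst}: {s}")
```

Run it against the capture file from each direction. If one direction never shows frames much larger than about 1514 bytes while the other shows about 9014, jumbo frames are only working one way.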
 
Andrea_Corbo (Author) Commented:
Correction to my previous post: the speed problem persists.

I have tested for many hours (iperf, sqlio, etc.), but every host-to-host transfer is slow (vMotion, clone, vSphere Replication, copy/paste from the Datastore Browser), ranging from 5-6 MB/sec to 15 MB/sec.

On Monday I'm going to try new switches.

I will update you. Thanks for now...
 
giltjr Commented:
What is the RTT if you ping the hosts from each other?
 
Andrea_Corbo (Author) Commented:
Ping from host 10.1.1.204 to 10.1.1.206 (Dell server):

PING 10.1.1.206 (10.1.1.206): 56 data bytes
64 bytes from 10.1.1.206: icmp_seq=0 ttl=64 time=2.950 ms
64 bytes from 10.1.1.206: icmp_seq=1 ttl=64 time=0.620 ms
64 bytes from 10.1.1.206: icmp_seq=2 ttl=64 time=0.371 ms

Bear in mind that a backup job and VMware replication are currently running between these hosts...
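
The RTT matters because a single TCP stream can have at most one window of data in flight per round trip, so its throughput is bounded by window / RTT. A rough sketch of that bound for the three ping times above, assuming a 65,535-byte window (the largest possible without TCP window scaling; the real window on these hosts is not known from this thread):

```python
def max_tcp_throughput_mb_s(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream TCP throughput: one window per round trip."""
    return window_bytes / (rtt_ms / 1000.0) / 1_000_000


WINDOW = 65_535  # assumed window size in bytes, not measured on these hosts

for rtt_ms in (0.371, 0.620, 2.950):
    bound = max_tcp_throughput_mb_s(WINDOW, rtt_ms)
    print(f"RTT {rtt_ms} ms -> at most {bound:.0f} MB/s per stream")
```

With sub-millisecond RTTs the bound is above gigabit line rate, so latency alone would not explain 10 MB/sec unless the effective window is much smaller than assumed here. That is one thing a packet capture can confirm.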
 
giltjr Commented:
So no obvious errors on the pings.  A packet capture, just a few seconds in each direction, may show what is going on.

What is the CPU utilization like?  If the CPUs are being maxed out, it will affect network transfer rates.
 
Andrea_Corbo (Author) Commented:
CPU and RAM usage on both servers is very low. About the packet capture: is there something built into the ESXi host console, or should I mirror the Ethernet port on my switch and capture the data there?

thanks
 
Andrea_Corbo (Author) Commented:
This is a good idea. I will run these tests on Friday and then update you.  This morning I spoke with a good VMware technician, and he too was a bit surprised...

Bye bye
Andrea
 
Andrea_Corbo (Author) Commented:
Hello,
On Friday I could not get to the datacenter. I should be able to go tomorrow afternoon. I will keep you updated.

Thank you.
 
Andrea_Corbo (Author) Commented:
Hello guys,
Today I tested the networking a lot, including with a new switch, and the issue between these two hosts persists.
Tomorrow I will reinstall ESXi 5.1 on one host, then I'll do the packet capture as suggested by giltjr.

bye bye
 
Andrea_Corbo (Author) Commented:
Hello everybody,
Nothing has emerged from the recent tests.
Yesterday I opened an incident with VMware.
When the issue is resolved I will let you know.

thanks,
good day,
andrea
 
giltjr Commented:
Did the packet capture show a long delay anyplace?  Of course here "long" is relative, instead of 0.05 ms it might be 0.1ms.
 
Andrea_Corbo (Author) Commented:
Hi there!
The only thing VMware support found is that packets are being split with an MTU of 60. However, the problem does not come from the switch.  Next week I'm going to reinstall the hypervisor on that server, and we'll see...

I'll keep you informed.

bye bye Andrea

PS: I am very disappointed with the support received from VMware.
 
giltjr Commented:
Using an MTU of 60 is going to cause some serious performance problems.

I would start by looking at all the Ethernet interfaces to see if you can find one with the MTU or Ethernet frame size set really low.

It could be that somebody meant to set it to 6000 and made a typo.
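
To put a number on how much an MTU of 60 hurts: every packet still carries the IPv4 and TCP headers plus the per-frame Ethernet overhead, so very little of each frame is payload. A back-of-the-envelope sketch, assuming a 1 Gbit/s link, IPv4 and TCP headers without options, and 38 bytes of Ethernet overhead per frame (preamble, header, FCS and inter-frame gap):

```python
ETH_OVERHEAD = 38       # bytes per frame: preamble/SFD 8 + header 14 + FCS 4 + inter-frame gap 12
IP_TCP_HEADERS = 40     # bytes: IPv4 20 + TCP 20, assuming no options
LINE_RATE_MB_S = 125.0  # 1 Gbit/s expressed in MB/s


def tcp_goodput_mb_s(mtu: int) -> float:
    """Best-case TCP payload rate on the wire for a given MTU."""
    payload = mtu - IP_TCP_HEADERS
    if payload <= 0:
        raise ValueError("MTU too small to carry any TCP payload")
    return LINE_RATE_MB_S * payload / (mtu + ETH_OVERHEAD)


for mtu in (60, 1500, 9000):
    print(f"MTU {mtu}: at most {tcp_goodput_mb_s(mtu):.1f} MB/s of payload")
```

Under these assumptions an MTU of 60 caps the payload rate at roughly a fifth of line rate before counting the per-packet CPU cost, which in practice would push the real figure lower still.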
 
Andrea_Corbo (Author) Commented:
In any case, we are going to set up ESXi again, because we have lost too much time.

thank you all for the support....

closing the post

I hope the problem will go away!!!
 
giltjr Commented:
I hope it goes away too.  If it does not, then you need to start looking at other network equipment.

If the hosts are in the same IP subnet/VLAN then it may be an ESXi issue.  If they are in different IP subnets/VLANs then start looking at any/all routers/L3 devices.

Good Luck!
 
Daniel J. Garcia Commented:
I am pretty sure that ESXi deliberately limits transfer speed from the shell. When I use my own C program to copy data over a socket, it reaches sustained speeds of 70-80 MB/s. After a few tries the speed starts to drop until it gets stuck at 10 MB/s.
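
A socket-only test like the one described is useful because it takes the datastores out of the picture. The C program itself is not shown in this thread, so the following is only an illustrative Python sketch of the same idea (similar in spirit to iperf): it pushes zero-filled buffers over a TCP connection and reports the rate. It runs sender and receiver on the loopback interface just to stay self-contained; for a real test the two halves would run on different machines.

```python
import socket
import threading
import time


def measure_loopback_throughput(total_bytes=64 * 1024 * 1024, chunk=64 * 1024):
    """Send total_bytes over a local TCP connection; return (bytes received, MB/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    received = [0]

    def sink():
        # Receiver: read and discard until the sender closes the connection
        conn, _addr = server.accept()
        with conn:
            while True:
                data = conn.recv(chunk)
                if not data:
                    break
                received[0] += len(data)

    receiver = threading.Thread(target=sink)
    receiver.start()

    buf = b"\0" * chunk
    start = time.perf_counter()
    with socket.create_connection(("127.0.0.1", port)) as client:
        sent = 0
        while sent < total_bytes:
            n = min(chunk, total_bytes - sent)
            client.sendall(buf[:n])
            sent += n
    receiver.join()  # wait until everything has been read
    elapsed = time.perf_counter() - start
    server.close()
    return received[0], received[0] / elapsed / 1_000_000


if __name__ == "__main__":
    nbytes, rate = measure_loopback_throughput()
    print(f"received {nbytes} bytes at {rate:.0f} MB/s")
```

If a memory-to-memory test between the hosts runs at near line rate while datastore copies stay slow, the network path is probably fine and the bottleneck is more likely in how the copy itself is performed.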
Question has a verified solution.