
Bad performance over SFP+ connection

Konstantin Krolikov asked:
There are two servers located in the same subnet.

Server 1
Name: Storage-01
Type: Dell T430
Network card: Intel(R) 10G 2P X520 Adapter
192.168.50.240 (10Gb)
192.168.50.248 (1Gb)

Server 2
Name: Hyper-V-01
Type: Dell T430
Network card: Intel(R) 10G 2P X520 Adapter
192.168.50.253 (10Gb)
192.168.50.251 (1Gb)

Both are connected to the SFP+ ports of a Dell N1548 switch with 5-meter Dell-branded SFP+ DAC cables.

But when I copy a file or run iperf between them, performance is really bad:
about 16 MB/s (~136 Mbit/s).

When I copy any file from any workstation, I get about 113 MB/s as expected, i.e. the full 1Gb connection of a workstation attached to the same Dell switch.

I thought I had a switch or cable issue, so I took a brand new cable, connected the servers peer-to-peer, configured them in a different subnet (50.0.0.1 and 50.0.0.2), and ran the same test, but got almost the same result:
about 17.2 MB/s (~150 Mbit/s).

Maybe some configuration is wrong?
Maybe there is an issue with the network cards?
andyalder, Top Expert 2014

Commented:
What happens if you connect them directly together rather than going through the switch?

BTW, it would be faster and more reliable if you just put the disks in server 2!
David Favor, Fractional CTO
Distinguished Expert 2018

Commented:
andyalder voiced my suspicion.

Likely your switch is set to a slow manual speed, rather than auto-negotiating the highest speed.

Or... one or both of the machines involved are set to a slow manual speed.

Best to ensure all your switch ports and each machine's physical interfaces are set to auto-negotiate the fastest speed possible.
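
On the Windows side, something like this shows the negotiated speed and the driver-level speed/duplex setting (just a sketch; the exact "Speed" display name varies between Intel driver versions):

    # Negotiated link speed per adapter
    Get-NetAdapter | Format-Table Name, InterfaceDescription, LinkSpeed, Status

    # Driver-level speed/duplex setting (display name depends on the driver)
    Get-NetAdapterAdvancedProperty -Name * |
        Where-Object DisplayName -like "*Speed*" |
        Format-Table Name, DisplayName, DisplayValue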

Author

Commented:
That is how I got the slightly better result of about 150 Mbit/s:

A new 10Gb SFP+ DAC directly from server 1 to server 2, with a 0.5 m cable.

Author

Commented:
I checked that.
In "show interfaces status" I see the ports as 10000, and in Windows Server 2012 R2 the network adapter shows 10Gb on both servers.

I even tried changing to different ports.

I was thinking this might be related to some BIOS settings or network card advanced settings like RSS or jumbo frames, but I tried a lot of variations on both sides and the result looks the same.
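
For reference, this is roughly what I checked on the Windows side (a sketch; the jumbo-frame display name also differs between driver versions):

    # RSS state per adapter
    Get-NetAdapterRss | Format-Table Name, Enabled

    # Jumbo-frame setting in the driver
    Get-NetAdapterAdvancedProperty -Name * |
        Where-Object DisplayName -like "*Jumbo*" |
        Format-Table Name, DisplayName, DisplayValue

    # Effective MTU as the OS sees it
    netsh interface ipv4 show subinterfaces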
Top Expert 2014

Commented:
The new bottleneck may be storage if it is hard disks rather than SSDs.

Author

Commented:
🤔 I'm not sure about that, for a few reasons:
1. I get much better performance when copying files from both servers to any workstation.
2. The array is a RAID 5 of 6 SAS disks.

But maybe it is a good idea to create a virtual RAM disk on both servers, share them, and copy files between the servers while avoiding the disks.
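
For what it's worth, iperf already keeps the disks out of the picture, since it streams from memory to memory. The test between the two 10Gb addresses looks roughly like this (a sketch; -P just adds parallel TCP streams in case a single stream can't fill the link):

    # On Storage-01 (192.168.50.240): start the iperf server
    iperf -s

    # On Hyper-V-01: run for 30 seconds with 4 parallel streams
    iperf -c 192.168.50.240 -t 30 -P 4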

Author

Commented:
An interesting fact: I did a few additional tests and found something weird. It looks like this is a combination of a few issues.
1. I checked connectivity from the Storage server to the Hyper-V server over the 1Gb connection (through the switch), and they are limited to the same performance: 16 MB/s (~140 Mbit/s).
2. But when I rebooted the switch, throughput went back to normal for less than a minute and then fell again to the same level (see the counter sketch below).
3. Even after the performance degradation between the servers, communication between the workstations and both servers still runs at 1Gb.
(Attached screenshots: Instant-Degradation-1GB-to-100MB.png, Instant-Degradation--2--1GB-to-100MB.png)
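
To catch the "normal for less than a minute, then drops" behaviour I watched the interface counters during a copy, roughly like this (a sketch using the standard Windows "Network Interface" counters; the instance names in the output will match your adapters):

    # Sample received bytes/sec on every interface, once a second for two minutes
    Get-Counter -Counter "\Network Interface(*)\Bytes Received/sec" -SampleInterval 1 -MaxSamples 120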
Top Expert 2014

Commented:
The 10GbE ports on the switch are described as uplink ports, so it may not be valid to connect them to anything other than another switch.

Author

Commented:
That answer doesn't hold up, because switches are expected to exchange data between themselves at 10Gb; that is exactly why the SFP+ ports exist.
More than that, this switch is stackable, so imagine a stack of four of these switches connected with 10Gb cables. Do you really think traffic between them would not reach 10Gb/s?
And furthermore, why would performance drop to 150 Mbit/s and not to 1Gb? That would be a two-level jump, which a switch can't do on its own: from 10Gb to 1Gb and then to 100Mb. For example, on a 40Gb switch a port can fall back to 10Gb, but not directly to 1Gb.

After Wireshark analysis the root cause of the issue was found.
It turns out a Group Policy with IPsec had been pushed to the servers. The servers were encrypting all communication between themselves, but the workstations in this subnet were not using IPsec, which is why there was no performance degradation to any of the workstations.
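
For anyone who hits the same thing, the IPsec involvement can be confirmed on the servers roughly like this (a sketch; in Wireshark the traffic shows up as ESP instead of plain TCP/SMB):

    # Any active IPsec security associations on this server?
    Get-NetIPsecMainModeSA
    Get-NetIPsecQuickModeSA

    # The same information via netsh
    netsh advfirewall monitor show mmsa
    netsh advfirewall monitor show qmsa

    # Which GPOs are applied (run elevated to see the computer part)
    gpresult /r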
Top Expert 2014

Commented:
I was more interested in why you had separated your server from its storage in the first place.

Author

Commented:
There are always questions like this when you don't know the system design and can't see the whole architecture.
Top Expert 2014

Commented:
Well, at a guess I would think you are planning to use two or more servers with some form of storage replication to provide shared storage for two or more Hyper-V hosts. That is a very inefficient use of hardware, as the storage servers' CPUs and RAM are underutilised, as are the disk subsystems in the hosts; it is cheaper and sometimes faster to use a hyperconverged approach.

Author

Commented:
Of course, in that case I would have gone with NetApp or EMC and some low-latency 10Gb switch :-) (or even a Dell VRTX / FX chassis).
But as I said, it is a different design...
;-)
Top Expert 2014

Commented:
You do not have to buy something like VRTX to be hyperconverged; a simple two-node cluster with StarWind or even MS storage replication is hyperconverged, so long as each physical node does both the storage and the host functions.