ESXi to Synology NAS iSCSI slowdown out of nowhere

huntson
Out of nowhere I'm experiencing incredibly slow connections between my ESXi hosts and my Synology NAS. I have 3 hosts: 2 Mac Minis running in HA and a Dell server. My Synology NAS is on the latest firmware with 3 of the links link-aggregated. I noticed while copying files to one of my VMs over the network (the VM had a share going over AFP) that I was getting maybe 12 MB/s. Then I noticed some of my other systems were quite sluggish. I tried creating a new iSCSI LUN with some drives that have not been used on the NAS. While it did show up on my ESXi hosts and I was able to format it, migrating to it has taken too long and keeps timing out. I've tried restarting the NAS units and the hosts. All drives in the NAS appear to be healthy. No configuration changes have been made for at least a month or two, but I did check all settings in ESXi and on my network switch, to no avail.

Any ideas?
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
any configuration changes? are you using jumbo frames?

have you checked that all equipment is configured for jumbo frames?
Britt ThompsonSr. Systems Engineer
Top Expert 2009

Commented:
Are you seeing any packet loss? I'd check your switch configuration to make sure it's still intact and confirm 100% that nothing has grabbed the same IP as the NAS. I had a very similar issue with a customer who could never get their NAS working properly; it turned out the same IP was assigned to a rogue device.

Author

Commented:
Thanks for the quick reply.  As was stated in my second to last sentence - no changes were made in at least two months.

I am using jumbo frames and they are enabled everywhere.  I tried turning them off but the Synology Units won't allow me to without breaking apart the cluster and then putting it back together - not something I am keen on doing seeing as this all worked fine and FAST a few days ago.

Author

Commented:
I'll have to shut down the NAS to test and see if I can ping anything else on the network.  I doubt that's the case but I'll get back to you shortly.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
time for some vmkpings from your hosts to your NAS

can you screenshot your  networking on the hosts

not upgraded DSM at all ?

Author

Commented:
There is no other unit on the network with those IPs.

Author

Commented:
I have not upgraded DSM on the server in about a month or two, as I stated. Attached are some screenshots from my Dell system. I'll admit I'm not the most experienced ESXi guy, so can you explain vmkpings please?
shot1.PNG
snip-2.PNG
Britt ThompsonSr. Systems Engineer
Top Expert 2009

Commented:
Although no changes have been made by hand, I've seen switches lose configuration data in various situations. Confirming the switch configs is definitely the way to go. In ESXi, make sure all the paths to the NAS are live.

Author

Commented:
Indeed I have checked. All the ports are set for jumbo frames, the LAGs are up, and VLANs are all flat.
Britt ThompsonSr. Systems Engineer
Top Expert 2009

Commented:
Is there any packet loss when you start a running ping to the NAS?

Author

Commented:
Is this the vmkping run from an ssh terminal on the host?
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
yes, it's the equivalent of ping, but it's a special version which sends traffic via the vmkernel port groups.

so it provides a more valid test of reaching the SAN
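
For anyone following along, here is a minimal sketch of how a vmkping test for jumbo frames could be put together. It assumes IPv4 and a 9000-byte MTU, and the VMkernel interface name (vmk1) and NAS address (192.168.1.50) are made-up placeholders, not values from this thread. The options used (-I interface, -d don't fragment, -s payload size, -c count) are vmkping options; the helper functions themselves are hypothetical.

```python
# Sketch: build a vmkping command line that tests whether jumbo frames
# pass end to end. vmk1 and 192.168.1.50 are placeholders, not values
# taken from this thread.

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_icmp_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def vmkping_command(target: str, vmk: str, mtu: int = 9000, count: int = 3) -> str:
    """Return a vmkping command to run in an SSH session on the ESXi host."""
    size = max_icmp_payload(mtu)
    # -I: VMkernel interface, -d: don't fragment, -s: payload size, -c: count
    return f"vmkping -I {vmk} -d -s {size} -c {count} {target}"


if __name__ == "__main__":
    print(vmkping_command("192.168.1.50", "vmk1"))            # jumbo-frame test
    print(vmkping_command("192.168.1.50", "vmk1", mtu=1500))  # standard-MTU test
```

If the 1472-byte test gets replies but the 8972-byte one does not, something in the path is probably not passing jumbo frames. As far as I know vmkping has no run-until-stopped option, so a large count value is the usual way to keep it going.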

Author

Commented:
I see. Had to look it up. It's seeing the NAS fine but I can't seem to find an indefinite command. Do you know the flag to keep it pinging until I stop it?

Also I noticed that on my Mac Mini the jumbo frames aren't passing despite the setting. I'm not really worried about that. Just an FYI.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
could be a sign of mismatched jumbo frames ?

Author

Commented:
I just remembered what I did to potentially mess things up.  Hate when you forget just when you need to remember!!!

Only 4 TB of the total 5 TB of the volume was provisioned. I thin-provisioned an additional TB and added that space on the VMware side. The migration went well. Any idea why this would cause that problem?
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
Is your Synology on the VMware HCL and Supported ?

Author

Commented:
It is indeed.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
have you used all the space on the NAS now?

i.e. is the NAS volume now completely allocated, because you expanded the thin LUN?

Author

Commented:
Yes.  I provisioned all the space.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017
Commented:
Yes, I seem to remember an issue with Synology NASes: when you allocate all the space on a volume to a thin-provisioned LUN, the LUN goes offline, into a read-only mode.

i.e. once ALL the volume space is allocated, the LUN goes read-only and does not get mounted, as a protection measure.

this does look like a catch-22 if you have VMs on it, because you would have to add more physical disks to expand the volume on the NAS.

let me try and find the issue we had with this before.
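
As a rough illustration of the condition described above, here is a small sketch that flags a volume whose LUNs are provisioned right up to its capacity. The 5 TB volume and the 4 TB plus 1 TB allocation mirror the figures mentioned in this thread; the function name and the 10% headroom threshold are arbitrary assumptions, not a Synology recommendation.

```python
# Sketch: flag a volume whose LUNs are provisioned up to (or past) its
# capacity. Sizes are in TB. The 10% headroom threshold is an arbitrary
# assumption for illustration only.
from typing import Sequence


def provisioning_status(volume_tb: float, lun_sizes_tb: Sequence[float],
                        min_free_fraction: float = 0.10) -> str:
    provisioned = sum(lun_sizes_tb)
    free = volume_tb - provisioned
    if free <= 0:
        return "full"  # all of the volume's space is promised to LUNs
    if free < volume_tb * min_free_fraction:
        return "low headroom"
    return "ok"


if __name__ == "__main__":
    print(provisioning_status(5.0, [4.0]))       # before the expansion
    print(provisioning_status(5.0, [4.0, 1.0]))  # after adding the extra 1 TB
```

With the numbers from this thread, the first call reports "ok" and the second reports "full", which is the state where the problem described above would kick in.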
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
was the corruption the cause of the slow down?

Author

Commented:
This is to be determined.  Once I recreate the volume we can be certain.  Give me a few days.
Andrew Hancock (VMware vExpert / EE Fellow)VMware and Virtualization Consultant
Fellow 2018
Expert of the Year 2017

Commented:
I think I would pause, and get some backups!

Author

Commented:
After working on my backup solution and reconfiguring some items I'm still having speed issues. I moved everything temporarily to LUNs on another system, then removed all the volumes on the system I was working on and re-created them.

While copying files back through vCenter, according to the resource monitor in Synology I'm only seeing write speeds to the unit approaching 10 MB/s. Using simple SMB or AFP to shares on the Synology box yields results approaching 80 MB/s. I forget what I did in the past; however, I'm not sure if I should have a separate target for each system that wants to connect or if I should allow multiple initiators to connect to one target.


Let me know if I can check any other settings to help improve this performance.
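
To put the two throughput figures above in perspective, here is a quick back-of-the-envelope sketch of how long a copy would take at each rate. The 1 TB data size is an illustrative assumption (decimal units, 1 TB = 1,000,000 MB), and the function name is made up for this example.

```python
# Sketch: estimate copy time at the throughput figures quoted above.
# The 1 TB size is an illustrative assumption; decimal units are used
# (1 TB = 1,000,000 MB).

def transfer_hours(size_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_tb terabytes at rate_mb_per_s MB/s."""
    size_mb = size_tb * 1_000_000
    return size_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    for label, rate in (("iSCSI via vCenter", 10), ("SMB/AFP share", 80)):
        print(f"{label}: {transfer_hours(1.0, rate):.1f} h per TB")
```

At roughly 10 MB/s a terabyte takes over a day, versus a few hours at 80 MB/s, which would fit with migrations timing out.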
Commented:
While I still don't have an answer on whether it's best to have multiple initiators connect to the same target or to have separate targets for each initiator, I think I found the primary cause of the slowdown. In addition to the corruption that was experienced/discussed early on, it would appear that a file-based LUN is the fastest performer on Synology systems. This is exactly the opposite of what the prompts tell you (block or volume-based LUNs, according to the UI, perform faster).

Author

Commented:
I troubleshot and found the issue through repeated testing.
