huntson
asked on
ESXi to Synology NAS iSCSI slowdown out of nowhere
Out of nowhere I'm experiencing incredibly slow connections between my ESXi hosts and my Synology NAS. I have 3 hosts: 2 Mac Minis running in HA and a Dell server. My Synology NAS is on the latest firmware with 3 of the links link-aggregated. I noticed while copying files to one of my VMs over the network (the VM had a share going over AFP) that I was getting maybe 12 MB/s. Then I noticed some of my other systems were quite sluggish. I tried creating a new iSCSI LUN on some drives in the NAS that had not been used before. It did show up on my ESXi hosts and I was able to format it, but migrating to it takes too long and keeps timing out. I've tried restarting the NAS units and the hosts. All drives in the NAS appear to be healthy. No configuration changes have been made for at least a month or two, but I did check all settings in ESXi and on my network switch, to no avail.
Any ideas?
Are you seeing any packet loss? I'd check your switch configuration to make sure it's still intact, and confirm 100% that nothing has grabbed the same IP as the NAS. I had a very similar issue with a NAS for a customer who could never get it working properly; it turned out the same IP had been assigned to a rogue device.
ASKER
Thanks for the quick reply. As was stated in my second to last sentence - no changes were made in at least two months.
I am using jumbo frames and they are enabled everywhere. I tried turning them off but the Synology Units won't allow me to without breaking apart the cluster and then putting it back together - not something I am keen on doing seeing as this all worked fine and FAST a few days ago.
ASKER
I'll have to shut down the NAS to test and see if I can ping anything else on the network. I doubt that's the case but I'll get back to you shortly.
time for some vmkpings from your hosts to your NAS
can you screenshot your networking on the hosts
not upgraded DSM at all ?
ASKER
There is no other unit on the network with those IPs.
ASKER
I have not upgraded DSM on the server in about a month or two as I stated. Attached are some screenshots from my Dell system. I'll admit I'm not necessarily the most experienced esxi guy so can you explain vmkpings please?
shot1.PNG
snip-2.PNG
Although no changes have been physically made, I've seen switches lose configuration data in various situations. Confirming the switch configs is definitely the way to go. In ESXi, make sure all the paths to the NAS are live.
ASKER
Indeed I have checked. All the ports are set for jumbo frames, the LAGs are up, and the VLANs are all flat.
Is there any packet loss when you start a running ping to the NAS?
ASKER
Is this the vmkping run from an ssh terminal on the host?
yes, it's the equivalent of ping, but it's a special version which sends traffic via the vmkernel port groups,
so it provides a more valid test of reaching the SAN
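For anyone following along, here is a rough sketch of how a jumbo-frame vmkping is usually sized. With the don't-fragment flag set, the ICMP payload has to leave room for the 20-byte IPv4 header and the 8-byte ICMP header, so a 9000-byte MTU gives a payload of 8972. The interface name vmk1, the address 192.168.1.50 and the count of 100 are placeholders, not values from this thread; substitute the vmkernel port that carries your iSCSI traffic and the NAS address.

```python
# Sketch only: builds a vmkping command line for a don't-fragment test.
# "vmk1" and "192.168.1.50" are hypothetical placeholders.

IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def vmkping_command(target, mtu=9000, interface="vmk1", count=100):
    """Return a vmkping command string.

    -I selects the vmkernel interface, -d sets the don't-fragment bit,
    -s sets the payload size and -c sets how many packets to send.
    """
    return "vmkping -I {} -d -s {} -c {} {}".format(
        interface, icmp_payload(mtu), count, target
    )


if __name__ == "__main__":
    print(vmkping_command("192.168.1.50"))
```

If the 8972-byte ping fails while a 1472-byte one (standard 1500 MTU) succeeds, something on the path is probably not passing jumbo frames.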
ASKER
I see. Had to look it up. It's seeing the NAS fine, but I can't seem to find an option to run it indefinitely. Do you know the flag to keep it pinging until I stop it?
Also, I noticed that on my Mac Mini the jumbo frames aren't passing despite the setting. I'm not really worried about that. Just an FYI.
could be a sign of mismatched jumbo frames ?
ASKER
I just remembered what I did to potentially mess things up. Hate when you forget just when you need to remember!!!
Only 4 TB of the volume's total 5 TB was provisioned. I thin provisioned an additional TB and added that space on the VMware side. The migration went well. Any idea why this would cause that problem?
Is your Synology on the VMware HCL and Supported ?
ASKER
It is indeed.
have you used all the space on the NAS now ?
i.e. is the NAS volume now completely full because you expanded the thin-provisioned LUN ?
ASKER
Yes. I provisioned all the space.
SOLUTION
Was the corruption the cause of the slowdown?
ASKER
This is to be determined. Once I recreate the volume we can be certain. Give me a few days.
I think I would pause, and get some backups!
ASKER
After working on my backup solution and reconfiguring some items, I'm still having speed issues. I temporarily moved everything to LUNs on another system, then removed all the volumes on the system I was working on and re-created them.
While copying files back through vCenter, the resource monitor in Synology shows write speeds to the unit of only around 10 MB/s. Using simple SMB or AFP to shares on the Synology box yields results approaching 80 MB/s. I forgot what I had done in the past; however, I'm not sure if I should have a separate target for each system that wants to connect or if I should allow multiple initiators to connect to one target.
Let me know if I can check any other settings to help improve this performance.
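Purely as arithmetic on the numbers above, and not a diagnosis: 10 to 12 MB/s sits just under the 12.5 MB/s raw ceiling of a 100 Mbit/s link, whereas 80 MB/s needs a gigabit link. A small sketch of that conversion is below. The list of link speeds is an assumption for illustration, and real throughput will always be somewhat below the raw ceiling because of protocol overhead.

```python
# Sketch only: compare an observed transfer rate with raw Ethernet ceilings.
# STANDARD_LINKS_MBPS is an illustrative list, not taken from this thread.

STANDARD_LINKS_MBPS = (10, 100, 1000, 10000)


def ceiling_mb_per_s(link_mbps):
    """Raw ceiling in megabytes per second for a link speed in megabits."""
    return link_mbps / 8


def smallest_link_for(observed_mb_per_s, links=STANDARD_LINKS_MBPS):
    """Slowest link speed (Mbit/s) whose raw ceiling covers the observed rate."""
    for link in sorted(links):
        if ceiling_mb_per_s(link) >= observed_mb_per_s:
            return link
    return None  # faster than every link in the list


if __name__ == "__main__":
    for rate in (10, 12, 80):
        print(rate, "MB/s fits within a", smallest_link_for(rate), "Mbit/s link")
```

If the iSCSI rate lines up this closely with a slower link's ceiling, it may be worth confirming the negotiated speed on each port in the iSCSI path, alongside the MTU checks.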
ASKER CERTIFIED SOLUTION
ASKER
I troubleshot and found the issue through repeated testing.
Checked that all equipment is configured for jumbo frames.
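As a rough illustration of that check: the largest frame that survives end to end is limited by the smallest MTU anywhere on the path, so a single port left at 1500 is enough to break jumbo frames. The device names and MTU values below are made up for the example and are not the actual settings from this setup.

```python
# Sketch only: find hops whose MTU is below the jumbo-frame target.
# All device names and values here are hypothetical examples.


def effective_mtu(path_mtus):
    """The usable MTU of a path is the smallest MTU of any hop on it."""
    return min(path_mtus.values())


def undersized_hops(path_mtus, required=9000):
    """Return the names of hops configured below the required MTU."""
    return [name for name, mtu in path_mtus.items() if mtu < required]


if __name__ == "__main__":
    example_path = {
        "esxi-vmk1": 9000,
        "esxi-vswitch": 9000,
        "switch-port": 1500,   # one port left at the default
        "nas-bond": 9000,
    }
    print("effective MTU:", effective_mtu(example_path))
    print("needs fixing:", undersized_hops(example_path))
```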