We have four 1 TB NFS stores carved out on a VNXe3200. These stores are being used by two VMware servers. There are also LUNs set up on the EMC that are being used by a separate physical Windows server.
Two of the four NFS stores are suffering from severe latency. It spikes every few seconds: sometimes a couple hundred ms, but we've seen 20k+ ms. As best we can tell, the other two NFS stores are fine, and the LUNs are fine.
Three different EMC techs have looked at the EMC SAN and say it's fine. They pulled another set of logs and such and are analyzing them. The only connection is that both affected stores are using SPA - but they're not seeing issues with SPB, nor are the LUNs showing latency.
This is NFS, not iSCSI, so most issues with framing and such don't apply. Besides, everything is set up the same way and only two of the stores are affected. We had one VMware tech look, and nothing stood out on the VMware side. We have a case open with VMware. So far no one can tell me anything of interest. We checked for and eliminated some of the more common issues, like the IOPs bug in VMware - all VMs are set for unlimited IOPs.
There are five VMs using these two stores and they are NOT at all taxing the CPU, RAM, or storage. They are all very secondary servers. My busiest server is on one of the other stores and it's fine.
Any ideas? Since my team and I, 2 EMC engineers, and 2 techs from the installer that put in the SAN have all looked at this, I think we can forgo some of the "easy" items unless it's something really easily overlooked. The VMs used to be connected to a different SAN using iSCSI, but that was late last year. This issue seems to be fairly new - a matter of weeks - but it seems to be getting worse.
@Carlos: We talked about doing that this weekend, so we may try that and see whether it isolates SPA as the issue or not. We have a ticket open with EMC. They have spent about 6 hrs looking at this SAN and are still looking at logs.
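
For the before/after comparison, something like the rough sketch below could give us numbers instead of impressions. It is only a sketch: it assumes a Linux box (or a test VM) that can write to a directory backed by each NFS store, and the function names and the 200 ms spike threshold are my own placeholders, not anything from EMC or VMware tooling. It times small synchronous writes and counts how many cross the threshold.

```python
import os
import statistics
import time


def probe_write_latency(directory, samples=30, interval_s=1.0, size=4096):
    """Time small fsync'd writes in `directory`; return latencies in ms."""
    payload = b"\0" * size
    path = os.path.join(directory, ".latency_probe.tmp")  # hypothetical probe file
    latencies = []
    try:
        for _ in range(samples):
            start = time.perf_counter()
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())  # force the write out to the backing store
            latencies.append((time.perf_counter() - start) * 1000.0)
            time.sleep(interval_s)
    finally:
        # Clean up the probe file even if a write fails partway through.
        if os.path.exists(path):
            os.remove(path)
    return latencies


def summarize(latencies_ms, spike_ms=200.0):
    """Summarize a run: sample count, median, max, and number of spikes."""
    return {
        "samples": len(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
        "spikes": sum(1 for v in latencies_ms if v >= spike_ms),
    }
```

Running `summarize(probe_write_latency("/mnt/store1"))` against each store (the mount path here is made up) before and after the change would show whether the spikes follow SPA or stay with the two stores.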