michaelkim1 asked:

Write latency spikes on ESXi backed by illumos ZFS over NFS

I am running ESXi 5.5u2 with the VMs stored on an illumos NFS server (OmniOS) using the ZFS file system. Average write latencies are low for the most part (under 1 ms), but several times per hour there are spikes into the 50-100 ms range. More than 75% of the time these come from the domain controllers, but other VMs will randomly show similar spikes in write latency.

Examining write throughput and write operations per second shows no spikes in either metric during these events.  Likewise, read rate and read ops are low during these events.
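For anyone who wants to reproduce the measurement on the storage side, a minimal sketch of sampling server-side NFS write latency would be something like the following (this assumes the stock illumos nfsv3 DTrace provider is available on OmniOS; it is only an illustration, not the exact command I ran):

    # Print a distribution of NFSv3 WRITE service times (microseconds) every
    # 10 seconds; 50-100 ms spikes would show up as outlier buckets if the
    # latency is added on the storage server rather than in ESXi or the network.
    dtrace -n '
      nfsv3:::op-write-start { self->ts = timestamp; }
      nfsv3:::op-write-done /self->ts/ {
        @lat = quantize((timestamp - self->ts) / 1000);
        self->ts = 0;
      }
      tick-10s { printa(@lat); trunc(@lat); }'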

ZFS is configured with a ZeusRAM SLOG device. The IOPS running through the ZeusRAM are nowhere near its limit (the most I have seen is 500 write IOPS, and the limit is in the tens of thousands). To get around the vSphere single-TCP-connection limit for NFS shares, the illumos server is multihomed with several IP addresses and each datastore is tied to one of those IP addresses. This prevents all VMs from sharing a single TCP connection; each VM gets its own TCP connection. The OmniOS server has 32GB of RAM and 10TB of storage, so there is no RAM deficit.
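To illustrate the multihoming setup, it looks roughly like this (the NIC name, addresses, pool, and datastore names below are made up for the example, not my actual config):

    # OmniOS side: one additional IP per datastore on the storage NIC
    # (assumes the interface already exists, e.g. via "ipadm create-if ixgbe0")
    ipadm create-addr -T static -a 10.0.0.11/24 ixgbe0/ds1
    ipadm create-addr -T static -a 10.0.0.12/24 ixgbe0/ds2

    # Each datastore is its own filesystem/export
    zfs set sharenfs=rw=@10.0.0.0/24 tank/vm-ds1
    zfs set sharenfs=rw=@10.0.0.0/24 tank/vm-ds2

    # ESXi side: mount each export through a different server IP so every
    # datastore gets its own TCP connection
    esxcli storage nfs add -H 10.0.0.11 -s /tank/vm-ds1 -v vm-ds1
    esxcli storage nfs add -H 10.0.0.12 -s /tank/vm-ds2 -v vm-ds2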

I've tried tweaking multiple illumos kernel parameters, nfsd parameters, .vmx config file parameters, and vSphere parameters, all to no effect. I am unable to determine why these intermittent spikes are occurring.
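For reference, these are the kinds of knobs involved; the values shown are only illustrative examples, not a recommendation or what I ended up with:

    # /etc/system - illumos ZFS tunables of the sort that get tweaked
    # (zfs_txg_timeout defaults to 5 seconds)
    set zfs:zfs_txg_timeout = 5

    # NFS server thread count via the sharectl "servers" property
    sharectl set -p servers=1024 nfs
    sharectl get nfs

    # ESXi 5.5 side: per-datastore NFS queue depth advanced setting
    esxcli system settings advanced set -o /NFS/MaxQueueDepth -i 64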

Admittedly, these are not very high latencies, but there has to be a reason for them. It does not make sense that throughput and IOPS load are low when these spikes occur.

Does anyone have any suggestions on where to look for the root cause?
Aaron Tomosky replied:

Can you post arcstat output? (I'm used to FreeBSD, but I believe illumos has the same; a rough command sketch follows below these questions.)
How many disk drives and what type of drives are in the storage system?
How many VMs in the environment?
How many users do you have?
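On illumos the raw counters should be reachable with something along these lines (a sketch from memory, since I'm on the FreeBSD side):

    # Interval summary
    arcstat 1

    # Full raw ARC kstat counters
    kstat -p zfs:0:arcstats

    # Kernel debugger view of ARC sizing
    echo ::arc | mdb -k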
michaelkim1 (asker) replied:

Here is the output from "arcstat 1":

[screenshot of arcstat output]
I have 25 VMs running. There are only 10 users. The hard drives are Western Digital RE4 2TB. The cache device is an Intel SSD. The SLOG device is an 8GB ZeusRAM. Here is the zpool config:

[screenshot of zpool configuration]
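If it helps, per-vdev activity (including the ZeusRAM log device) during a spike can be watched with something like this; the pool name "tank" is just a placeholder:

    # Per-vdev ops and bandwidth, refreshed every second; the "logs" section
    # shows the ZeusRAM SLOG activity during a latency spike
    zpool iostat -v tank 1

    # Pool layout and health in text form
    zpool status tank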
Are your VMs on local storage? If so, do you have a battery-backed write cache controller? We've seen weird issues in many environments where one wasn't installed and configured properly.
How big is the L2ARC SSD? That's a different arcstat output than I'm used to. I was hoping to see arc_meta_limit, arc_meta_used, and evict stats... that sort of thing.
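If the illumos arcstat doesn't print those fields, the counters I'm after should be in the raw kstats, something like:

    # ARC metadata limit/usage plus eviction and L2ARC size counters
    kstat -p zfs:0:arcstats | egrep 'arc_meta_limit|arc_meta_used|evict|l2_size|l2_hdr_size'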
ASKER CERTIFIED SOLUTION (michaelkim1)

[The accepted solution is only visible to Experts Exchange members.]
Remember that all of the L2ARC has to be referenced in RAM, which takes away from primary ARC space. 32GB isn't that much for a ZFS NFS server when you consider ARC, L2ARC headers, and metadata. Hopefully you aren't using dedupe, as that uses a ton of RAM.
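Both are easy to sanity-check with something along these lines (the pool name "tank" is just a placeholder):

    # RAM consumed by L2ARC headers, and total ARC metadata usage
    kstat -p zfs:0:arcstats:l2_hdr_size
    kstat -p zfs:0:arcstats:arc_meta_used

    # Confirm dedup is off
    zfs get -r dedup tank
    zpool get dedupratio tank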
Thank you for the suggestion regarding the RAM and the L2ARC. I had already tried increasing the RAM to 128GB without any change in the intermittent spikes.
The other suggestions were useful but did not help arrive at the solution.