VMware I/O Issues

Greetings,

We are using a NetApp appliance for an NFS share that our VMs reside on.  Currently we have one large 10 TB NFS share, of which we are allocated 6.6 TB and are using about 4 TB.

We recently added 30 new VMs and have been seeing a lot of I/O and disk latency.

More info: We are using redundant 10 Gb connections to the NetApp, and XenApp PS runs on all the VMs on the NFS share. About 180 virtual XenApp servers are on this datastore. We are running ESXi 5.0.  This is a dedicated NFS share, used only by VMware.


Is there a way to find I/O per VM?  Is there a recommendation to resolve this, short of adding more spindles or running fewer VMs?

Any other info I can provide?

Thanks!
ServerNotFound Asked:

Nick Rhode (IT Director) Commented:

Andrew Hancock (VMware vExpert / EE MVE^2) (VMware and Virtualization Consultant) Commented:
You do have jumbo frames enabled?
Paul Solovyovsky (Senior IT Advisor) Commented:
Jumbo frames will not help much, since most of the I/O is random.  I would concentrate on the following:

1.  What type of drives are in the aggregate?
2.  Do the NFS datastores all point to a single IP, or is each datastore on a separate IP?
3.  How much space is available on each aggregate?
4.  How large are the NFS datastores?
5.  How many VMs are on each datastore?

Here is how to run a PowerCLI script for IOPS per datastore (if you don't have NetApp Operations Manager):

http://virtualcurtis.wordpress.com/2010/09/09/gathering-virtual-machine-iops-statistics-by-datastore/

http://dl.dropboxusercontent.com/u/8378981/GatherIOPS.ps1

ServerNotFound (Author) Commented:
1.  The backend is NetApp, 50k drives
2.  There is a single NFS share/datastore
3.  10 TB total, 5 TB free
4.  One datastore, 10 TB
5.  117 VMs


Will PowerCLI or NetApp OM help if the VMs are all on the same NFS share?
Paul Solovyovsky (Senior IT Advisor) Commented:
Per VM is usually better handled on the VMware side.

Best practice is to have 20 to 30 VMs per datastore. If you have only one IP, I'd recommend splitting the single datastore into multiple datastores and adding IP aliases to the interface.  If you have EtherChannel/LAG enabled on the ESXi side and a separate IP for each datastore, traffic will be load balanced across the links. Otherwise, a single ESXi host IP talking to a single NetApp IP keeps the whole session on a single port.  Creating multiple IP sessions will mitigate this issue.

A PowerCLI script for IOPS per VM per datastore is below:

http://www.digitalbeermat.com/vmware/powercli-calculating-iops-of-individual-vms-on-particular-datastores
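
As a rough illustration of what such a per-VM report boils down to, here is a minimal Python sketch that averages read and write operations per VM and ranks the busiest ones. It is not the linked PowerCLI script. The VM names and sample values are made up, and in practice the samples would come from the per-VM datastore counters that a script like the one above collects.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (vm_name, read_ops_per_sec, write_ops_per_sec).
# These numbers are invented for illustration only.
SAMPLES = [
    ("xenapp-001", 40, 110),
    ("xenapp-001", 60, 90),
    ("xenapp-002", 10, 20),
    ("xenapp-002", 30, 20),
    ("xenapp-003", 200, 300),
    ("xenapp-003", 100, 400),
]


def average_iops_per_vm(samples):
    """Return {vm_name: average total IOPS (reads + writes)}."""
    totals = defaultdict(list)
    for vm, reads, writes in samples:
        totals[vm].append(reads + writes)
    return {vm: mean(values) for vm, values in totals.items()}


def top_talkers(samples, count=10):
    """Return the busiest VMs as (vm_name, avg_iops), highest first."""
    averages = average_iops_per_vm(samples)
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    for vm, iops in top_talkers(SAMPLES):
        print(f"{vm}: {iops:.0f} IOPS")
```

Ranking by average total IOPS is only one possible choice. Peak values or latency per VM may matter more when chasing a latency problem.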
ServerNotFound (Author) Commented:
This is using NetApp with one giant NFS share.  This is how they recommended we set it up, with one large 10 TB share.

Seemed odd to me.  Any documentation specific to NetApp?
Paul Solovyovsky (Senior IT Advisor) Commented:
Whoever told you that is incorrect; this follows neither VMware nor NetApp best practice. Let me see if I can pull the docs for you.

Essentially, you have all the IOPS hitting a single volume on a single IP. That is a major SPOF.

Are you using SMVI for backups?
ServerNotFound (Author) Commented:
The one IP does happen to be on two 10 Gb Ethernet links.  I would think that would help there.  Would additional IPs increase the IOPS possible?

Thanks for looking for the docs also, I would love to see them.
Paul Solovyovsky (Senior IT Advisor) Commented:
It would provide better performance. Essentially, you're only using one of the 10 Gb ports, since a session is IP-based.  I don't know how much of the 10 Gb link you're using, but at this point the second port is for failover only.
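
To illustrate why one host IP talking to one NetApp IP stays on one port, here is a minimal Python sketch of IP-hash uplink selection. It assumes a simplified hash (XOR of the two addresses, modulo the number of uplinks). The actual hash used by a given switch or ESXi build may differ, and all IP addresses below are hypothetical.

```python
import ipaddress


def uplink_for(src_ip, dst_ip, uplink_count):
    """Pick an uplink index from a source/destination IP pair.

    Simplified model of IP-hash teaming: XOR the two addresses and take
    the result modulo the number of uplinks. Real implementations may
    hash differently; the point is that the choice is made per IP pair.
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % uplink_count


ESXI_HOST = "10.0.0.11"                        # hypothetical VMkernel IP
SINGLE_TARGET = ["10.0.0.50"]                  # one NetApp IP, as in this thread
ALIASED_TARGETS = ["10.0.0.50", "10.0.0.51"]   # hypothetical added alias


def uplinks_used(targets, uplink_count=2):
    """Return the set of uplink indexes the host would use for these targets."""
    return {uplink_for(ESXI_HOST, target, uplink_count) for target in targets}


if __name__ == "__main__":
    print("single IP :", len(uplinks_used(SINGLE_TARGET)), "uplink(s) in use")
    print("two IPs   :", len(uplinks_used(ALIASED_TARGETS)), "uplink(s) in use")
```

With a single target IP the pair always hashes to the same uplink, so the second link carries nothing until failover. Adding target IPs only helps if the resulting pairs hash to different uplinks, which depends on the addresses chosen.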
ServerNotFound (Author) Commented:
Roger.  If you've got those docs I'll close this out and assign the points.  Thanks!
Paul Solovyovsky (Senior IT Advisor) Commented:
I went through the best practice guide and it just gives a maximum of 16 TB per datastore, but nothing about an exact size.  Just like any file share, though, you do not want to have all your eggs in one basket. I typically set up 2 TB NFS datastores. I'll let hanccoca chime in, but a 10 TB datastore sounds excessive.
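
As a back-of-the-envelope check, the guideline of 20 to 30 VMs per datastore can be applied to the numbers given in this thread. The Python sketch below assumes the 117 VM count and roughly 4 TB in use, spread evenly across datastores. It is arithmetic only, not a sizing recommendation.

```python
import math

VM_COUNT = 117        # VM count given in this thread
USED_TB = 4.0         # "using about 4" TB, from the question
DATASTORE_TB = 2.0    # the 2 TB datastore size mentioned above


def datastores_needed(vm_count, vms_per_datastore):
    """Smallest number of datastores that keeps each at or under the VM limit."""
    return math.ceil(vm_count / vms_per_datastore)


if __name__ == "__main__":
    for limit in (30, 20):
        count = datastores_needed(VM_COUNT, limit)
        # Assumes VMs and their data are spread evenly, which is a simplification.
        used_each = USED_TB / count
        print(
            f"{limit} VMs per datastore: {count} datastores, "
            f"about {used_each:.2f} TB used in each {DATASTORE_TB:.0f} TB datastore"
        )
```

Under these assumptions the result is four to six datastores, each well under 2 TB in use, which also leaves room for separate IPs per datastore.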