atlasdev
 asked on

Question about hard drive performance in a VMware environment

I am using VMware to come up with estimates for future database upgrades. I started using HDTune to build a sliding scale for ballpark estimates across different hardware. The issue I am having is that the measurement I am getting for the virtual server doesn't make sense to me.

The setup here is a Dell PowerEdge 2950 (dual quad-core E5310s) with 32GB of RAM and an MD3000 direct-attached storage array (six 750GB 7200RPM SATA disks in a RAID 5 container with one hot spare), running VMware ESX 3.5. With only one virtual machine running (Windows 2003 Standard with 512MB RAM, a 15GB OS drive, and an empty 65GB data drive), I ran HDTune; the results are attached as an image. For most of the test I get a measurement around 200MB/s, but during the test the throughput spikes to around 800MB/s and the virtual CPU spikes to 100%.

As a comparison, I ran the same test on a PowerEdge 2950 with seven 146GB 10K RPM SAS drives in a RAID 5 configuration with a hot spare, running Windows 2003, which gave me a consistent 300MB/s. I have a hard time believing the number I am getting from the VMware server, though the issue may be HDTune itself. What is the best way to troubleshoot what is going on with this VM server?
hdtune.jpg
VMwareStorage
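One way to sanity-check HDTune from inside the guest is to time a sequential read of a file much larger than the guest's 512MB of RAM, so cache hits can't inflate the number. A minimal sketch, assuming a Python runtime in the guest; the file path and sizes are placeholders, not anything from this thread:

# Cross-check for the HDTune figure: sequential read of a file far
# larger than guest RAM. Path and sizes are hypothetical placeholders.
import os
import time

TEST_FILE = r"D:\bench\bigfile.bin"    # assumed location on the 65GB data drive
BLOCK = 1024 * 1024                    # 1 MiB per read
SIZE = 4 * 1024 * 1024 * 1024          # 4 GiB, well beyond 512MB guest RAM

# One-time sequential write pass to create the test file.
buf = os.urandom(BLOCK)
with open(TEST_FILE, "wb") as f:
    for _ in range(SIZE // BLOCK):
        f.write(buf)
    f.flush()
    os.fsync(f.fileno())

# Timed sequential read pass.
start = time.time()
total = 0
with open(TEST_FILE, "rb") as f:
    while True:
        chunk = f.read(BLOCK)
        if not chunk:
            break
        total += len(chunk)
elapsed = time.time() - start
print("sequential read: %.1f MB/s" % (total / elapsed / 1e6))

If the sustained figure lands near HDTune's 200MB/s baseline while the 800MB/s bursts never reappear, the bursts are coming from a cache somewhere in the stack rather than from the spindles.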

za_mkh

Well, in VMware environments using a SAN, I don't know how relevant these types of tools are. To test the SAN/disk subsystem we normally use IOMeter, and that's what we used to work out what we needed to do. This is a very long and difficult topic, but if you do a search for it you will see how complex it can be.
An example: http://virtualgeek.typepad.com/virtual_geek/2009/06/vmware-io-queues-micro-bursting-and-multipathing.html
Of course, you are using VMware with direct-attached storage rather than a SAN, so it's not the same, but the methodology is similar.
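IOMeter itself is configured through its GUI and access specifications rather than scripted here; purely as a hedged stand-in for the same idea (small random reads, reported as IOPS), a rough sketch in Python, with a placeholder path and no attempt at queue depths or read/write mixes:

# Crude IOMeter-style probe: random 4 KiB reads across a large file,
# reported as IOPS. A sketch only; real IOMeter adds queue depth,
# read/write ratios, and aligned unbuffered I/O.
import os
import random
import time

TEST_FILE = r"D:\bench\bigfile.bin"    # placeholder; reuse a large pre-made file
BLOCK = 4096
DURATION = 10.0                        # seconds to run

size = os.path.getsize(TEST_FILE)
ops = 0
deadline = time.time() + DURATION
with open(TEST_FILE, "rb") as f:
    while time.time() < deadline:
        f.seek(random.randrange(0, size - BLOCK))
        f.read(BLOCK)
        ops += 1
print("%.0f IOPS (random 4 KiB reads)" % (ops / DURATION))

Note the OS file cache will still absorb repeat hits, so treat the output as a relative number between machines, not an absolute one.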
za_mkh

One thing: SAS will outperform SATA on any given day, hence the better results in your second test.
Your test is also not an equal comparison: the SAS drives are 10K RPM whereas the SATA drives are only 7.2K RPM, which also produces a performance differential...
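For a rough sense of what spindle speed alone contributes, average rotational latency is half a revolution; a quick back-of-envelope check:

# Average rotational latency = time for half a revolution.
for rpm in (7200, 10000):
    latency_ms = 0.5 / (rpm / 60.0) * 1000.0
    print("%5d RPM -> %.2f ms average rotational latency" % (rpm, latency_ms))
# 7200 RPM -> 4.17 ms, 10000 RPM -> 3.00 ms

On top of that, SAS drives typically have faster seeks and deeper command queuing, so the gap in practice is larger than the latency ratio alone.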
atlasdev

ASKER
The issue I have is that, due to differences in our clients' hardware, giving an accurate downtime estimate for a database update is difficult, which is why I went down this path. If SATA drives give me 60% of the performance of a similar setup with SAS, I can at least estimate the performance difference, and that was what I originally set out to accomplish.

However, what I am getting is that the system with SAS drives gives me a consistent 300MB/s with no CPU spikes. The SATA array on the VMware server gives me the graph posted: a fairly consistent 100MB/s or so, but for roughly 20% of the test it spikes to 800MB/s with an accompanying CPU spike. That spike is why I posted the question; it makes no sense from a performance perspective, since it suggests that 20% of the time I would get almost three times the throughput of the SAS array.
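As a quick sanity check on those figures (taken from the paragraph above), the time-weighted average of that pattern is still nowhere near the burst rate:

# Time-weighted average of the observed pattern:
# ~80% of the test at ~100 MB/s, ~20% bursting at ~800 MB/s.
avg = 0.8 * 100 + 0.2 * 800
print("%.0f MB/s time-weighted average" % avg)   # 240 MB/s
# 800 MB/s is more than six 7.2K SATA spindles in RAID 5 can stream,
# which points at a cache somewhere in the stack, not real disk speed.

So the bursts look like cache hits rather than sustainable throughput, which matters for any SAS-vs-SATA scaling factor derived from these runs.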

The link you posted makes sense and I understand the complexity. However, does it apply as much in this case, since this is a direct-attached storage array and not an iSCSI or FC storage array?
ASKER CERTIFIED SOLUTION
andyalder

atlasdev

ASKER
This would make sense. Currently only the ESX server is connected to the MD3000, and according to the controller there is no read policy set; at this point I am assuming that is the controller on the PE2950 and not the controller on the MD3000, which is probably a separate setting. I'll look into changing this and see if it really is the issue here (apparently there is no CLI I can use to change this setting from within the ESX environment).