michaelcrack

Install SQL cluster on VMware ESX

Hello, I am investigating virtualising our company infrastructure. The project is moving quite quickly and I need to decide on a strategy for our SQL 2000 cluster. I want to virtualise this but people have heard of nightmarish situations arising. From the documentation I have been reading it seems that I can do it as long as there is not too much processor intensive or disk IO intensive activity.

My question is, can this be done and, if yes, what would be classified as 'high cpu or disk IO utilization'? I need to find some thresholds to compare with that offer solid acceptable throughput ranges.

I hope this question is clear.
chapmandew

Are you talking about having all of the cluster resource nodes on the same VM machine?
What's your current configuration for your SQL servers? That would give an idea of how intensive your cpu and i/o requirements are. Are they on a SAN?
VMware has a specific document that outlines the support of MSCS on ESX.  This document discusses Clustering Virtual Machines on One Physical Host, Clustering Virtual Machines across Physical Hosts and Clustering Physical and Virtual Machines.  

http://www.vmware.com/pdf/vi3_35/esx_3/vi3_35_25_u1_mscs.pdf

There are specific requirements and caveats outlined.  I recommend that you review this in detail.
Whenever you virtualize a server, you should definitely know the requirements of that server first. By requirements I mean CPU, memory, disk IO and network IO.

The most common issue is disk IO, and in most cases the limitation is not VMware itself but misconfigured hardware, because the disk IO requirements were never measured. That results in bad performance.

So my strong advice would be to measure the requirements of your SQL cluster before you decide whether to virtualize it or not.

Some of the counters that you need to benchmark are:

Avg. Disk Bytes/Transfer
Avg. Disk Queue Length
Disk Transfers/sec

The longer you benchmark, the more precise the values you will get. You can use these numbers to calculate the number of disks required for your SQL server / cluster. You need to take the RAID penalty into account.

VMware partners have access to tools to help you determine these numbers. Otherwise perfmon could be used.

Regards
Heino
VMware Authorized Consultant
Out of curiosity, what do you gain by putting a Windows cluster on a VMware machine?  The whole idea of a cluster is to have separate machines to fail over to in the event that a machine fails.  So, if you have your cluster on a VMware machine, and the machine goes down...what do you gain?
Chapmandew, that is a really good question.

VMware's cluster solution only protects you against hardware failure. MSCS also makes it easier to patch your servers, because you can fail the resources over to the other node. The question is whether you want that last feature.

So why virtualize? By virtualizing you gain DR benefits, as it is much easier to back up and restore those machines if you extend your backup solution with image-level backups of the operating system. By doing this you are no longer dependent on which hardware you buy.

Also, VMware is soon releasing Site Recovery Manager to fully protect a main site by creating a DR site at a remote location. A clustered SQL Server running on VMware would be protected by this product if needed.

I agree with you that MS clustering on VMware provides very little benefit.
So, if I were to install a Windows cluster right now on 2 virtual instances and put SQL Server on them...I really don't get any benefit from doing it, as they'd both go down if the machine failed, right?
I would not recommend putting a cluster on the same box - but across boxes....
I gotcha.  Thank you for explaining.  I just wanted to make sure I understood it correctly.
michaelcrack

ASKER

Hello all, thanks for the many replies. We are implementing ESX server to virtualise approximately 30 servers at first; the other servers will follow. I am connected to a SAN, and the current SQL setup uses the SAN as quorum. The idea is to use 3 nodes in the ESX cluster to allow for redundancy.

I am aware that the 2 individual VMs in the SQL cluster would have to reside on different nodes in the ESX cluster. I have also been benchmarking the SQL server cluster for a few days. What I need to know is: are there some figures that I can compare these benchmarks to? I.e. something that states that if your disk IO or CPU utilisation is above 'X', then you should avoid virtualising.

Stappmeyer, thanks for the link, I am reading the doc now.
Sorry to double-post but FYI, we have about 15 databases, 1 is 20GB (monitoring software - sentinel) and the next biggest one is 4GB. From my benchmarks, IO is not very high but I need concrete thresholds in order to justify the move to management. The choice is being queried by a colleague who has 'heard somewhere' that it can be a bad choice. My thoughts are that this would apply to a massive database with thousands of client connections but I need to find out for sure in order to justify it in the project plan.

Thanks again to all.
What are your IO numbers from your measuring?

Typically a 10K RPM disk will do 130 IOPS, and a 15K RPM will achieve 180 IOPS.

Since we're using RAID, there's a penalty that depends on the RAID level. Assuming there are 3 reads for every write, the penalty factor for RAID 5 is 0.57 and 0.8 for RAID 1 (or 0+1).
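As a rough sketch of where those two factors can come from: they match the common rule of thumb that a RAID 5 write costs 4 back-end I/Os and a RAID 1 write costs 2. Those per-write costs are an assumption added here, not something stated in the post above, and the function name is made up for illustration.

```python
def raid_penalty_factor(read_fraction, write_penalty):
    """Fraction of raw disk IOPS left over for the host.

    read_fraction: share of I/Os that are reads (0.75 for 3 reads per write).
    write_penalty: back-end I/Os per host write (assumed: 4 for RAID 5,
                   2 for RAID 1 / 0+1).
    """
    write_fraction = 1.0 - read_fraction
    # Each read costs 1 back-end I/O, each write costs write_penalty.
    return 1.0 / (read_fraction + write_fraction * write_penalty)


raid5 = raid_penalty_factor(0.75, 4)
raid1 = raid_penalty_factor(0.75, 2)
print(round(raid5, 2), round(raid1, 2))
```

With a different read/write mix the factors change, so it is worth plugging in the ratio you actually measured.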

Finally, here's the formula we'll use:

Total IOPS = #Disks x IOPS/Disk x RAID Penalty factor

So let's put 5 15K RPM disks in a RAID 5:

5 x 180 x 0.57 = 513

So those 5 disks would provide you with a total of 513 IOPS. If your measured load is higher than this, you need to add more disks.

The above is an example.
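For anyone who wants to run their own numbers, here is a minimal sketch of the same formula, plus the inverse (how many disks a measured load would need). The function names and the 1000 IOPS load are made up for illustration; the 180 IOPS per disk and the 0.57 factor are the figures from the example above.

```python
import math


def total_iops(disks, iops_per_disk, penalty_factor):
    # Total IOPS = #Disks x IOPS/Disk x RAID penalty factor
    return disks * iops_per_disk * penalty_factor


def disks_needed(measured_iops, iops_per_disk, penalty_factor):
    # Invert the formula and round up to whole disks.
    return math.ceil(measured_iops / (iops_per_disk * penalty_factor))


# The example above: 5 x 15K RPM disks in RAID 5.
print(round(total_iops(5, 180, 0.57)))

# Hypothetical measured peak of 1000 IOPS on the same kind of disks.
print(disks_needed(1000, 180, 0.57))
```

Feed it the peak Disk Transfers/sec from your own benchmark rather than the average, so the array is sized for the busy periods.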

Remember that on database servers you would split the DB and logs onto separate storage LUNs, so you need to take this into account as well.
Hi HeinoSkov,

Thank you for the information (I am working on getting you results). I just want to clarify: the databases are on the SAN and will remain on the SAN, so my query pertains specifically to whether I can virtualise the OS. Do you see a problem with virtualising the operating system?

I would have thought that the SQL instance on the OS would have a relatively low overhead, and that the load would stay on the SAN where it has always been.

P.S. The LUN is set to RAID 5
Average disk queue length on E: is 0.001 and on C: is 0.035
% Processor Time is 2.904
Any suggestions on this query?
I have done some serious investigation into this and it appears that opinions are split between those who think MSCS on ESX with HA creates too many issues and others who believe that HA does not provide application-level protection, so it does not cover service failure within the OS.

Has anyone done an implementation like this, or decided against it? If so, can I ask for any input that may be helpful in making my decision? I am happy with the resource usage and we are well within the capabilities, although this would be our only MSCS on the VI. Perhaps it's better to keep this cluster in a stand-alone environment if the business is not willing to accept the change in service level?
ASKER CERTIFIED SOLUTION
michaelcrack