MSCS Clustering on VMware on iSCSI SAN

I've come across a possible problem with two SQL MSCS 2003 clusters that were converted to VMs running on ESXi 4.0 with an HP P4000 SAN. Basically, one node was evicted from each cluster before conversion, leaving us with two single-node clusters. These were then converted directly (by a contractor), as if they were regular servers as far as I can tell, but they have worked fine as single-node clusters (the clustered disks are just vmdk files on a LUN).

During a recent support call, a VMware engineer pointed out settings on the clusters that he said were wrong. He provided a document that I have now gone through, and I can find multiple inconsistencies between our clusters and the supported configurations in VMware's document. These include the alarming fact that our SAN is apparently not supported in any MSCS setup because it is iSCSI. The servers also have thin provisioned disks, SCSI bus sharing set to None, only one SCSI controller, and Memory Overcommit enabled.
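
For anyone who wants to run the same comparison across several VMs, here is a minimal sketch in Python. It only flags the items listed above and does not encode VMware's support rules. The `ClusterVmSettings` record and its field names are hypothetical (they are not VMware API properties), and the sample VM name is made up.

```python
from dataclasses import dataclass


@dataclass
class ClusterVmSettings:
    """Hypothetical inventory record; field names are invented for this sketch."""
    name: str
    storage_protocol: str       # e.g. "iscsi" or "fc"
    thin_provisioned: bool
    scsi_bus_sharing: str       # "none", "virtual" or "physical"
    scsi_controller_count: int
    memory_overcommit: bool


def flagged_settings(vm):
    """Return the items from the post's list that this VM matches."""
    findings = []
    if vm.storage_protocol.lower() == "iscsi":
        findings.append("iSCSI SAN storage")
    if vm.thin_provisioned:
        findings.append("thin provisioned disks")
    if vm.scsi_bus_sharing.lower() == "none":
        findings.append("SCSI bus sharing set to None")
    if vm.scsi_controller_count == 1:
        findings.append("only one SCSI controller")
    if vm.memory_overcommit:
        findings.append("memory overcommit enabled")
    return findings


if __name__ == "__main__":
    # Made-up example that mirrors the settings described in the post.
    vm = ClusterVmSettings("sql-cluster-1", "iscsi", True, "none", 1, True)
    for item in flagged_settings(vm):
        print(f"{vm.name}: {item}")
```

The values would have to be filled in by hand or from whatever inventory export is available; the sketch only turns the checklist into something repeatable.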

Can anyone comment on whether all of the above restrictions (including the lack of support for iSCSI SANs) still apply to a single-node MSCS cluster? As you still have a quorum and are still reliant on the Cluster service, I would presume the restrictions still apply, but I haven't found anything that confirms it either way.

Does anyone also know whether there is any MSCS configuration that would allow snapshots? The reason for the original support call to VMware was that Backup Exec tried to snapshot the clusters (the snapshots failed) after the selection list was changed in error. It basically killed the clusters, and we had to build fresh VMs, restore C and System State from backup, and then reattach the clustered disks. I’d love to know exactly why the snapshot operation killed the clusters rather than just failing. If anyone can shed any light on that, I’d be really grateful.

Thanks in advance for any advice.

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
It's possible that your cluster configuration is not a "supported" configuration for VMware or Microsoft. But the questions to ask are whether the configuration works for you, whether you understand and can support it yourselves, and what risk this unsupported configuration poses to your organisation.

We have many clients that use Microsoft's iSCSI Initiator inside the VM, attached to iSCSI LUNs, for the Exchange datastore, and these are also clustered with 2003. We believe this is another configuration that VMware does not support, but our clients do not often call VMware for support on Microsoft products, and most had to configure it this way because NetApp's SnapDrive and SnapManager for Exchange need the Microsoft iSCSI Initiator to connect to the NetApp iSCSI LUN.

The bottom line is that MOST cluster configurations are only supported with RDM (RAW) LUNs.

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
I would like to understand why snapshotting causes this VM to fail.
Paul Solovyovsky, Senior IT Advisor, commented:
I have seen snapshots fail because the VM cannot free up its I/O while data is still transferring via iSCSI. Check whether you're using VSS and, if so, turn off VSS in VMware Tools. You should take snapshots on the SAN for the LUN and/or use some other type of backup. Also check for VSS errors in the event log.
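
As a rough illustration of the event log check, here is a minimal Python sketch that filters an event log exported to CSV for error entries from the VSS-related sources. The column names (`Type`, `Source`, `Message`) and the sample rows are assumptions made for this example, so adjust them to match whatever your export actually contains.

```python
import csv
import io

# Event sources usually associated with Volume Shadow Copy activity.
VSS_SOURCES = {"vss", "volsnap"}


def vss_errors(csv_text):
    """Return rows whose source is VSS-related and whose type is Error.

    Assumes the export has 'Type' and 'Source' columns (hypothetical layout).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["Source"].strip().lower() in VSS_SOURCES
        and row["Type"].strip().lower() == "error"
    ]


if __name__ == "__main__":
    # Made-up sample data, only there to show the expected shape.
    sample = (
        "Type,Source,Message\n"
        "Information,Service Control Manager,example message\n"
        "Error,VSS,example message\n"
        "Warning,VolSnap,example message\n"
    )
    for row in vss_errors(sample):
        print(row["Source"], "-", row["Message"])
```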

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, commented:
@paulsolov: Yes, I've seen snapshots fail. It's very common, and they've been failing since the beginning of time, but I've never seen the snapshot function kill a VM! (Okay, bad snapshot management, yes!)
Paul Solovyovsky, Senior IT Advisor, commented:
@hanccocka: I've seen a few scenarios, and the one you describe is the most common, but I have also seen snapshots freeze a VM and make it unusable. I would also look at cleaning up the VMs to ensure that they don't have any legacy HP, Dell, etc. applications that could cause weird things to occur. VCP 5.0 today, going for NCIE.

chrisstenson (Author) commented:
Thanks for your help, guys. We are setting up a test cluster to try to work out what causes the failure.

I spoke to an HP VMware specialist today who says that running a single-node cluster should be fine (so we can ignore the rules that normally apply to MSCS), as there is no other machine accessing the vmdk files. That seems logical.

Question has a verified solution.
