MSCS Clustering on VMware on an iSCSI SAN

Posted on 2011-10-07
Last Modified: 2012-05-12
I've come across a possible problem with two SQL Server MSCS 2003 clusters that were converted to VMs running on ESXi 4.0 with an HP P4000 SAN. Basically, one node was evicted from each cluster pre-conversion to leave us with two single-node clusters. These were then converted as-is (by a contractor), just as if they were regular servers as far as I can tell, and they have worked fine as single-node clusters (the clustered disks are just VMDK files on a LUN).

During a recent support call, a VMware engineer pointed out settings on the clusters that he said were wrong. He provided a document that I have now gone through, and I can find multiple inconsistencies on our clusters compared with the supported configurations in VMware's document. These include the alarming fact that apparently our SAN is not supported in any MSCS setup because it is iSCSI. The servers have thin-provisioned disks, SCSI bus sharing set to None, only one SCSI controller, and memory overcommit enabled.
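To keep track of the mismatches, I put them into a simple checklist. Here's a minimal sketch of that comparison as code; the setting names and "supported" values below are my own assumptions pieced together from the engineer's comments, not an official VMware schema:

```python
# Hypothetical audit of VM settings against VMware's MSCS support matrix.
# Setting names and required values are assumptions drawn from this thread,
# not an official VMware schema.

SUPPORTED_MSCS_SETTINGS = {
    "disk_provisioning": "eagerzeroedthick",  # thin provisioning is unsupported
    "scsi_bus_sharing": "physical",           # "none" is unsupported for shared disks
    "separate_scsi_controller_for_cluster_disks": True,
    "memory_overcommit": False,               # full memory reservation expected
    "storage_protocol": "fc",                 # iSCSI unsupported per the document
}

def audit_vm(settings: dict) -> list:
    """Return a list of settings that deviate from the supported configuration."""
    return [
        "{}: found {!r}, expected {!r}".format(key, settings.get(key), expected)
        for key, expected in SUPPORTED_MSCS_SETTINGS.items()
        if settings.get(key) != expected
    ]

# The configuration described above:
current = {
    "disk_provisioning": "thin",
    "scsi_bus_sharing": "none",
    "separate_scsi_controller_for_cluster_disks": False,
    "memory_overcommit": True,
    "storage_protocol": "iscsi",
}

for violation in audit_vm(current):
    print(violation)
```

Every one of the five settings deviates, which is what prompted this question.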

Can anyone comment on whether all of the above restrictions (including the lack of support for iSCSI SANs) still apply in a single-node MSCS cluster situation? As you still have a quorum and are still reliant on the Cluster service, I would presume the restrictions still apply, but I haven't found anything that confirms it either way.

Does anyone also know whether there is any MSCS configuration that would allow snapshots? The reason for the original support call to VMware was that the clusters were snapshotted by Backup Exec (the snapshots failed) after the selection list was changed in error. It basically killed the clusters, and we had to build fresh VMs, restore C: and the System State from backup, and then reattach the clustered disks. I'd love to know why exactly the snapshot operation killed the clusters rather than just failing; if anyone can shed any light on that, I'd be really grateful.

Thanks in advance for any advice.

Question by:chrisstenson
    LVL 116

    Accepted Solution

    It's possible that your cluster configuration is not a "supported" configuration by VMware or Microsoft. But the questions to ask are whether your cluster configuration works for you, whether you understand and can support it, and what risk this unsupported configuration poses to your organisation.

    We have many clients that use Microsoft's iSCSI Initiator inside the VM, attached to iSCSI LUNs for the Exchange datastore, which are also clustered with 2003. We believe this is another unsupported VMware configuration, but our clients do not often call VMware for support on Microsoft products. Most had to configure it this way because NetApp's SnapDrive and SnapManager for Exchange need the Microsoft iSCSI Initiator to connect to the NetApp iSCSI LUN.

    The bottom line is that MOST cluster configurations are only supported with RDM (raw) LUNs.
    LVL 116

    Expert Comment

    by:Andrew Hancock (VMware vExpert / EE MVE)
    I would like to understand why snapshotting caused this VM to fail.
    LVL 42

    Expert Comment

    I have seen snapshots fail because the VM cannot quiesce I/O while data is still transferring via iSCSI. Check whether you're using VSS; if so, try turning off VSS quiescing in VMware Tools. You should take snapshots on the SAN at the LUN level and/or use some other type of backup. Also check the event log for VSS errors.
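If you export the event log for review, the VSS check boils down to filtering for Error-level events from the VSS provider. A minimal sketch, assuming the events have already been exported into a list of records (the field names here are illustrative; a real export from wevtutil or Get-WinEvent would need its own parsing step):

```python
# Filter exported Windows event-log records for VSS errors.
# The record structure below is illustrative, not the actual export format.

def vss_errors(events):
    """Return events raised by the VSS provider at Error level."""
    return [
        e for e in events
        if e.get("provider") == "VSS" and e.get("level") == "Error"
    ]

sample_events = [
    {"provider": "VSS", "level": "Error",
     "message": "Volume Shadow Copy Service error: timeout during freeze"},
    {"provider": "Backup Exec", "level": "Information",
     "message": "Snapshot job started"},
    {"provider": "VSS", "level": "Warning",
     "message": "Writer reported a transient failure"},
]

for event in vss_errors(sample_events):
    print(event["message"])
```

A string of such errors around the time of the failed Backup Exec job would point at VSS quiescing as the trigger.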

    LVL 116

    Expert Comment

    by:Andrew Hancock (VMware vExpert / EE MVE)
    @paulsolov: Yes, I've seen snapshots fail; that's very common, and they've failed since the beginning of time, but I've never seen the snapshot function kill a VM! (Okay, bad snapshot management - yes!)
    LVL 42

    Assisted Solution

    @hanccocka: I've seen a few scenarios, and the one you describe is the most common, but I have also seen snapshots freeze a VM and make it unusable. I would also look at cleaning up the VMs to ensure they don't have any legacy HP, Dell, etc. applications that could cause weird things to occur. VCP 5.0 today... going for NCIE.


    Author Closing Comment

    Thanks for your help, guys. We are setting up a test cluster to try to work out what causes the failure.

    I spoke to an HP VMware specialist today who says that running a single-node cluster should be fine (so we can ignore the rules that normally apply to MSCS), as there is no other machine accessing the VMDK files. That seems logical.
