Aaron

Quorum drive failed for 2008 R2 SQL cluster

About an hour ago, the quorum drive on our cluster went into a "failed" state.  There are no errors in the event log leading up to the failure that offer any insight into the cause.  The setup is as follows:

2 X Server 2008 R2 with SQL 2008 in Active/Passive
Quorum drive is on a Xiotech 5000
All other shared resources reside on the same Xiotech and are functioning correctly.

There are no errors on the Xiotech, and the volume appears to be fine according to its management software.  We attempted to bring the resource online in Failover Cluster Manager and it failed.  We also tried the "repair" action on the quorum drive, and that failed as well.  We've brought down both servers and retried these tasks, with the same failures.  Any advice on bringing the quorum drive back online would be greatly appreciated.
Paul MacDonald

I wonder how the controller determines the drive is okay...

In any case, can you replace (even temporarily) the affected drive?  Format it, and treat it like it's your quorum drive, but empty?
Aaron

ASKER

The controller (a Xiotech 9000) reports the health of the volumes.  I also confirmed with Xiotech that the telemetry logs showed no errors in the most recent upload (20 minutes after the initial failure).  Can you provide any details on the process for replacing the quorum?  Right now it shows as "reserved" from both nodes, so I'm unable to do anything to it.  I can create and present a new volume, but I'm unsure what the process would be to use it as the quorum at that point.
Can you not simply replace the physical drive?  Or is this a virtual drive provisioned in a SAN?  If this is a provisioned drive, are you certain something about the provisioning hasn't changed?  Permissions, etc?

Are there any error messages on either cluster node?

Aaron

ASKER

All shared resources are on the Xiotech 5000 SAN.  Nothing about the provisioning has changed.  The drive shows up in the MPIO software as well as in disk management.  The issue is the failed status in cluster manager.
ASKER CERTIFIED SOLUTION
Aaron

This solution is only available to members.
Aaron

ASKER

No solutions were provided relevant to the issue.  I was able to find an outside resource that provided the solution.
Replacing the drive is what I was driving at.  I'm glad you were able to solve the problem.  

Still no idea what happened to the old drive?
Aaron

ASKER

Sorry for the confusion; I thought you were referring to a physical disk error and replacing a hard drive.  I still haven't found the underlying issue.  I did find a few initiator errors on the Brocade fiber switch, but the WWNs didn't match the hosts that were having the issues.  I'm leaning towards an issue with the Xiotech; it would be the third failure in our Xiotech array in the past few months.  If I find a cause, I'll update this for the knowledgebase.