Aaron
asked on
Quorum drive failed for 2008 R2 SQL cluster
About an hour ago our Quorum drive on our cluster went into a "failed" state. There are no errors in the event log leading up to the failure that offer any insight into the cause. The setup is as follows:
2 X Server 2008 R2 with SQL 2008 in Active/Passive
Quorum drive is on a Xiotech 5000
All other shared resources reside on the same Xiotech and are functioning correctly.
There are no errors on the Xiotech, and the volume appears to be fine according to its management software. We attempted to bring the resource online in Failover Cluster Manager and it failed. We tried the "repair" action on the quorum drive and that failed as well. We've brought down both servers and attempted these tasks again with the same failures. Any advice on bringing the quorum drive back online would be greatly appreciated.
ASKER
The controller (a Xiotech 9000) reports the health of the volumes. I also confirmed with Xiotech that the telemetry logs showed no errors in the most recent upload (20 minutes after the initial failure). Can you provide any details on the process for replacing the quorum? Right now it shows as "reserved" from both nodes, so I'm unable to do anything to it. I can create and present a new volume, but I'm unsure what the process would be to use it as the quorum at that point.
Can you not simply replace the physical drive? Or is this a virtual drive provisioned in a SAN? If this is a provisioned drive, are you certain something about the provisioning hasn't changed? Permissions, etc?
Are there any error messages on either cluster node?
ASKER
All shared resources are on the Xiotech 5000 SAN. Nothing about the provisioning has changed. The drive shows up in the MPIO software as well as in disk management. The issue is the failed status in cluster manager.
ASKER
No solutions were provided relevant to the issue. I was able to find an outside resource that provided the solution.
Replacing the drive is what I was driving at. I'm glad you were able to solve the problem.
Still no idea what happened to the old drive?
ASKER
Sorry for the confusion; I thought you were referring to a physical disk error and replacing a hard drive. I still haven't found the underlying issue. I did find a few initiator errors on the Brocade fiber switch, but the WWNs didn't match the hosts that were having the issues. I'm leaning towards an issue with the Xiotech; it would mark the third failure in our Xiotech array in the past few months. If I find a cause, I'll update this for the knowledgebase.
In any case, can you replace (even temporarily) the affected drive? Format it, and treat it like it's your quorum drive, but empty?