bntech

Quorum Disk

I have a two-node SQL cluster.
When the quorum disk (aka witness disk) is on node 1 along with all other services and I kill node 1, it doesn't fail over. I might not be waiting long enough here.

When node 1 has the quorum disk and node 2 has the other services, and I fail node 2, it works just fine, as the quorum disk never went offline.

It seems like the quorum disk doesn't move to the live node.

Is this true? Am I not waiting long enough? What is the best option for a two-node cluster? I have Node and Disk Majority configured.


Windows Server 2008, Microsoft SQL Server

8/22/2022 - Mon
myrotarycar

Silly question to kick things off: did you set up your quorum disk to have both of your nodes as possible owners?
bntech

ASKER
yep
myrotarycar

  1. Any indications of failure on the event log?
  2. After you kill node one, which resources successfully transition to node 2?
  3. Can you show me the advanced settings of Q? i.e. see screenshot


Untitled-picture1.png
myrotarycar

Also, if this is not in production yet...what happens if you literally bounce your first node?
Ian Meredith

How a cluster deals with quorum under Windows 2008 is very different from how it worked under Windows 2003.

See here for further clarification: http://technet.microsoft.com/en-us/library/cc770620%28WS.10%29.aspx

Consider which mode your quorum is operating in.
bntech

ASKER
This is indeed a Windows 2008 cluster, and my mode is set to:

Node and Disk Majority: Each node plus a designated disk in the cluster storage (the “disk witness”) can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.


From what I read, and I may be wrong, it is n-1 if the disk witness is offline.
So, for Node and Disk Majority on my two nodes:
Can sustain the failure of 1 node with the witness disk online.
Can sustain the failure of 0 nodes if the witness disk goes offline or fails.

I am confused about why the quorum disk didn't fail over, unless it didn't meet the timeout period. The whole cluster failed when the node with the quorum disk failed.

It works and fails over when the quorum disk stays online.
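
The vote counting described above can be sketched in a few lines of Python. This is only an illustration of the "more than half of the votes" rule for a two-node cluster with one disk witness (three votes in total); the function name `has_quorum` and the scenario labels are made up for this example and are not part of any Windows clustering API.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """Majority rule: strictly more than half of all configured votes."""
    return votes_online > total_votes / 2


# Assumption for this sketch: two nodes plus one disk witness, one vote each.
TOTAL_VOTES = 3

# Votes still online in each scenario (hypothetical labels).
scenarios = {
    "both nodes up, witness online": 3,
    "one node down, witness online": 2,
    "one node down, witness offline": 1,
}

for name, votes in scenarios.items():
    state = "quorum" if has_quorum(votes, TOTAL_VOTES) else "no quorum"
    print(f"{name}: {votes}/{TOTAL_VOTES} votes -> {state}")
```

The last scenario is the one from the question: if the surviving node cannot bring the witness disk online, it holds only 1 of 3 votes and the cluster stops.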
ASKER CERTIFIED SOLUTION
Ian Meredith

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
NOVBusApps

Neither of those actually explains why the quorum disk didn't fail over, unless it's really down to luck. If I read this right, it means that if you're lucky, the node that fails isn't the one that also happens to be hosting the quorum disk. If you're not, and the failed node is hosting the quorum disk, then both the node and the disk are down, and you're explaining to your soon-to-be ex-boss that high availability really means "cross your fingers"!
CMatSD

Neither of them did explain it. Here is what you need to do: the quorum disk's time to fail over is set (by default) to 15 minutes, so set it down to a more useful number (such as 30 secs). In Failover Cluster Manager, choose Storage and right-click on your quorum disk. Choose Properties and then the Policy tab. Set the time to restart the resource to 30 secs and the attempts to 1. Make sure that 'If restart is unsuccessful then fail over all resources or services in this application' is ticked.

Now, when the node with the quorum disk dies, after 30 secs the quorum disk fails over onto the other node, which then starts bringing up any services that were on the failed node.