
  • Status: Solved

Every morning I get this error from my Windows Server 2012 clustered VMs: "Windows Cluster Service has become unavailable (temporary or lost quorum)". Any ideas?

[External] SQL Server Alert System: 'Windows Failover Cluster Service unavailable/AG failover occurred' occurred on XXXXXX

DATE/TIME:      2/19/2015 7:57:42 AM

DESCRIPTION:    (None)

COMMENT:        One or both of the following has occurred:

1. Windows - Windows Cluster Service has become unavailable (temporary or lost quorum);
2. SQL Server - Availability Group lost connection or AG failover has occrurred.

Please check and take appropriate actions.

JOB RUN:        (None)

There are 4 VMware VMs in a SQL Server 2012 cluster, and the OS is Windows Server 2012 as well. We keep getting these errors every morning. Any ideas?
Asked by: Harper McDonald
1 Solution
 
ste5an (Senior Developer) Commented:
Have you checked which of the two options, 1 or 2, actually occurred?
 
Harper McDonald (Author) Commented:
This is happening on multiple clusters in our environment. Get-ClusterLog generates cluster.log, but nothing in it shows why or what happened. The event logs don't say much either, just that the quorum has been lost. Our network team has looked into it and found nothing in three logs. We are running in a FlexPod environment with UCS, NetApp storage and Nexus switches, so the gear is up to date. I didn't know if someone might be having the same issue or have a solution. These are VMware clusters, and the NICs all use the VMXNET3 driver.
 
ste5an (Senior Developer) Commented:
Three logs doesn't sound like much. Or do you have log consolidation?

You need to correlate all logs using the timestamp from the error message above, plus or minus a meaningful grace period.
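
The correlation step described above could be sketched roughly as follows. This is only an illustration in Python, not a tool anyone in this thread used: the function name `entries_near`, the five-minute grace period, and every sample log entry are assumptions made up for the example. Only the alert timestamp is taken from the alert in the question.

```python
from datetime import datetime, timedelta

# Timestamp format as it appears in the alert e-mail in the question.
ALERT_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def entries_near(alert_time, entries, grace=timedelta(minutes=5)):
    """Return the (timestamp, source, message) tuples that fall within
    +/- grace of alert_time, sorted by timestamp."""
    lo, hi = alert_time - grace, alert_time + grace
    return sorted(e for e in entries if lo <= e[0] <= hi)

alert = datetime.strptime("2/19/2015 7:57:42 AM", ALERT_FORMAT)

# Hypothetical entries, already parsed from several log sources.
# The sources and messages are placeholders, not real log output.
entries = [
    (datetime(2015, 2, 19, 7, 30, 0), "system-eventlog", "sample entry outside the window"),
    (datetime(2015, 2, 19, 7, 57, 40), "cluster.log", "sample entry A"),
    (datetime(2015, 2, 19, 7, 55, 10), "sql-errorlog", "sample entry B"),
]

nearby = entries_near(alert, entries)
for ts, source, msg in nearby:
    print(ts.isoformat(), source, msg)
```

Before comparing, all timestamps should be brought into the same time zone. cluster.log is normally written in UTC unless local time was requested when the log was generated, while the alert e-mail most likely shows local server time.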

 
Harper McDonald (Author) Commented:
We have tried that, including the SQL Server logs and verbose logging. It's very strange. We have even increased the VM resources.
 
ste5an (Senior Developer) Commented:
Does it happen at the same time or is there any other pattern?
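
One quick way to look for such a pattern is to bucket the alert timestamps by hour of day. The sketch below assumes the alert e-mails have been collected as plain strings in the format shown in the question. Apart from the first timestamp, the sample values are hypothetical, and `alerts_by_hour` is just a name chosen for the example.

```python
from collections import Counter
from datetime import datetime

# Timestamp format as it appears in the alert e-mail in the question.
ALERT_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def alerts_by_hour(alert_strings):
    """Count how many alerts fell into each hour of the day (0-23)."""
    hours = (datetime.strptime(s, ALERT_FORMAT).hour for s in alert_strings)
    return Counter(hours)

sample_alerts = [
    "2/19/2015 7:57:42 AM",  # taken from the alert in the question
    "2/20/2015 3:12:05 AM",  # hypothetical
    "2/21/2015 3:40:51 AM",  # hypothetical
]

histogram = alerts_by_hour(sample_alerts)
for hour, count in sorted(histogram.items()):
    print(f"{hour:02d}:00  {'#' * count}")
```

A cluster of alerts in one or two hours would point towards a scheduled job. An even spread would make a scheduled cause less likely.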
 
Harper McDonald (Author) Commented:
It usually happens very early in the morning, but there isn't much of a pattern beyond that. I need to get with the backup admin and see whether jobs run on specific clusters at that time. That might shed some light.
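
Once the backup schedule is known, checking an alert against it is straightforward. The following is a minimal sketch under stated assumptions: the job names and time windows are invented placeholders, since the actual backup schedule is not given in this thread, and `in_window` is a helper written for the example.

```python
from datetime import datetime, time

def in_window(moment, start, end):
    """True if moment's time of day lies in [start, end].
    Handles windows that cross midnight (start later than end)."""
    t = moment.time()
    if start <= end:
        return start <= t <= end
    return t >= start or t <= end

# Hypothetical backup windows; replace with the real schedule.
backup_windows = {
    "hypothetical-nightly-backup": (time(23, 0), time(4, 0)),
    "hypothetical-snapshot": (time(7, 45), time(8, 15)),
}

# Alert time taken from the alert in the question.
alert = datetime(2015, 2, 19, 7, 57, 42)

matches = [name for name, (start, end) in backup_windows.items()
           if in_window(alert, start, end)]
print(matches)
```

An overlap does not prove that the backup job causes the quorum loss, but it narrows down which logs to read first.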
 
Vitor Montalvão (MSSQL Senior Engineer) Commented:
The AG depends on the Windows cluster, so the second error should be a consequence of the first one.
The quorum is the only resource that is shared, so I would check with the storage team what is happening with that disk.
 
Harper McDonald (Author) Commented:
Removed the vNIC and reinstalled it on the cluster nodes.
 
Harper McDonald (Author) Commented:
It fixed the problem.