marrowyung

SQL server 2012 BPA and Baseline Configuration Analyzer 2.0

hi,

I know there is a tool called SQL Server 2012 BPA, but what is Baseline Configuration Analyzer 2.0 for?

https://www.microsoft.com/en-us/download/details.aspx?id=16475&ranMID=24542&ranEAID=TnL5HPStwNw&ranSiteID=TnL5HPStwNw-1rOq5KrHgd81zk7MZ0EFsA&tduid=(a1869e06b625aa1d46e1faf25a014ec9)(256380)(2459594)(TnL5HPStwNw-1rOq5KrHgd81zk7MZ0EFsA)()

Is it for scanning SQL Server or Windows?

I'd like to know whether it can also help detect potential problems with our AOG.
Rich Weissler

> for scanning SQL server or Windows ?
Yep.  Windows, SQL, and/or a number of other Microsoft products.

> I'd like to know if this one can also help to detect any potential problem of our AOG group.
Unfortunately, I don't see anything in the help files included with the SQL 2012 BPA download that specifically spells out checks against AOG or clustering.  (Replication gets its own section, though.)
marrowyung

ASKER

OK, are there any tools that can help check the health of the Windows failover cluster that the AOG is built on top of?

"Yep.  Windows, SQL, and/or a number of other Microsoft products."

I want to find out why the cluster service failed on the secondary node of the Windows cluster.
The BPAs will find places where best practices aren't being followed, and usually represent proactive steps to be taken to avoid failures.  

If the cluster service itself is failing, it probably isn't SQL.  When you attempt to start the service and it fails, are there associated events written to the Windows event logs?  (I'd be looking primarily at the System event log, but with an eye on the Application event log as well.)  It might also be worth looking in Event Viewer under [Applications and Services Logs]/Microsoft/Windows/[FailoverClustering]* logs.  There should be half a dozen or so, although not all of them will necessarily have events.
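
As a rough sketch only, assuming Python is available on the node, the same logs can be queried from the command line through the built-in `wevtutil` utility. The helper names below are made up for illustration, and the Operational channel name is an assumption to confirm in Event Viewer on your own nodes.

```python
import subprocess

# XPath filter for Critical (1), Error (2) and Warning (3) events.
PROBLEM_EVENTS_QUERY = "*[System[(Level=1 or Level=2 or Level=3)]]"

# Logs worth checking first. The FailoverClustering channel name is an
# assumption; confirm the exact name in Event Viewer on your own nodes.
CHANNELS = [
    "System",
    "Application",
    "Microsoft-Windows-FailoverClustering/Operational",
]


def build_wevtutil_command(channel, max_events=20):
    """Build a wevtutil command listing the newest problem events in one log."""
    return [
        "wevtutil", "qe", channel,
        "/q:" + PROBLEM_EVENTS_QUERY,
        "/c:" + str(max_events),  # cap the number of events returned
        "/rd:true",               # newest events first
        "/f:text",                # plain-text output instead of XML
    ]


def recent_problem_events(channel, max_events=20):
    """Run the query on a Windows host and return wevtutil's text output."""
    result = subprocess.run(
        build_wevtutil_command(channel, max_events),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a cluster node, calling `recent_problem_events(name)` for each entry in `CHANNELS` would print the latest warnings and errors from each log; it only reads what Event Viewer already shows.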
"If the cluster service itself is failing, it probably isn't SQL.  When you attempt to start the service and it fails, are there associated events which are written to the windows event logs (I'd be looking at the system event log primarily, but with an eye on the application event log as well.

Yes, I dug into it, and it seems some cluster resources failed to start.

"/Microsoft/Windows/[FailoverClustering]*  logs"

So there are logs dedicated to failover clustering only?

So for the health of the cluster, can I rely only on the event log? I am looking for tools to make sure that the cluster is healthy.
ASKER CERTIFIED SOLUTION
Rich Weissler

"*confused*  Do you have two issues?  Cluster service failing on the secondary node, and some cluster resources failing on the primary node?  "

Yes, we had an incident. We failed over to the secondary replica and patched the primary replica, then kept using the secondary replica over the weekend. The disk on the secondary replica then filled up, and at that point we found that it could not fail back, either manually or automatically. So everyone is blaming AOG's capability.

When I looked at the secondary replica, there were cluster failure messages from that time, and it seems some dependent resources failed during that period.

So I expect that, because of this, the secondary replica couldn't fail back to the primary, because the cluster didn't even exist.

So this is not AOG; the underlying cluster service has the problem. We need a way to detect that, and before that, I need a tool to scan it and see if there are MORE problems behind it.

"You can use the Windows Failover Cluster Manager to confirm health to that level.."

How?

" And yes, one of the ways the Cluster management tools confirms healthiness is to report on whether there are errors or warning related to the failover cluster."

How can I do that? How can I get this tool?

"Any other tools which checks for cluster health will also check the event logs."

Do you already have such a tool in mind? What is it? Idera DM?

"A BPA is a Best Practice Analyzer.  It'll let you know if you're following best practices, not necessarily whether your system is healthy.  "

So this one can't check cluster-related problems.
So your conclusion is that for cluster problems we can only rely on the failover cluster log from the failover cluster management tools?
hi,

Do you think this tool can do the job AFTER the cluster is set up:

https://www.microsoft.com/en-us/download/details.aspx?id=8529
> So your conclusion is that for cluster problems we can only rely on the failover cluster log from the failover cluster management tools?

Nope.  The conclusion from me is that the failover cluster logs (as well as the Windows System and Application logs) are critical tools that you need to use first to help identify cluster problems.  

Once everything is functional, I'd encourage you to run the BPA for SQL 2012 as well as look at the BPA results for Windows Server.  They may help proactively identify issues.

> Do you think this tool can do the job AFTER the cluster is set up:

If you are building a cluster on Windows 2003 or Windows 2000, yes.  Otherwise no.  
The equivalent tools are built into the Failover Cluster software in 2008+.  Both the tool at the link provided and the cluster validation in currently supported versions are usually run before you configure the cluster.  You can run at least a subset of the validation tools safely once the cluster is online as well.
Just to double check: Do you have event viewer loaded on your server, or is it running on core?  (You should also be able to load it on a desktop workstation and connect to the server.)  The Failover Cluster Management tool can be installed with the Failover Cluster feature -- it's the associated tool.  You can also obtain the tools by loading the appropriate RSAT tool set for your desktop management system.  (What version of Windows Server are you using?)
"Do you have event viewer loaded on your server,"

Yes.

"is it running on core?  (You should also be able to load it on a desktop workstation and connect to the server.)"

What does running on Core mean?

"You can also obtain the tools by loading the appropriate RSAT tool set for your desktop management system.  (What version of Windows Server are you using?)"

Windows 2012 R2.

What is the appropriate RSAT tool set?

"Once everything is functional, I'd encourage you to run the BPA for SQL 2012 as well as look at the BPA results for Windows Server.  They may help proactively identify issues."

I will do that very soon, but you said:

  Unfortunately I don't see anything in the help files which are included in the BPA SQL 2012 download which specifically spell out checks against AOG or clustering

"If you are building a cluster on Windows 2003 or Windows 2000, yes.  Otherwise no.  "

In Windows 2012, I guess I should right-click the cluster and select Validate Cluster, right? Is that the same thing?

One more thing: I just checked with our IT team, because I read a white paper on multi-subnet/multi-site SQL 2012 failover that talks about quorum. Our team said that starting from SQL 2014 a Windows cluster doesn't need quorum anymore. Is that right? Then how can SQL Server 2012 in the AOG see the quorum and vote for which one is the primary?

Our IT team only said that starting from SQL 2014, AOG doesn't need quorum. Is that right?

Is that why I can't see the quorum disk resource in Failover Cluster Manager?
> What does running on Core mean?

Server Core is the option to run current versions of Windows Server operating systems without a GUI.  In 2012 R2, there are two levels of Core -- Core with a minimal GUI so you can run some of the GUI management utilities, and Core without a GUI.  Without a GUI, you'd either need to manage the server remotely... or know all the command-line/PowerShell commands to perform the operations you want.

> What is the appropriate RSAT tool set?

Remote Server Administration Tools (RSAT) are the tools which would be loaded on an administrator's workstation to allow remote management of a server.  The Failover Clustering Tools are available in RSAT.

> will do it very soon but you said:

  Unfortunately I don't see anything in the help files which are included in the BPA SQL 2012 download which specifically spell out checks against AOG or clustering


Yes, that is what I said.  Knowing what might be configured contrary to best practices in SQL, outside AOG or clustering, will still be of value.  I feared that you were taking what I said about looking in the event logs for the cause of the failover cluster failures to mean that you should use only the event logs, which I would strongly encourage you not to do.  SQL is part of your cluster, and a failure in some portion of SQL could well be picked up by the clustering service (there is communication which takes place).  Your earlier description of the cluster service itself failing wouldn't be SQL, though.  The failure of a node very much could be a failure of SQL (or a failure of another resource within the node).

> In Windows 2012, I guess I should right-click the cluster and select Validate Cluster, right? Is that the same thing?

Yes.  Before you run the validation tests, (a) be sure you understand what the tests are doing, and (b) realize there will likely be a performance hit on your system.  Don't run them during a peak time.

> One more thing: I just checked with our IT team, because I read a white paper on multi-subnet/multi-site SQL 2012 failover that talks about quorum. Our team said that starting from SQL 2014 a Windows cluster doesn't need quorum anymore. Is that right? Then how can SQL Server 2012 in the AOG see the quorum and vote for which one is the primary?

> Our IT team only said that starting from SQL 2014, AOG doesn't need quorum. Is that right?

> Is that why I can't see the quorum disk resource in Failover Cluster Manager?


I apologize.  You've now left my area of knowledge.  :-)  I'm not aware of a difference in the quorum requirements between SQL 2012 and 2014 -- and neither requires a quorum disk -- there are several quorum options.  I thought that with a multisite cluster, the quorum would usually be configured with a file share witness.  I would be surprised if that changed in SQL 2014 -- but like I said, it is outside of what I know.
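
For what it's worth, the basic majority idea behind quorum can be sketched in a few lines of Python. This is a simplified illustration with made-up function names, not how the cluster service is implemented: it assumes every node and the witness hold one fixed vote each, and it ignores the dynamic vote adjustment that newer Windows Server versions can apply.

```python
def votes_needed(total_votes):
    """Smallest number of votes that is a strict majority."""
    return total_votes // 2 + 1


def has_quorum(votes_online, total_votes):
    """True when the voters that are still reachable form a majority."""
    return votes_online >= votes_needed(total_votes)


# Hypothetical two-node cluster with a file share witness: 3 votes in total.
# One node down, the other node plus the witness still reachable: 2 of 3.
print(has_quorum(votes_online=2, total_votes=3))  # True

# The same two nodes with no witness: 2 votes in total, one node down.
print(has_quorum(votes_online=1, total_votes=2))  # False
```

The second case is why a witness (disk or file share) is normally added to a cluster with an even number of nodes: without it, losing one of two nodes leaves no majority.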
"You can run at least a subset of the validation tools safely once the cluster is online as well."

So you mean that AFTER the cluster is set up, we can run SOME tests, a subset of the full test?

"Yes.  Before you run the validation tests, (a) be sure you understand what the tests are doing, and (b) realize there will likely be a performance hit on your system."

Do you mean that this link: https://blogs.msdn.microsoft.com/clustering/2011/06/28/validating-a-cluster-with-zero-downtime/

shows us what the cluster validation checks are going to do?
> "You can run at least a subset of the validation tools safely once the cluster is online as well."
> So you mean that AFTER the cluster is set up, we can run SOME tests, a subset of the full test?


I mean exactly what I said.  AFTER the cluster is set up, you can run AT LEAST SOME TESTS to validate SAFELY.  Some of the tests are DISRUPTIVE.

> Do you mean that this link: https://blogs.msdn.microsoft.com/clustering/2011/06/28/validating-a-cluster-with-zero-downtime/
> shows us what the cluster validation checks are going to do?


I mean that you should probably scan through that one-page blog post, if not read it carefully.  In brief, it provides basic information that will help you validate a cluster while minimizing the chance that the validation will disrupt cluster operations.  There are links in that blog post which provide additional details about the individual checks.
Thanks, I will come back when I need more information.
Thanks.