Solved

High latch wait time on SQL 2008 cluster

Posted on 2012-03-28
Last Modified: 2016-11-23
I'm seeing an issue on my production DB cluster. It's a SQL 2008 R2 cluster connected to a Dell Equallogic SAN. The servers in the cluster connect to the SAN via the MS iSCSI initiator.
I notice the latch wait time is always between 500ms and 3000ms, even when the server is not processing much. The guide value for this counter from MS is <300ms. I ran a query which showed me that most of the waits are on the BUFFER latch class.
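For anyone wanting to reproduce the latch-class breakdown mentioned above, here is a rough sketch. The T-SQL reads sys.dm_os_latch_stats, which is probably the sort of query used, but the original post does not show it, so treat that as an assumption. The Python helper `rank_latch_classes` and the sample rows are hypothetical and only illustrate how to turn the DMV output into a share of total wait time per latch class.

```python
# T-SQL text only; running it needs a SQL Server connection, which is not shown here.
# Assumption: the original poster's query looked roughly like this.
LATCH_STATS_QUERY = """
SELECT latch_class, waiting_requests_count, wait_time_ms, max_wait_time_ms
FROM sys.dm_os_latch_stats
WHERE wait_time_ms > 0
ORDER BY wait_time_ms DESC;
"""


def rank_latch_classes(rows):
    """Return (latch_class, share_of_total_wait, avg_wait_ms) tuples,
    sorted from the largest total wait time to the smallest."""
    total = sum(r["wait_time_ms"] for r in rows)
    ranked = []
    for r in sorted(rows, key=lambda r: r["wait_time_ms"], reverse=True):
        share = r["wait_time_ms"] / total if total else 0.0
        waits = r["waiting_requests_count"]
        avg = r["wait_time_ms"] / waits if waits else 0.0
        ranked.append((r["latch_class"], share, avg))
    return ranked


# Hypothetical sample rows, made up for illustration only.
sample_rows = [
    {"latch_class": "LOG_MANAGER", "waiting_requests_count": 100, "wait_time_ms": 10000},
    {"latch_class": "BUFFER", "waiting_requests_count": 4000, "wait_time_ms": 900000},
    {"latch_class": "ACCESS_METHODS_DATASET_PARENT", "waiting_requests_count": 500, "wait_time_ms": 90000},
]

for name, share, avg in rank_latch_classes(sample_rows):
    print(f"{name}: {share:.0%} of latch wait time, {avg:.0f} ms per wait")
```

Note that the DMV values are cumulative since the last restart (or since the stats were cleared), so comparing two snapshots taken some minutes apart gives a better picture of current behaviour than a single read.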
I have another DB cluster connected to the same SAN, and its latch wait time is usually 0, only very occasionally going above that. I also have a backup environment, where the same code and databases are present. There, the latch wait time is always 0 too. The only difference in that environment is that there is no cluster, and it is using local storage.
The symptom for end users is that processing between the app server and the DB server is slow.
When I run perfmon, disk queue length looks OK. Are there any other areas where the bottleneck may be?
Question by:sherryfitzgroup
LVL 40

Assisted Solution

by:lcohan
lcohan earned 500 total points
ID: 37778062
 
LVL 2

Accepted Solution

by:sherryfitzgroup
sherryfitzgroup earned 0 total points
ID: 37805008
The problem ended up being with the application server that was writing to the DB. I narrowed it down by setting up disk, CPU and memory counters on the DB server. I could see that there was no bottleneck anywhere that coincided with the latch waits.
I took a look at the SAN for the app server, and could see that there were a lot of VMs on the same link to the SAN, and the response time was quite slow (measured with a free storage response time monitor). So I moved the app VM to a different datastore, and also changed the SAN so that each datastore had a different preferred path/controller.
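The "did any bottleneck coincide with the latch waits" check described above can be sketched roughly as follows, assuming the perfmon counters have been exported and lined up by sample time. The field names (`latch_wait_ms`, `cpu_pct`, `disk_latency_ms`, `mem_used_pct`), the resource thresholds and the sample values are all hypothetical; only the 300ms latch guideline comes from the question.

```python
def coincidence_rates(samples, latch_threshold_ms, resource_thresholds):
    """For each resource counter, return the fraction of latch-wait spikes
    during which that counter was also above its threshold."""
    spikes = [s for s in samples if s["latch_wait_ms"] > latch_threshold_ms]
    rates = {}
    for counter, limit in resource_thresholds.items():
        hits = sum(1 for s in spikes if s[counter] > limit)
        rates[counter] = hits / len(spikes) if spikes else 0.0
    return rates


# Hypothetical samples, one dict per perfmon interval (values made up).
samples = [
    {"latch_wait_ms": 1200, "cpu_pct": 20, "disk_latency_ms": 4, "mem_used_pct": 60},
    {"latch_wait_ms": 50, "cpu_pct": 85, "disk_latency_ms": 5, "mem_used_pct": 60},
    {"latch_wait_ms": 2500, "cpu_pct": 25, "disk_latency_ms": 6, "mem_used_pct": 62},
    {"latch_wait_ms": 800, "cpu_pct": 30, "disk_latency_ms": 25, "mem_used_pct": 61},
]

# Hypothetical thresholds; pick values that suit your own hardware.
resource_thresholds = {"cpu_pct": 80, "disk_latency_ms": 20, "mem_used_pct": 90}

rates = coincidence_rates(samples, 300, resource_thresholds)
for counter, rate in rates.items():
    print(f"{counter}: above threshold in {rate:.0%} of latch spikes")
```

Low rates across the board would point away from a local CPU, memory or disk bottleneck on the DB server, which is the conclusion reached here.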
 
LVL 2

Author Closing Comment

by:sherryfitzgroup
ID: 37820960
Awarding points, as lcohan was the only one to help me.
 
LVL 2

Author Comment

by:sherryfitzgroup
ID: 37968931
I only realised after a month that I hadn't resolved the long latch wait time by moving the app server.
The actual solution was down to SAN paths. I only had one active path to the SAN from the DB server. I configured SAN multipathing with the Dell Equallogic HIT kit.
Now I have 2 active paths, and my latch wait time is down to about 150ms.
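To check a before/after figure like this outside perfmon, one option is to sample the latch counters in sys.dm_os_performance_counters twice and divide the deltas. This is a minimal sketch, assuming the "Average Latch Wait Time (ms)" counter and its matching "Average Latch Wait Time Base" counter are read as cumulative raw values. The helper name and the snapshot numbers are hypothetical; the 300ms guideline is the one quoted in the question.

```python
# T-SQL text only; running it needs a SQL Server connection, which is not shown here.
LATCH_COUNTER_QUERY = """
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%:Latches%'
  AND counter_name LIKE 'Average Latch Wait Time%';
"""

GUIDELINE_MS = 300  # guide value quoted in the question


def average_latch_wait_ms(first, second):
    """Each snapshot is (cumulative_wait_ms, cumulative_base).
    Returns the average wait per latch request between the two snapshots."""
    delta_wait = second[0] - first[0]
    delta_base = second[1] - first[1]
    if delta_base <= 0:
        return 0.0  # no latch waits recorded in the interval
    return delta_wait / delta_base


# Hypothetical snapshots taken a few minutes apart (values made up).
snapshot_1 = (1_000_000, 2_000)
snapshot_2 = (1_150_000, 3_000)

avg = average_latch_wait_ms(snapshot_1, snapshot_2)
status = "within" if avg < GUIDELINE_MS else "above"
print(f"Average latch wait: {avg:.0f} ms ({status} the {GUIDELINE_MS} ms guideline)")
```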
