Solved

High latch wait time on SQL 2008 cluster

Posted on 2012-03-28
Last Modified: 2016-11-23
I'm seeing an issue on my production DB cluster. It's a SQL Server 2008 R2 cluster connected to a Dell EqualLogic SAN. The servers in the cluster connect to the SAN via the Microsoft iSCSI initiator.
I notice the latch wait time is always between 500 ms and 3000 ms, even when the server is not processing much. The guide value for this counter from Microsoft is <300 ms. I ran a query which showed me that most of the waits are on the BUFFER latch class.
I have another DB cluster connected to the same SAN, where latch wait time is usually 0 and only very occasionally goes above that. I also have a backup environment, where the same code and databases are present. Here, the latch wait time is always 0 too. The only difference in this environment is that there is no cluster, and it is using local storage.
The symptom for end users is that processing between the app server and the DB server is slow.
When I run perfmon, disk queue length looks OK. Are there any other areas where the bottleneck may be?
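For anyone repeating the latch-class check described above, here is a minimal sketch of ranking latch classes by accumulated wait time. It assumes rows shaped like the `latch_class`, `waiting_requests_count` and `wait_time_ms` columns of the `sys.dm_os_latch_stats` view. The sample figures and the names `LatchRow` and `rank_latch_classes` are hypothetical and are not taken from this thread.

```python
from typing import NamedTuple

# T-SQL that could supply the rows (run it with whatever client you prefer).
LATCH_STATS_SQL = """
SELECT latch_class, waiting_requests_count, wait_time_ms
FROM sys.dm_os_latch_stats
WHERE waiting_requests_count > 0;
"""


class LatchRow(NamedTuple):
    latch_class: str
    waiting_requests_count: int
    wait_time_ms: int


def rank_latch_classes(rows):
    """Return (latch_class, avg_wait_ms, share_of_total_pct), worst first."""
    rows = list(rows)
    total_wait = sum(r.wait_time_ms for r in rows)
    ranked = []
    for r in sorted(rows, key=lambda r: r.wait_time_ms, reverse=True):
        # Average wait per request for this class; guard against empty counts.
        avg = r.wait_time_ms / r.waiting_requests_count if r.waiting_requests_count else 0.0
        # Share of all latch wait time attributable to this class.
        share = 100.0 * r.wait_time_ms / total_wait if total_wait else 0.0
        ranked.append((r.latch_class, round(avg, 1), round(share, 1)))
    return ranked


# Hypothetical sample values, for illustration only.
SAMPLE_ROWS = [
    LatchRow("LOG_MANAGER", 50, 5_000),
    LatchRow("BUFFER", 12_000, 9_600_000),
    LatchRow("ACCESS_METHODS_DATASET_PARENT", 300, 45_000),
]

if __name__ == "__main__":
    for latch_class, avg_ms, share_pct in rank_latch_classes(SAMPLE_ROWS):
        print(f"{latch_class}: avg {avg_ms} ms, {share_pct}% of total wait")
```

The values in this view accumulate from the time the instance started (or the statistics were last cleared), so comparing two snapshots taken some minutes apart is likely to be more informative than a single reading.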
Question by:sherryfitzgroup
4 Comments
 
LVL 39

Assisted Solution

by:lcohan
lcohan earned 500 total points
ID: 37778062
 
LVL 2

Accepted Solution

by:
sherryfitzgroup earned 0 total points
ID: 37805008
The problem ended up being with the application server that was writing to the DB. I narrowed it down by setting up disk, CPU and memory counters on the DB server. I could see that there was no bottleneck anywhere that coincided with the latch waits.
I took a look at the SAN for the app server and could see that there were a lot of VMs on the same link to the SAN, and the response time was quite slow (I used a free storage response monitor). So I moved the app VM to a different datastore, and also changed the SAN so that each datastore had a different preferred path/controller.
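One way to make the "did any bottleneck coincide with the latch waits" check less of an eyeball exercise is to correlate the counter series sampled at the same intervals. The sketch below is only an illustration: the sample numbers are made up, and the function name `pearson` and the series names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long series of samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A flat series cannot move together with anything.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Hypothetical perfmon samples taken at the same timestamps.
latch_wait_ms = [520, 2900, 610, 2400, 800, 3000]
disk_latency_ms = [6, 5, 7, 6, 5, 6]
cpu_percent = [12, 14, 11, 13, 12, 15]

if __name__ == "__main__":
    for name, series in (("disk latency", disk_latency_ms), ("CPU", cpu_percent)):
        print(f"latch wait vs {name}: r = {pearson(latch_wait_ms, series):.2f}")
```

A value near 0 suggests the counter does not move with the latch waits, while a value near 1 points at a resource worth a closer look. Correlation on a handful of samples is weak evidence either way, so treat it as a hint rather than a diagnosis.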
 
LVL 2

Author Closing Comment

by:sherryfitzgroup
ID: 37820960
Awarding points, as lcohan was the only one to help me.
 
LVL 2

Author Comment

by:sherryfitzgroup
ID: 37968931
I only realised after a month that I hadn't resolved the long latch wait time by moving the app server.
The actual solution was down to SAN paths. I only had one active path to the SAN from the DB server. I configured SAN multipathing with the Dell EqualLogic HIT kit.
Now I have 2 active paths, and my latch wait time is down to about 150 ms.
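To confirm a before/after figure like this from raw counter values rather than the perfmon graph, a rough sketch follows. It assumes the raw "Average Latch Wait Time (ms)" counter and its companion "Average Latch Wait Time Base" counter in `sys.dm_os_performance_counters` are cumulative, so the average over an interval is the ratio of their deltas between two snapshots. The snapshot values and the names `average_latch_wait_ms` and `GUIDE_MS` are hypothetical; the 300 ms guide value is the one quoted in the question.

```python
GUIDE_MS = 300  # guide value quoted in the question


def average_latch_wait_ms(first, second):
    """Average latch wait (ms) between two snapshots.

    Each snapshot is a (wait_time_counter, base_counter) pair of raw,
    cumulative values read at two different times.
    """
    delta_wait = second[0] - first[0]
    delta_base = second[1] - first[1]
    if delta_base <= 0:
        # No latch waits completed in the interval (or counters were reset).
        return 0.0
    return delta_wait / delta_base


if __name__ == "__main__":
    # Hypothetical snapshots taken a few minutes apart.
    before = (1_000_000, 1_000)
    after = (1_150_000, 2_000)
    avg = average_latch_wait_ms(before, after)
    status = "within" if avg < GUIDE_MS else "above"
    print(f"average latch wait: {avg:.0f} ms ({status} the {GUIDE_MS} ms guide value)")
```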