[2 days left] What’s wrong with your cloud strategy? Learn why multicloud solutions matter with Nimble Storage.Register Now

x
?
Solved

Very High Current Disk Queue Spikes

Posted on 2007-11-26
13
Medium Priority
?
1,763 Views
Last Modified: 2013-11-14
We have a Managed SQL server 2000 cluster running against a SAN disk set. At occasional peroids we seem to suffer a very high number of SQL Deadlocks and performance issues. I decided to monitor the SQL server and during this peroid I was monitiring Avg Disk Queue Length, Current disk queue length., Disk bytes/sec, SQL transactions/sec and a few others. Now normally the Disk Q sits around .05 to.5 but at certain times I noticed the the current queue length would climb to around 850+ over 25 seconds with no significant increase in bytes/sec or trans/sec. Could this be a SAN config issue? What areas can I investigate considering this is a Managed server and only have restrcited remote access to the server. I can provide a screenshot of the spike. The support company say the stats are irrelvant and the issue is with the application and they can't see anything with the system.
0
Comment
Question by:dusty_nz
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 3
  • 2
  • +2
13 Comments
 
LVL 32

Assisted Solution

by:Brendt Hess
Brendt Hess earned 120 total points
ID: 20354879
Can't say that this applies in this case, but I ran into a very similar symptom recently where some SQL DB files were in a compressed directory.  At least check to ensure that this is not the case for you.
0
 

Author Comment

by:dusty_nz
ID: 20354901
Heres a couple of pictures showing the spikes

http://kiwiland.net.nz/download/diskqSpike1.jpg
http://kiwiland.net.nz/download/diskqSpike2.jpg

The second clarely shows that the server is not really under any increased load.
0
 
LVL 12

Assisted Solution

by:kselvia
kselvia earned 60 total points
ID: 20355256
Assuming you have enough RAM most of your reads should come from cache.

I would focus on tempdb.  How many tempdb files do you have? Are they on different logical disks?

Do you do a lot of SELECT * into #temptable in your code?

That will put a schema lock on the current tempdb file. If you have only one tempdb file, it will block all other SELECT INTOs until it completes.  At that time, you may see the spikes.

It is best to have one tempdb file for each CPU.

It's also not a good idea to SELECT  INTO for large result sets.

To prevent schema locks on tempdb, either pre-create the temp table and INSERT into it, or populate the temp table in 2 steps.  
1) SELECT * into #temptable from Mytable where 1 = 2 (this will create an empty structure thus avoiding extended schema locks)  
2)  INSERT #temptable SELECT * from Mytable

Or run SQL profiler and capture the events that are occuring when you see these spikes.
0
Fill in the form and get your FREE NFR key NOW!

Veeam® is happy to provide a FREE NFR server license to certified engineers, trainers, and bloggers.  It allows for the non‑production use of Veeam Agent for Microsoft Windows. This license is valid for five workstations and two servers.

 

Author Comment

by:dusty_nz
ID: 20355275
The database does do some big selects however it has 8 gig ram and max CPU gets to around 10%. all the other perf stats like Cache hits etc are good. Its these occasionaly spikes where it looks like the disk is unable top be written to. This seems to be primarily on the Data drive and the log drive is not impacted.
At other times we can put the system under a lot more load and its happy. My guess is its a SAN config issue??
0
 
LVL 32

Assisted Solution

by:Brendt Hess
Brendt Hess earned 120 total points
ID: 20355321
Raid-5 data drive?  Raid-0, 1, or 10 log drive?

Is there anything that would do a mass Update of a large number of records?

I would not expect the CPU to spike on whatever is causing this, since the disk is orders of magnitude slower than CPU or RAM.

When I see this sort of pattern, I first suspect massive data changes that cause multitudes of records to be rewritten.  What might be causing that, I cannot say, but large updates of clustered indexes, causing mass splitting, might cause this on a large table on a RAID-5 drive set.
0
 

Author Comment

by:dusty_nz
ID: 20355326
However we are not seeing an increase in bytes/sec or in transactions/sec. Also we have had way bigger days and nothing like this.
0
 
LVL 56

Accepted Solution

by:
andyalder earned 135 total points
ID: 20357210
Is this SAN dedicated to you or shared with other servers? It looks that the SAN may be slowing down during these spiles but that doesn't mean there is anything wrong with it, just that other servers are hitting it heavily at that time. Is there any guaranteed performance on your LUNs from the hosting centre?
0
 
LVL 35

Assisted Solution

by:David Todd
David Todd earned 60 total points
ID: 20367913
Hi,

kselvia:
Another expert has suggested that your comment about select into #someTempTable no longer applies

I've openned a question to discuss this:
http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/Q_22988582.html

Cheers
  David
0
 

Author Comment

by:dusty_nz
ID: 20368862
Thanks for everyones contribution. The issue was actually that we had SAN issues but wanted a 3rd point of view. The manahging company (Big WA one) had actually build the san that had a multi partitioned Lun. Two partitions containing the db and logs (V Bad) and the third partition containing the exchange server. Also they were running a Hitachi (HS9000?) which actually has issues with resource contention na dis known to lockup when pushed. We were actually seeing points for almost 25 seconds where the SAN was processing no I/Os at all. Causing transactions to stall incompleted and then raise the issue of deadlocks. Putting loggins we actually saw over 20 cases per day when the Disk Qhit over 6000. The managing company resfused to accept they may have a SAN issue.
0
 

Author Comment

by:dusty_nz
ID: 20368923
reposted for bad spelling. Sorry
Thanks for everyones contribution. The issue was actually that we had SAN issues but wanted a 3rd persons point of view. The managing company (Big WA one) had actually configured the SQL using a SAN partition sharing both Datana dlogs on the same raidset as well as a very big corporate exchange server. On the same raid set. Two partitions containing the db and logs (V Bad) and the third partition containing the exchange server. Also they were running a Hitachi (HS9000?) SAN which actually has issues with resource contention and is known to lockup when pushed. We were actually seeing periods of almost 25 seconds where the SAN had stopped processing any disk requests at all. This was causing transactions to stall incompleted and then raise the issue of deadlocks. Putting logging in we actually saw over 20 cases per day when the Disk Q hit over 6000. The managing company refused to accept they may have a SAN issue and claimed that table locks were causing disk locks.
0
 
LVL 56

Expert Comment

by:andyalder
ID: 20369237
So I was right that you had a disk I/O problem (it's obvious from the 4th spike on your second pic) but you gave me a B grade. I presume the first red spike on that one is you using the mouse to start perfmon. You can see disk writes/sec=0 even though the queue goes up. If it was a BMP I'd draw a circle around the evidence with mspaint and post it on ee upload site.
0
 

Author Comment

by:dusty_nz
ID: 20369304
Yeah. Sorry Andy. I felt it was a bit of a trick question and wasn't actually looking for an answer but more getting a broad point of view.

I do appreciate your answer and it was indeed correct however I gave points to everyone. (You got the most by far) Unfortunitly you can only give one grade for everyone.

0
 
LVL 56

Expert Comment

by:andyalder
ID: 20369822
It was a really good question and I don't mind about grades anyway, MS could add it to their list of how to interpret perfmon exams. Not really about SQL though.

https://filedb.experts-exchange.com/incoming/ee-stuff/5869-flatline.jpg
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down …
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

649 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question