We help IT Professionals succeed at work.

SQL Server 2008 bottlenecks

cp30
cp30 asked
on
Hi there,

I have a MSSQL Server 2008 R2 database running on a dedicated server (windows 2008) with 1 CPU and 6 cores and 2 disks that are software mirrored.

I have been having some issues with SQL performance and the last one seemed to be related to lots of waiting processes that referenced wait type of IO_COMPLETION.  Does this indicate a disk problem?

I'm not sure what I should be monitoring, I've tried adding the Physical Disk Queue Length and this fluctuates between 0 and 10 up and down pretty consistently, does this indicate a problem? or should I be monitoring something else?

Any help greatly appreciated, thanks.

Comment
Watch Question

Chris MangusDatabase Administrator
CERTIFIED EXPERT

Commented:
Here's a good article on understanding Physical Disk Queue Length.

http://msdn.microsoft.com/en-us/library/ms175903.aspx

It may not indicate a problem.  What is the performance problem you are actually having?

Author

Commented:
Well, some select statements are taking a long time but only sometimes and it's proving difficult to get to the bottom of what is causing the occasional delay.  The database has lots of records written to tables constantly and this seems to be running pretty well but the analysis side of the database seems to be slowing down at times and I can't seem to find any obvious issues, I think all neccessary tables have approriate indexes etc.

The IO_COMPLETION issue became ibvious this evening when 1 table was inaccessible and when I checked there was a stored procedure than does the following every 30 mins to update a table accessed by my web application which is refreshed every 30 minutes

SET NOCOUNT ON;

    -- Insert statements for procedure here
	BEGIN TRAN
	
		TRUNCATE TABLE tblstaticlist
		
		INSERT INTO tblstaticlist
		(
			fields,...
		)

		SELECT 
		        fields,...
		FROM vwstaticlist_union WITH (NOLOCK)
		
	COMMIT TRAN

Open in new window



and the truncate table has been waiting for over 5 minutes (the view should only take a few seconds to run) with IO_COMPLETION wait type, and no statements against that table were being allowed, I had to kill the process and a few others and after a while I was able to delete all records and access the table again, seemed very odd which led me to look at disk usage.

Hope this helps you understand more.

Thanks

Author

Commented:
Hi, after reading that article I've checked disk time and this seems to peak at 100% for a few a second quite often, does this, coupled with the fact that I have current disk queue lengths of between 1 and 10 mean that there's a disk issue or is that too simplistic?

I know a little SQL, normalization, sound design etc. etc. but I am not fantastic on performance etc (as you may have gathered!).  This is a very small business running in spare time so I am the IT dept! ;), what are my options if I wanted to hire a SQL guru to look over my server/database and make recommendations  on performance improvements? I'd be happy to pay someone but not sure where I'd get someone, any ideas?

Thanks for your help

Author

Commented:
Oh dear, it just happened again, I was watching the perf monitor after the table became unavailable and disk time was just sat at 100% constantly??? I've attached a screenshot, the Blue line is the "% Disk Time", hard to see as it's constantly at the top of the chart.  

When this happens I have a few processes in activity monitor SUSPENDED with wait type of IO_COMPLETION but other tables in the db are accessible, just seems this one table is the issue.

This time it seems that a TRUNCATE on the table seemed to resolve the issue

What on earth could be suddenly causing this to keep happening?

Looking forward to some suggestions...

Author

Commented:
Attachment as promised....
perfmon.jpg
Chris MangusDatabase Administrator
CERTIFIED EXPERT

Commented:
In the activity monitor do any of the processes indicate that any blocking is taking place?  If so you may want to modify the process to detect and connections to tblstaticlist and kill them before you run the truncate.

I'm hesitant to chalk it up to "disk activity" without knowing a bit more.  You indicated that there is an analysis side to the database.  Are you doing both active transactions and reporting / BI against the same database?

Your graph does show some disk stats that don't look too hot.  Where is the % Processor Time line?  Is it at the top of the graph?  I can't see it in the jpg.

Author

Commented:
Hi,

I cannot see anything indicating blocking in the processes in activity monitor,  which column would mention blocking as I haven't seen it? I want people to be able to carry reads of tblstaticlist without getting any errors or seeing blank results (which is why I wrapped the truncate and insert in a transaction, is this the right thing to do?), I'm guessing if I started killing killing processes then this could cause problems for my users, no I think of it, the only selects on that db will be with NOLOCK option which is probably why I'm seeing no locking.

Yes, I have one database which stores the data which is constantly being updated and in the same database it analyses the data, is this not sensible, should I create a 2nd database for running views etc, pulling data etc?

Processor is the red line (looks a little darker than the legend) maxing out at just over 60% so don't think this is a problem

Thanks for your help, I'm really struggling here!
Chris MangusDatabase Administrator
CERTIFIED EXPERT

Commented:
All the way to the right on the Activity Monitor window is two columns, one called Blocking and one called Blocked By.  If there is anything in the Blocking column then we have a blocked process.

If you're refilling the tblstaticlist table each time and don't want to block any reads that's going to be a pretty tough thing to accomplish.  You either can do it the way you're doing now, you could create a tmp_tblstaticlist table and switch it out with you main tblstatislist, or do updates to tblstaticlist.  At some point users are probably going to see some blank data no matter which way you go.  

The risk of doing the truncate outside the transaction is that you can't roll back if the insert fails so I understand why you structured it like that.  However, the process isn't working so you might as well try putting the truncate outside the transaction.

In general it's best to separate OLTP (daily transactional type things) workloads on different hardware than OLAP (reporting / BI) workloads.

Author

Commented:
Hi,

Thanks, I'll look out for locks, when it happens again.

Don't think I can do the truncate outside as it will cause problems if they see blank table, maybe I'll have to look at doing a merge and see how that performs.  Just seems strange that the truncate and insert inside the transaction was working for last week with no problems, now getting this weird problem that increases disk time to maximum.  Maybe the truncate didn't cause it and that's a red herring, anyway something is doing something strange to that table and disk activity???

At the minute I can't really stretch to new hardware for OLAP, is there any benefit to moving OLAP processes to a different DB on the same hardware or is that pretty pointless?

Thanks again for your input....



Chris MangusDatabase Administrator
CERTIFIED EXPERT

Commented:
If you're using the same disk spindles you aren't going to get much bang for the buck.

If the SQL workloads haven't changed then there may be a process outside SQL that is just contending for disk IO.  Is there anything that might fit this bill?

You definitely can look at MERGE to see if you can rewrite the query...

Author

Commented:
Yes, only got the 2 drives that are mirrored does not seem that much point creating a OLAP db then.

No, noting other than SQL on the box really so don't think it's anything else contending for the disk I/O.  And if it was, I'd expect it to affect the performance of the database as a whole not just access to one table?
Database Administrator
CERTIFIED EXPERT
Commented:
True on the disk IO, to a point.  If this is a large table or widely used it's poor performance could just be an indicator of other trouble.  But, in your case I think you are right.

If you want, let's explore trying to change the workflow.  How about looking at a MERGE query instead of recreating the table every 30 minutes?

Author

Commented:
Yes, I've implemented the merge, it takes a while longer but that shouldn't be an issue hopefully.  I've added written as below..... I dumped the records to a temp table first to try and avoid any locking on source data.

BEGIN TRY
	   BEGIN TRANSACTION    -- Start the transaction
	   	   
		-- INSERT DATA INTO TEMP TABLE
			SELECT 			
				fields
			INTO #temp20 FROM 
			vwstaticlist_union_new WITH (NOLOCK)
		
		   
MERGE tblstaticlist_20 AS t
USING #temp20 AS s
ON t.aggregatedKey = s.aggregatedKey and t.outcomeTypeId = s.outcomeTypeId
WHEN MATCHED THEN
  UPDATE SET
			field = s.field....
			
	
WHEN NOT MATCHED BY TARGET THEN
  INSERT (
			fields....	
  )
  VALUES (
 
			s.values...	
  )
	WHEN NOT MATCHED BY SOURCE THEN
		DELETE;
			
			COMMIT
    
	END TRY
	
	BEGIN CATCH
	    ROLLBACK
			 
		 INSERT INTO tbllog([type],[detail],[source])
			VALUES('error','ERROR: spUpdate_20: ' + ERROR_MESSAGE(),'db')

	  --RAISERROR(@ErrMsg, @ErrSeverity, 1)
	END CATCH

Open in new window


That look ok?

Cheers
Chris MangusDatabase Administrator
CERTIFIED EXPERT

Commented:
Well, I don't know all your logic but I recognize well written code when I see it.  Looks pretty elegant to me.  :)

Author

Commented:
Lol - elegant maybe, I'm just not sure it always does what it's meant to.  I'm really struggling at the minute as I'm an application developer by trade and my new part-time business venture has taken off somewhat and it's starting to get outside my areas of expertise regarding sql as you can probably tell by my silly questions ;-)

There's obviously parts of my code that I need to modify before I post and trace's etc that would probably help people see what's going on but I can't really post them publicly.  Is there any way I can hire SQL gurus from this site (or anywhere else for that matter) for a more private discussion and possibly hands on assistance in exchange for payment?  
Chris MangusDatabase Administrator
CERTIFIED EXPERT

Commented:
Many experts here on EE are for hire.  I do hire out as well.  My private email is up on my profile page.  If you want to go that route I'd say let's see what I can do for you with the MERGE as it would probably be the way I would go in my own production environment.  We can discuss it privately there but should post the eventual solution here, minus any proprietary details.

If I can guide you through the MERGE via email I won't charge.  Wouldn't feel right doing that.

Explore More ContentExplore courses, solutions, and other research materials related to this topic.