cp30
asked on
SQL Server 2008 bottlenecks
Hi there,
I have a MSSQL Server 2008 R2 database running on a dedicated server (windows 2008) with 1 CPU and 6 cores and 2 disks that are software mirrored.
I have been having some issues with SQL performance and the last one seemed to be related to lots of waiting processes that referenced wait type of IO_COMPLETION. Does this indicate a disk problem?
I'm not sure what I should be monitoring, I've tried adding the Physical Disk Queue Length and this fluctuates between 0 and 10 up and down pretty consistently, does this indicate a problem? or should I be monitoring something else?
Any help greatly appreciated, thanks.
I have a MSSQL Server 2008 R2 database running on a dedicated server (windows 2008) with 1 CPU and 6 cores and 2 disks that are software mirrored.
I have been having some issues with SQL performance and the last one seemed to be related to lots of waiting processes that referenced wait type of IO_COMPLETION. Does this indicate a disk problem?
I'm not sure what I should be monitoring, I've tried adding the Physical Disk Queue Length and this fluctuates between 0 and 10 up and down pretty consistently, does this indicate a problem? or should I be monitoring something else?
Any help greatly appreciated, thanks.
ASKER
Well, some select statements are taking a long time but only sometimes and it's proving difficult to get to the bottom of what is causing the occasional delay. The database has lots of records written to tables constantly and this seems to be running pretty well but the analysis side of the database seems to be slowing down at times and I can't seem to find any obvious issues, I think all neccessary tables have approriate indexes etc.
The IO_COMPLETION issue became ibvious this evening when 1 table was inaccessible and when I checked there was a stored procedure than does the following every 30 mins to update a table accessed by my web application which is refreshed every 30 minutes
and the truncate table has been waiting for over 5 minutes (the view should only take a few seconds to run) with IO_COMPLETION wait type, and no statements against that table were being allowed, I had to kill the process and a few others and after a while I was able to delete all records and access the table again, seemed very odd which led me to look at disk usage.
Hope this helps you understand more.
Thanks
The IO_COMPLETION issue became ibvious this evening when 1 table was inaccessible and when I checked there was a stored procedure than does the following every 30 mins to update a table accessed by my web application which is refreshed every 30 minutes
SET NOCOUNT ON;
-- Insert statements for procedure here
BEGIN TRAN
TRUNCATE TABLE tblstaticlist
INSERT INTO tblstaticlist
(
fields,...
)
SELECT
fields,...
FROM vwstaticlist_union WITH (NOLOCK)
COMMIT TRAN
and the truncate table has been waiting for over 5 minutes (the view should only take a few seconds to run) with IO_COMPLETION wait type, and no statements against that table were being allowed, I had to kill the process and a few others and after a while I was able to delete all records and access the table again, seemed very odd which led me to look at disk usage.
Hope this helps you understand more.
Thanks
ASKER
Hi, after reading that article I've checked disk time and this seems to peak at 100% for a few a second quite often, does this, coupled with the fact that I have current disk queue lengths of between 1 and 10 mean that there's a disk issue or is that too simplistic?
I know a little SQL, normalization, sound design etc. etc. but I am not fantastic on performance etc (as you may have gathered!). This is a very small business running in spare time so I am the IT dept! ;), what are my options if I wanted to hire a SQL guru to look over my server/database and make recommendations on performance improvements? I'd be happy to pay someone but not sure where I'd get someone, any ideas?
Thanks for your help
I know a little SQL, normalization, sound design etc. etc. but I am not fantastic on performance etc (as you may have gathered!). This is a very small business running in spare time so I am the IT dept! ;), what are my options if I wanted to hire a SQL guru to look over my server/database and make recommendations on performance improvements? I'd be happy to pay someone but not sure where I'd get someone, any ideas?
Thanks for your help
ASKER
Oh dear, it just happened again, I was watching the perf monitor after the table became unavailable and disk time was just sat at 100% constantly??? I've attached a screenshot, the Blue line is the "% Disk Time", hard to see as it's constantly at the top of the chart.
When this happens I have a few processes in activity monitor SUSPENDED with wait type of IO_COMPLETION but other tables in the db are accessible, just seems this one table is the issue.
This time it seems that a TRUNCATE on the table seemed to resolve the issue
What on earth could be suddenly causing this to keep happening?
Looking forward to some suggestions...
When this happens I have a few processes in activity monitor SUSPENDED with wait type of IO_COMPLETION but other tables in the db are accessible, just seems this one table is the issue.
This time it seems that a TRUNCATE on the table seemed to resolve the issue
What on earth could be suddenly causing this to keep happening?
Looking forward to some suggestions...
ASKER
Attachment as promised....
perfmon.jpg
perfmon.jpg
In the activity monitor do any of the processes indicate that any blocking is taking place? If so you may want to modify the process to detect and connections to tblstaticlist and kill them before you run the truncate.
I'm hesitant to chalk it up to "disk activity" without knowing a bit more. You indicated that there is an analysis side to the database. Are you doing both active transactions and reporting / BI against the same database?
Your graph does show some disk stats that don't look too hot. Where is the % Processor Time line? Is it at the top of the graph? I can't see it in the jpg.
I'm hesitant to chalk it up to "disk activity" without knowing a bit more. You indicated that there is an analysis side to the database. Are you doing both active transactions and reporting / BI against the same database?
Your graph does show some disk stats that don't look too hot. Where is the % Processor Time line? Is it at the top of the graph? I can't see it in the jpg.
ASKER
Hi,
I cannot see anything indicating blocking in the processes in activity monitor, which column would mention blocking as I haven't seen it? I want people to be able to carry reads of tblstaticlist without getting any errors or seeing blank results (which is why I wrapped the truncate and insert in a transaction, is this the right thing to do?), I'm guessing if I started killing killing processes then this could cause problems for my users, no I think of it, the only selects on that db will be with NOLOCK option which is probably why I'm seeing no locking.
Yes, I have one database which stores the data which is constantly being updated and in the same database it analyses the data, is this not sensible, should I create a 2nd database for running views etc, pulling data etc?
Processor is the red line (looks a little darker than the legend) maxing out at just over 60% so don't think this is a problem
Thanks for your help, I'm really struggling here!
I cannot see anything indicating blocking in the processes in activity monitor, which column would mention blocking as I haven't seen it? I want people to be able to carry reads of tblstaticlist without getting any errors or seeing blank results (which is why I wrapped the truncate and insert in a transaction, is this the right thing to do?), I'm guessing if I started killing killing processes then this could cause problems for my users, no I think of it, the only selects on that db will be with NOLOCK option which is probably why I'm seeing no locking.
Yes, I have one database which stores the data which is constantly being updated and in the same database it analyses the data, is this not sensible, should I create a 2nd database for running views etc, pulling data etc?
Processor is the red line (looks a little darker than the legend) maxing out at just over 60% so don't think this is a problem
Thanks for your help, I'm really struggling here!
All the way to the right on the Activity Monitor window is two columns, one called Blocking and one called Blocked By. If there is anything in the Blocking column then we have a blocked process.
If you're refilling the tblstaticlist table each time and don't want to block any reads that's going to be a pretty tough thing to accomplish. You either can do it the way you're doing now, you could create a tmp_tblstaticlist table and switch it out with you main tblstatislist, or do updates to tblstaticlist. At some point users are probably going to see some blank data no matter which way you go.
The risk of doing the truncate outside the transaction is that you can't roll back if the insert fails so I understand why you structured it like that. However, the process isn't working so you might as well try putting the truncate outside the transaction.
In general it's best to separate OLTP (daily transactional type things) workloads on different hardware than OLAP (reporting / BI) workloads.
If you're refilling the tblstaticlist table each time and don't want to block any reads that's going to be a pretty tough thing to accomplish. You either can do it the way you're doing now, you could create a tmp_tblstaticlist table and switch it out with you main tblstatislist, or do updates to tblstaticlist. At some point users are probably going to see some blank data no matter which way you go.
The risk of doing the truncate outside the transaction is that you can't roll back if the insert fails so I understand why you structured it like that. However, the process isn't working so you might as well try putting the truncate outside the transaction.
In general it's best to separate OLTP (daily transactional type things) workloads on different hardware than OLAP (reporting / BI) workloads.
ASKER
Hi,
Thanks, I'll look out for locks, when it happens again.
Don't think I can do the truncate outside as it will cause problems if they see blank table, maybe I'll have to look at doing a merge and see how that performs. Just seems strange that the truncate and insert inside the transaction was working for last week with no problems, now getting this weird problem that increases disk time to maximum. Maybe the truncate didn't cause it and that's a red herring, anyway something is doing something strange to that table and disk activity???
At the minute I can't really stretch to new hardware for OLAP, is there any benefit to moving OLAP processes to a different DB on the same hardware or is that pretty pointless?
Thanks again for your input....
Thanks, I'll look out for locks, when it happens again.
Don't think I can do the truncate outside as it will cause problems if they see blank table, maybe I'll have to look at doing a merge and see how that performs. Just seems strange that the truncate and insert inside the transaction was working for last week with no problems, now getting this weird problem that increases disk time to maximum. Maybe the truncate didn't cause it and that's a red herring, anyway something is doing something strange to that table and disk activity???
At the minute I can't really stretch to new hardware for OLAP, is there any benefit to moving OLAP processes to a different DB on the same hardware or is that pretty pointless?
Thanks again for your input....
If you're using the same disk spindles you aren't going to get much bang for the buck.
If the SQL workloads haven't changed then there may be a process outside SQL that is just contending for disk IO. Is there anything that might fit this bill?
You definitely can look at MERGE to see if you can rewrite the query...
If the SQL workloads haven't changed then there may be a process outside SQL that is just contending for disk IO. Is there anything that might fit this bill?
You definitely can look at MERGE to see if you can rewrite the query...
ASKER
Yes, only got the 2 drives that are mirrored does not seem that much point creating a OLAP db then.
No, noting other than SQL on the box really so don't think it's anything else contending for the disk I/O. And if it was, I'd expect it to affect the performance of the database as a whole not just access to one table?
No, noting other than SQL on the box really so don't think it's anything else contending for the disk I/O. And if it was, I'd expect it to affect the performance of the database as a whole not just access to one table?
ASKER CERTIFIED SOLUTION
membership
Create a free account to see this answer
Signing up is free and takes 30 seconds. No credit card required.
ASKER
Yes, I've implemented the merge, it takes a while longer but that shouldn't be an issue hopefully. I've added written as below..... I dumped the records to a temp table first to try and avoid any locking on source data.
That look ok?
Cheers
BEGIN TRY
BEGIN TRANSACTION -- Start the transaction
-- INSERT DATA INTO TEMP TABLE
SELECT
fields
INTO #temp20 FROM
vwstaticlist_union_new WITH (NOLOCK)
MERGE tblstaticlist_20 AS t
USING #temp20 AS s
ON t.aggregatedKey = s.aggregatedKey and t.outcomeTypeId = s.outcomeTypeId
WHEN MATCHED THEN
UPDATE SET
field = s.field....
WHEN NOT MATCHED BY TARGET THEN
INSERT (
fields....
)
VALUES (
s.values...
)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
COMMIT
END TRY
BEGIN CATCH
ROLLBACK
INSERT INTO tbllog([type],[detail],[source])
VALUES('error','ERROR: spUpdate_20: ' + ERROR_MESSAGE(),'db')
--RAISERROR(@ErrMsg, @ErrSeverity, 1)
END CATCH
That look ok?
Cheers
Well, I don't know all your logic but I recognize well written code when I see it. Looks pretty elegant to me. :)
ASKER
Lol - elegant maybe, I'm just not sure it always does what it's meant to. I'm really struggling at the minute as I'm an application developer by trade and my new part-time business venture has taken off somewhat and it's starting to get outside my areas of expertise regarding sql as you can probably tell by my silly questions ;-)
There's obviously parts of my code that I need to modify before I post and trace's etc that would probably help people see what's going on but I can't really post them publicly. Is there any way I can hire SQL gurus from this site (or anywhere else for that matter) for a more private discussion and possibly hands on assistance in exchange for payment?
There's obviously parts of my code that I need to modify before I post and trace's etc that would probably help people see what's going on but I can't really post them publicly. Is there any way I can hire SQL gurus from this site (or anywhere else for that matter) for a more private discussion and possibly hands on assistance in exchange for payment?
Many experts here on EE are for hire. I do hire out as well. My private email is up on my profile page. If you want to go that route I'd say let's see what I can do for you with the MERGE as it would probably be the way I would go in my own production environment. We can discuss it privately there but should post the eventual solution here, minus any proprietary details.
If I can guide you through the MERGE via email I won't charge. Wouldn't feel right doing that.
If I can guide you through the MERGE via email I won't charge. Wouldn't feel right doing that.
http://msdn.microsoft.com/en-us/library/ms175903.aspx
It may not indicate a problem. What is the performance problem you are actually having?