
Solved

select statman killing performance

Posted on 2008-10-04
20
Medium Priority
3,880 Views
Last Modified: 2011-10-19
My problem emerges after a maintenance plan runs.  I think the plan blows away the statistics or something, but not every time it runs (only sometimes--maybe every tenth time).  The maintenance plan runs once a month.

The issue affects only one stored procedure.  The procedure is fairly normal, but it does contain a cursor, uses the GOTO statement, and has a single temp table.

I read that I should remove all temp tables and dynamic SQL from the stored procedure.  I did this (except for the one temp table I couldn't get rid of) and it didn't work.  The only thing that seems to work is time, i.e., after about three days everything goes back to normal.  The query goes from taking 300 MINUTES to run back down to its normal 30 minutes.

I'm already 24 hours into this :-(
0
Comment
Question by:tomandlis
20 Comments
 

Author Comment

by:tomandlis
ID: 22640876
Sorry, I forgot to include what I see in SQL Profiler.  I see the following type of message repeatedly on about five or six tables.  There seems to be no pattern in the tables that StatMan is updating statistics for.  All the tables are well structured: they have primary keys, clustered indexes, regular indexes, constraints, etc.

--***********************
--***********************
SELECT StatMan([SC0],[SC1]) FROM (SELECT TOP 100 PERCENT [i] AS [SC0],[j] AS [SC1] FROM [dbo].[clusterYD] WITH
(READUNCOMMITTED,SAMPLE 1.000000e+002 PERCENT)  ORDER BY [SC0],[SC1]) AS _MS_UPDSTATS_TBL
--***********************
--***********************
SELECT StatMan([SC0],[SC1],[SC2]) FROM (SELECT TOP 100 PERCENT [i] AS [SC0],[distance] AS [SC1],[j] AS [SC2] FROM
[dbo].[clusterYD] WITH (READUNCOMMITTED,SAMPLE 1.000000e+002 PERCENT)  ORDER BY [SC0],[SC1],[SC2]) AS _MS_UPDSTATS_TBL
--***********************
--***********************
SELECT StatMan([SC0]) FROM (SELECT TOP 100 PERCENT [i] AS [SC0] FROM [dbo].[clusterYV] WITH (READUNCOMMITTED,SAMPLE
1.000000e+002 PERCENT)  ORDER BY [SC0]) AS _MS_UPDSTATS_TBL
--***********************
--***********************
etc...
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 22641109
Could you try adding the WITH RECOMPILE option to the stored procedure?
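
For reference, a minimal sketch of what that suggestion looks like in T-SQL; the procedure and parameter names here are hypothetical placeholders, since the actual proc hasn't been posted:

-- Force a fresh plan to be compiled on every execution instead of reusing a cached one
ALTER PROCEDURE dbo.usp_CacheBusinessReports   -- hypothetical name
    @ReportDate datetime
WITH RECOMPILE
AS
BEGIN
    SET NOCOUNT ON;
    -- ...original body unchanged...
    SELECT @ReportDate AS ReportDate;
END;
GO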
0
 
LVL 21

Expert Comment

by:mastoo
ID: 22641232
If that doesn't fix it, I would capture the "slow" execution plan, and then in a few days compare it to the fast plan.  Along with looking at the difference between them, I would guess the slow plan is going to show some area with a dramatic difference between estimated and actual row count.  This can give you insight into what is going wrong and what can be done to fix it.
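
One hedged way to capture those plans, assuming SQL Server 2005 given the trace output; the procedure name is a placeholder:

SET STATISTICS XML ON;   -- returns the actual plan, including estimated vs. actual row counts
EXEC dbo.usp_CacheBusinessReports;   -- hypothetical proc name
SET STATISTICS XML OFF;

-- The cached (estimated) plan can also be pulled from the plan cache for later comparison:
SELECT qs.execution_count,
       qs.total_elapsed_time / 1000 AS total_elapsed_ms,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%usp_CacheBusinessReports%';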
0

 

Author Comment

by:tomandlis
ID: 22642306
The WITH RECOMPILE looks very promising.  I'm running the test right now.  I saw an article that describes my sproc almost exactly; it looks like SQL Server has cached a bad query plan.  See the link below for more info:

http://articles.techrepublic.com.com/5100-10878_11-5662581.html
0
 

Author Comment

by:tomandlis
ID: 22645005
Sorry, WITH RECOMPILE didn't work and I'm looking for new suggestions.  Procedure time is still 300 minutes (versus 30).

I did see something in the Profiler that might be a clue.  It appears that WITH RECOMPILE is only recompiling the procedure's plan (not the statistics on the underlying tables).  It appears that SQL Server is gathering stats on almost every table in the query.  Thus I am seeing the lines shown below in SQL Profiler.

Profiler is repeatedly showing SQL Server sampling the tables 'clusterYD', 'clusterYNN', and 'clusterYV'.  These tables are what I would call 'table variables' more than tables.  That is, they describe a data structure but only abstractly.  By 'abstract', I mean sometimes when I run this procedure those tables might have tens of thousands of records and other times they might have just a few hundred.  I am guessing that this is the source of the problem, i.e., the statistics are being saved for the case where there were just a few hundred, and now I'm running the case where there are tens of thousands, so the stats are invalid.

I am guessing these underlying tables need to have their statistics updated, but WITH RECOMPILE doesn't accomplish that.

I am going to try blowing away all the tables that StatMan is sampling and recreating them from scratch.  There is probably an easier way to do this, but it's all I can think of.


SELECT StatMan([SC0],[SC1]) FROM (SELECT TOP 100 PERCENT [i] AS [SC0],[j] AS [SC1] FROM [dbo].[clusterYD] WITH (READUNCOMMITTED,SAMPLE
1.000000e+002 PERCENT)  ORDER BY [SC0],[SC1]) AS _MS_UPDSTATS_TBL

SELECT StatMan([SC0],[SC1]) FROM (SELECT TOP 100 PERCENT [j] AS [SC0],[i] AS [SC1] FROM [dbo].[clusterYD] WITH (READUNCOMMITTED,SAMPLE
1.000000e+002 PERCENT)  ORDER BY [SC0],[SC1]) AS _MS_UPDSTATS_TBL

SELECT StatMan([SC0],[SC1],[SC2]) FROM (SELECT TOP 100 PERCENT [i] AS [SC0],[distance] AS [SC1],[j] AS [SC2] FROM [dbo].[clusterYD] WITH
(READUNCOMMITTED,SAMPLE 1.000000e+002 PERCENT)  ORDER BY [SC0],[SC1],[SC2]) AS _MS_UPDSTATS_TBL

SELECT StatMan([SC0],[SC1],[SC2]) FROM (SELECT TOP 100 PERCENT [distance] AS [SC0],[i] AS [SC1],[j] AS [SC2] FROM [dbo].[clusterYD] WITH
(READUNCOMMITTED,SAMPLE 1.000000e+002 PERCENT)  ORDER BY [SC0],[SC1],[SC2]) AS _MS_UPDSTATS_TBL

SELECT StatMan([SC0]) FROM (SELECT TOP 100 PERCENT [i] AS [SC0] FROM [dbo].[clusterYNN] WITH (READUNCOMMITTED,SAMPLE 1.000000e+002 PERCENT)  
ORDER BY [SC0]) AS _MS_UPDSTATS_TBL
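
If the guess above is right, one hedged alternative to dropping and recreating the tables is to refresh their statistics directly after they are repopulated; the table names are taken from the trace:

UPDATE STATISTICS dbo.clusterYD  WITH FULLSCAN;
UPDATE STATISTICS dbo.clusterYNN WITH FULLSCAN;
UPDATE STATISTICS dbo.clusterYV  WITH FULLSCAN;

-- Or, more broadly (and more heavily), refresh every statistic in the database:
-- EXEC sp_updatestats;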
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22645006
Interesting query... the READUNCOMMITTED is certainly interesting, and so is the use of TABLESAMPLE. If you want a random selection you can also use NEWID().

Think that it only really works on 2005 (and upwards) because of the way table statistics are now calculated - so you might need to check compatibility level...

Can imagine that after an optimisation it will take a while to get a random sample set. So, not sure that the recompile will help in this case...

From books online :

You can use TABLESAMPLE to quickly return a sample from a large table when either of the following conditions is true:
  - The sample does not have to be a truly random sample at the level of individual rows.
  - Rows on individual pages of the table are not correlated with other rows on the same page.

And that last point is what is probably hurting....
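
A small sketch contrasting the two sampling approaches under discussion (table name borrowed from the trace; the percentage and row counts are arbitrary):

-- Page-level sampling: fast, but rows from the same page come back together,
-- so it is not random at the individual-row level.
SELECT *
FROM dbo.clusterYD TABLESAMPLE (10 PERCENT);

-- Row-level random sample: truly random per row, but forces a full scan and a sort,
-- which gets expensive on large tables.
SELECT TOP (100) *
FROM dbo.clusterYD
ORDER BY NEWID();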
0
 

Author Comment

by:tomandlis
ID: 22645221
Well, it appears that my attempt to hardcode individual tables (as opposed to using a generic table structure) is failing.  Profiler is still loaded with 'StatMan' statements during execution.

mark_wills: I am selecting a random sample of rows.  The query I am using is shown below; would you please explain how I would rewrite it?

      select top 3 val
      from dbo.[clusterYVstat_history]
      order by newid()
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22645374
Well, there are examples in the TABLESAMPLE documentation: http://technet.microsoft.com/en-us/library/ms189108(SQL.90).aspx which do it a bit differently.

A while back, after some frustration and investigation on a very large table, I found this link on MSDN; have a good look at: http://msdn.microsoft.com/en-us/library/cc441928.aspx   -- particularly the stats down the bottom comparing the different techniques.
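
As a hedged illustration only, the TOP 3 / NEWID() query above could be rewritten with TABLESAMPLE roughly like this (the ROWS value is arbitrary; because TABLESAMPLE samples whole pages, the rows returned are not truly random per row, and a very small table may return nothing at all):

SELECT TOP (3) val
FROM dbo.clusterYVstat_history TABLESAMPLE (1000 ROWS)
ORDER BY NEWID();   -- optional extra shuffle within the sampled pages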


0
 

Author Comment

by:tomandlis
ID: 22645493

OK, some more information.  I plugged in the TABLESAMPLE clause and got rid of the NEWID() to select a random sample of rows, but it appears that isn't going to do the trick, as StatMan continues to fill the Profiler trace.  FYI, here is a description of how to use TABLESAMPLE:

http://technet.microsoft.com/en-us/library/ms189108(SQL.90).aspx

Hardcoding all the temp tables didn't do it either (they aren't really 'temp' tables, they are just the abstract tables that I referred to earlier).  Hardcoding the tables just resulted in StatMan statements running against the new tables.

I also tried getting rid of the cursor in the sproc.  I switched to a 'cursorless' cursor setup, which is described at the links below (a sketch of the pattern follows them).  This didn't work either.

http://www.eggheadcafe.com/PrintSearchContent.asp?LINKID=529
http://www.eggheadcafe.com/articles/20010823.asp
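
A minimal sketch of the 'cursorless cursor' pattern being described, assuming a single-column key; the table and column names here are hypothetical placeholders:

DECLARE @id int;

SELECT @id = MIN(id) FROM dbo.SomeWorkTable;   -- hypothetical key column and table

WHILE @id IS NOT NULL
BEGIN
    -- ... per-row work for the row identified by @id goes here ...

    SELECT @id = MIN(id)
    FROM dbo.SomeWorkTable
    WHERE id > @id;                            -- step to the next key value
END;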


I do have some information that might be helpful, however.  I checked an old trace I had of the proc (when it was running correctly) and there is NO DIFFERENCE.  That is, the old trace was filled up with StatMan statements too, and yet it ran in 30 minutes versus the 300 it takes now.  This leads me to believe something else is the root cause, i.e., it isn't StatMan.

As I mentioned at the beginning of this thread, the performance issue began after a DB backup.  I am going to try to repeat the DB backup process and see if that does anything.  If that doesn't work, I'm clueless.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22645563
Can we see what StatMan is doing?
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22645574
And I should also apologise, because I thought that the code you had in your first post was part of a stored procedure, not the results of Profiler...

Can we please see what you are actually doing?
0
 

Author Comment

by:tomandlis
ID: 22648604
Well, as quickly as it started, it has stopped.  The sproc is part of a nightly job, and after one week of running at 300 minutes the execution time suddenly dropped back down to 30 minutes.  I'll try to do some log forensics and let you know what happened.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22649215
Still don't want to share that proc? By the looks of the outputs it is (obviously) updating statistics, and traditionally I see that associated with auto-update-statistics and/or the update-statistics step of a maintenance plan.  My original thought was you were trying to replicate that activity in your stored proc...

But if it is a nightly procedure, then I would like to see what you are running, i.e., do you have statistics plans in place; are you creating collections, updating, dropping; using DBCC, T-SQL, or the stored procedures; are you checking STATS_DATE; etc.?

AngelIII and mastoo might be able to continue, but I cannot really advise one way or the other unless I know what is in that stored procedure.
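
For reference, one hedged way to check the STATS_DATE question above against the tables showing up in the trace (SQL Server 2005 catalog views; adjust the table names as needed):

SELECT OBJECT_NAME(s.object_id)              AS table_name,
       s.name                                AS stats_name,
       STATS_DATE(s.object_id, s.stats_id)   AS last_updated
FROM sys.stats AS s
WHERE s.object_id IN (OBJECT_ID('dbo.clusterYD'),
                      OBJECT_ID('dbo.clusterYNN'),
                      OBJECT_ID('dbo.clusterYV'))
ORDER BY table_name, stats_name;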

0
 

Author Comment

by:tomandlis
ID: 22659134
Mark, thanks for your help with this.  I'd share the proc, but it would likely just confuse the matter.  It's not doing anything to update SQL stats--it is merely caching some business reports during the off-peak hours so as to save processor power during the peak hours.   The reports are statistical in nature (they contain a cluster analysis), but the cluster analysis in the reports and SQL Server's intrinsic stats are entirely unrelated.

I tried to analyze the logs, but I can see no differences between the good runs (30-minute runs) and the bad runs (300-minute runs).  I'm not really sure what I should be looking for, but I think I'm supposed to be looking in the 'reads' column for excessive reads.  Is that right?
0
 
LVL 51

Assisted Solution

by:Mark Wills
Mark Wills earned 2000 total points
ID: 22659667
Then I would suggest you have auto-stats turned on, and the first few times after the "clean" it is continually re-evaluating the stats, given the nature of the data being accessed, and has a small fit while trying to do auto-stats updates. That would also explain why you are seeing StatMan, though I would also expect to see it in normal operation as well.

Slow or long elapsed time is not necessarily disk IO alone; there can be many factors, and you really need to gather some statistics during that "heavy" period to see what all the resources are doing. Having said that, if there is a huge amount of IO activity during that period, and generally speaking IO is the slowest part of the system, then yes, it is a worthy indicator (but that is all it is unless backed by stats).

Could it be coincidental that the clean-up is periodic (i.e., monthly) and the SP is reporting stats that may take into account month-end type reporting/calculations/bulk processing (etc.)?

Might still be worth having a quick look at that SP...

I would suggest you need to look at CPU activity and Memory activity as well as IO activity. There are some good articles and guides in MSDN such as :
http://technet.microsoft.com/en-au/library/ms166575(SQL.90).aspx and http://technet.microsoft.com/en-au/library/ms187830(SQL.90).aspx  and http://www.microsoft.com/technet/prodtechnol/sql/2005/tsprfprb.mspx  (going from "lightest" to "heaviest").

Performance can be a big thing, so best read up a little, get some perfmon stats. There are even "Health Check" tools from Microsoft to help you :
http://www.microsoft.com/downloads/details.aspx?familyid=eedd10d6-75f7-4763-86de-d2347b8b5f89&displaylang=en
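
A hedged starting point for gathering those resource numbers during the slow window, using the 2005-era DMVs (one possible check among many, not a complete methodology):

-- What active requests are waiting on right now
SELECT r.session_id, r.status, r.wait_type, r.wait_time,
       r.cpu_time, r.reads, r.writes, r.total_elapsed_time
FROM sys.dm_exec_requests AS r
WHERE r.session_id > 50;              -- skip system sessions

-- Cumulative file-level IO and stalls, useful for spotting an IO-bound database/file
SELECT DB_NAME(vfs.database_id) AS database_name,
       vfs.file_id, vfs.num_of_reads, vfs.num_of_writes,
       vfs.io_stall AS io_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs;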





0
 

Accepted Solution

by:
tomandlis earned 0 total points
ID: 22676729
Mark, thanks again.  I think I can rule out coincidence as the problem persists for days and then suddenly stops.  I have autostats turned on already.

I think it must be that the stats on the tables are blown away during the maintenance process.  Immediately after the process, the proc runs with one set of parameters/data, and that builds the table stats in such a way that runs with a different set of parameters/data are very slow.

The problem self-corrects as long as auto-stats is turned on, because running the proc for a while with different sets of data/parameters eventually triggers stat updates, and that results in quicker runs.

Workaround: I hardcoded three different procs, each dealing with its own set of parameters and data.  I also hardcoded each proc to its own set of calculation tables.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22680546
Just a thought, but there are considerations for auto-stats, and you might want to manage it a bit differently...

Might be worth considering specifically / deliberately updating those statistics, and possibly now turning off auto-stats.
Then use a regular procedure to update "known" statistics / query plans. Auto stats are great when building or sampling a database, but not as desirable when there is some "database maturity" in terms of size and distribution of keys...  

Also, in 2005 there is a new feature: the ALTER DATABASE option AUTO_UPDATE_STATISTICS_ASYNC, meaning the stats update fires off in the background, while the query that would have caused the auto-update uses the original (old) query plan. And on large tables, statistics only really become "out of date" once roughly 20% of the rows have changed.

So, if there is reasonable and well-distributed activity, it possibly doesn't make too much difference. If there are lots of "additions", so long as they reflect the same type of pattern, then the query plans are probably OK; if there are lots of deletes and changes, then maybe index defrags are required.

Because you notice it in an SP after a full optimise, it could be the query parameters themselves. If the parameter values of SPs change at the time of running, then it is likely that the query plan was optimised using a different statistical basis, and it therefore invokes the "heavy" auto-stats. So, changing stats to ASYNC mode will help, but in reality you probably already have an optimised plan which the query optimiser is ignoring because the parameters have changed... This is often the case when a stored procedure can be run for a user-requested period or, if null, runs for the default period. That flexibility is great, but you really need to have the stored procedure using a "known" plan regardless of parameters. It is called parameter sniffing. Yep, there is a term for it. Sometimes the recompile will help (hence the previous suggestions), but it really does not address the cardinality of the (key) values used with the parameter (because its real use is largely unknown).

Which is probably the reason why your workaround is working for you... So, it might not be a workaround so much as "better" stored procedure practice to avoid parameter sniffing.
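
Two hedged sketches of the options mentioned above; the database, procedure, parameter, and column names are placeholders:

-- 1) Let auto-update statistics fire in the background (SQL Server 2005+),
--    so the triggering query keeps its existing plan instead of waiting for StatMan:
ALTER DATABASE MyReportingDb SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- 2) A common way to blunt parameter sniffing: copy the parameter into a local
--    variable so the optimiser estimates from average density rather than the
--    sniffed value (OPTION (RECOMPILE) or OPTION (OPTIMIZE FOR ...) on the
--    sensitive statement are alternatives).
CREATE PROCEDURE dbo.usp_CacheBusinessReports
    @RunSize int
AS
BEGIN
    DECLARE @LocalRunSize int;
    SET @LocalRunSize = @RunSize;

    SELECT *
    FROM dbo.clusterYD
    WHERE i <= @LocalRunSize;   -- estimate no longer tied to the sniffed parameter value
END;
GO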
0
 

Author Comment

by:tomandlis
ID: 22685428
Thanks again, Mark. You are correct about the parameter sniffing.  The proc runs entirely differently when its parameters change--both internally (e.g., which if-else branches execute) and in the amount of data that it pushes around in the tables.
0
 

Author Comment

by:tomandlis
ID: 22737090
FYI, after more analysis I found the problem had nothing to do with StatMan or SQL Server statistics and was entirely due to a poor design decision by yours truly.  I tried to repurpose some code, and it had the unintended consequence of adding many runs to the queue.  The flaw caused the process to run 10,000 times instead of about 300 times.  Once I corrected the design flaw and unloaded the queue, the process went back to its usual run times.  How embarrassing.
0
 
LVL 51

Expert Comment

by:Mark Wills
ID: 22737156
And how very gallant of you to share and enlighten!!

The vast majority of performance issues are code/application based (see the attached chart below), and the most gratifying thing is that you not only found it, but could also fix it. Well done.
[Attached image: performance-gains.jpg]
0
