plq
asked on
Create Statistics - how to decide what to create
I raised another thread yesterday (still ongoing) where angell gave me the idea of using filtered indexing in sql 2008. Thats going well.
I would like to get some understanding of how you decide which statistics to create. Here's some thoughts on possible strategies:
1. Just blindly run database engine tuning advisor and create the stats it suggests. When I previously followed this approach with indexes I ended up with mostly fast queries but sql was sometimes using the wrong indexes and the index size was abou 12GB. Now my index size is about 3.5GB as I created all the indexes manually - now all my queries are subsecond but I distrust database engine tuning advisor for those reasons.
2. Create statistics covering every column mentioned in each index - so the stats become a copy of the indexes effectively
3. Create statistics covering every column mentioned in each query - so the stats list off the columns in each sql statement in your "workfile"
4. Some other strategy not listed above ?
I just need some understanding of how you guys approach the problem of which stats to create. How do you decide ??
thanks
Paul
I would like to get some understanding of how you decide which statistics to create. Here's some thoughts on possible strategies:
1. Just blindly run database engine tuning advisor and create the stats it suggests. When I previously followed this approach with indexes I ended up with mostly fast queries but sql was sometimes using the wrong indexes and the index size was abou 12GB. Now my index size is about 3.5GB as I created all the indexes manually - now all my queries are subsecond but I distrust database engine tuning advisor for those reasons.
2. Create statistics covering every column mentioned in each index - so the stats become a copy of the indexes effectively
3. Create statistics covering every column mentioned in each query - so the stats list off the columns in each sql statement in your "workfile"
4. Some other strategy not listed above ?
I just need some understanding of how you guys approach the problem of which stats to create. How do you decide ??
thanks
Paul
I suggest you use "Missing Indexes" from the excellent SQL tool Performance Dashboard reports which works on SQL 2008 as well:
http://blogs.technet.com/b/rob/archive/2009/02/18/performance-dashboard-reports-for-sql-server-2008.aspx
http://blogs.msdn.com/b/vascov/archive/2008/09/30/using-performance-dashboard-with-sql-server-2008.aspx
http://blogs.technet.com/b/rob/archive/2009/02/18/performance-dashboard-reports-for-sql-server-2008.aspx
http://blogs.msdn.com/b/vascov/archive/2008/09/30/using-performance-dashboard-with-sql-server-2008.aspx
Besides all the above, indexes needs maintenance(REBUILD is an ON-LINE operation) and stats needs refresh so you can schedule either a
exec sp_updatestats;
at db level or
update statistics table_name;
at table level issue
http://blogs.msdn.com/b/jorgepc/archive/2007/12/09/how-to-check-fragmentation-on-sql-server-2005.aspx
exec sp_updatestats;
at db level or
update statistics table_name;
at table level issue
http://blogs.msdn.com/b/jorgepc/archive/2007/12/09/how-to-check-fragmentation-on-sql-server-2005.aspx
ASKER
These indexes are complex, supporting multiple column accesses and very complex query plans.
So would it therefore not be prudent to create statictics to help sql server ensure that it selects the correct index for each query ?
The only stats which sql creates for itself are single column stats - not useful when you have multiple indices where the first columns are identical ?
I appreciate your help but i feel that the above does not completely answer the question.
What I really need is to understand how and when a dba decides to specifically create stats and what the criteria they use are for supporting that decision.
thanks
So would it therefore not be prudent to create statictics to help sql server ensure that it selects the correct index for each query ?
The only stats which sql creates for itself are single column stats - not useful when you have multiple indices where the first columns are identical ?
I appreciate your help but i feel that the above does not completely answer the question.
What I really need is to understand how and when a dba decides to specifically create stats and what the criteria they use are for supporting that decision.
thanks
I have auto create statistics on in all my database, Also i relay on Actual execution plan and used to check the Actual rows returned and the estimated rows returned, If they differ I used to update the statistics or create statistics on the filtered\Join columns. Also most of the time Actual execution plan gives you a warning when a useful statistics were not found.
Preferably you should have statistics on the aggregate columns and group by columns, Join columns and Filtered columns.
Remember Statistics are not used efficiently if you use function on your columns.
Preferably you should have statistics on the aggregate columns and group by columns, Join columns and Filtered columns.
Remember Statistics are not used efficiently if you use function on your columns.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
thanks for helping. Turns out sql creates multi column stats for columns covered by an index, so in my case with the right indexes in place additional stats were not necessary.
Dont ALWAYS trust the DTA, It may suggest you the duplicate indexes, before creating the indexes check whether suggested index keys already exists in the table.