Making Sense of Index DMV's in SQL Server

There are some very powerful Dynamic Management Views (DMV's) introduced with SQL 2005.

The two in particular that we are going to discuss are sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats.

Recently, I was involved in a discussion as to what some of these views really do, and thought I might share some of that discussion in a relatively light hearted way. Considerably more light hearted than the aforementioned discussion...

Now, despite being able to find a lot of discussions about these views, these two in particular can cause confusion. Mainly due to the terminology used in naming these views. Notably, the words "index" and "usage" are a little ambiguous. You would think that those names imply what one would reasonably expect to get if using them, and, when you might use them. Think again.

At first glance they give information which doesn't quite make sense when compared side by side. One view says "this" many reads and the other says "that". One view has an index entry for a table and the other doesn't. Both have entries for tables that don't have indexes. So, what is going on ? Why are they so different ? Which happens to be the wrong question, the fact that the information doesn't match is a very important differentiator. Question should be "How can I take advantage of those difference".

The differences in the information from those views actually tell us important pieces of the index puzzle. Collectively, that information is very telling about your index designs (which will be a different post).

So, lets have a look at those views :

SYS.DM_DB_INDEX_USAGE_STATS only reports on indexes that have been used at least once since the server was last restarted. Here the word "used" really is ambiguous, because it is updated once each time the query plan is executed, not every time an indexed is actually used.

SYS.DM_DB_INDEX_OPERATIONAL_STATS reports on all indexes regardless of whether they have been used and is updated every time the storage engine executes a specific operation on the index during the execution of query plans. One could reasonably argue that this view is really the usage, because it does reflect every time an index is actually used.

Now there is a third very interesting DMV sys.dm_db_index_physical_stats, and can help identify index fragmentation. But that will be a new article.

Because operational stats more accurately reflects actual usage, it is easy to get these two confused. The way I remember how to use them is by their name (duh), but, with a small twist, as in "is my index useful" then usage, "is my index operating efficiently" then operational.

To demonstrate the differences, let's create a couple of test tables :

But first, to use the queries below it has to be in sequence, so open a new query window and paste the code. Highlight the code until you get to the next "go" and then run the queries (keyboard short cut F5).

When looking at the results, you will need to scroll right to see the columns being referred to (namely singleton_lookup_count and range_scan_count in the "operational" view). So please browse and scroll when we look at the results from the DMV's.

-- first a 'heap' table (i.e., one with no PK or clustered index)

IF object_id('tst_tbl_heap','U') is not null drop table tst_tbl_heap

CREATE TABLE tst_tbl_heap (ID int not null, Firstname varchar(60), SecurityNumber varchar(20))
go

-- now a 'clustered' table 

IF object_id('tst_tbl_indexes','U') is not null drop table tst_tbl_indexes

CREATE TABLE tst_tbl_indexes (ID int not null, Firstname varchar(60), SecurityNumber varchar(20))

ALTER TABLE tst_tbl_indexes ADD CONSTRAINT PK_tst_tbl_id PRIMARY KEY CLUSTERED(ID)
CREATE NONCLUSTERED INDEX idx_tst_tbl_SN ON tst_tbl_indexes(SecurityNumber) INCLUDE (firstname)
go

--OK, now lets see what we our views tell us:

select object_name(s.object_id),i.name, *
from sys.dm_db_index_usage_stats s 
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

-- we get nothing - why ? remember the name, are indexes useful ? Not yet...
-- but...

select object_name(s.object_id),i.name, *

from sys.dm_db_index_operational_stats (db_id(),0,NULL,0) s     -- NB: needs parameters !
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

-- does show us our two indexes, and even shows us a row for our table without an index. 
-- So, it is not just for indexes, despite the DMV name saying "indexes" huh ! But, do
-- you remember our name ? it is operational, but are they working ? the counts are zero.

-- lets now add some data and see what happens...

INSERT tst_tbl_heap(ID,FirstName,SecurityNumber) values (1,'Mark1','1234567')
INSERT tst_tbl_indexes(ID,FirstName,SecurityNumber) values (1,'Mark1','1234567')
go

-- now lets look again at our DMV's

select object_name(s.object_id),i.name, *
from sys.dm_db_index_usage_stats s 
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')

-- Now we get something ! and again, not just for indexes (why call them Index DMV)
-- Similarly for below we also return information, but we were already expecting that

select object_name(s.object_id),i.name, *
from sys.dm_db_index_operational_stats (db_id(),0,NULL,0) s
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

Because our focus is on Indexes, then our Heap Table has served it's purpose (i.e., non-indexed tables show up).

So, we wont worry about the heap table any more.

Few pointers if you want to continue to play with the Heap Table (please do)

1) Heaps will have an index number of 0 and no name.

2) the column stats are a bit different because they have no indexes

3) Usage stats are well populated regardless of being Heap.

4) Look in operational stats for the leaf_%_count columns, and row_lock% etc.

5) the lesson was that even a table without indexes is getting in those views.

6) some of the assertions or assumptions made below will apply, but take care.

7) Heaps with lots of writes over time might become fragmented.

8) Heaps with lots of reads might (especially if scans) benefit from indexing.

So, we say farewell to the humble heap and turn our attention to our index table

-- Now lets do another insert into our Index Table, this time, two rows at a time.

INSERT tst_tbl_indexes(ID,FirstName,SecurityNumber) 
SELECT 2 as id,'Mark2' as firstname,'2234567' as SecurityNumber union all
SELECT 3 as id,'Mark3' as firstname,'3234567' as SecurityNumber 

-- now lets look again at our DMV's

select object_name(s.object_id),i.name, *
from sys.dm_db_index_usage_stats s 
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')

-- we see usage stats increases user_updates by 1, i.e., now 2
-- that makes perfect sense because we have run 2 inserts right ?

select object_name(s.object_id),i.name, *
from sys.dm_db_index_operational_stats (db_id(),0,NULL,0) s
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

-- we see operational stats increases leaf_insert_count by 2, 
-- the reason being is that the last insert added 2 rows. 
-- the actual number of index operations from two inserts is now 3.

-- So, lets now see what happens with a select and see what the DMV's say

SELECT * from tst_tbl_indexes where ID in (1,2,3)

select object_name(s.object_id),i.name, *
from sys.dm_db_index_usage_stats s 
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')

-- for usage stats we get user_seeks=1 on the PK

select object_name(s.object_id),i.name, *
from sys.dm_db_index_operational_stats (db_id(),0,NULL,0) s
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

-- But our operational stats show singleton_lookup_count = 3
-- and also, we see a range_scan_count of 1

-- so let's now try another query, on say the other index...

SELECT * from tst_tbl_indexes where securitynumber = '1234567'

select object_name(s.object_id),i.name, *
from sys.dm_db_index_usage_stats s 
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')

-- we get user_seeks=1 on our index, but also a user_lookup on PK

select object_name(s.object_id),i.name, *
from sys.dm_db_index_operational_stats (db_id(),0,NULL,0) s
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

-- we now get range_scan_count = 2 on our idx, but no singleton_lookup count.
-- why does one DMV say scan, and the other say seek ? Another difference (there is a reason).
-- if it were a unique key on that column, we would in fact get a singleton_lookup.

SELECT securitynumber from tst_tbl_indexes where firstname = 'mark3'

select object_name(s.object_id),i.name, *
from sys.dm_db_index_usage_stats s 
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')

-- we get user_scans=1 on our index 

select object_name(s.object_id),i.name, *
from sys.dm_db_index_operational_stats (db_id(),0,NULL,0) s
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

-- This time we get range_scan_count on idx = 4, 
-- now the scan counts make no sense in comparing them at all.
-- lets repeat that same query, this time all columns, but first,
-- if you scroll to the right on operational_stats, take note of row and page locks

SELECT * from tst_tbl_indexes where firstname = 'mark3'
go

select object_name(s.object_id),i.name, *
from sys.dm_db_index_usage_stats s 
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

select object_name(s.object_id),i.name, *
from sys.dm_db_index_operational_stats (db_id(),0,NULL,0) s
inner join sys.indexes i on s.object_id=i.object_id and i.index_id=s.index_id
where object_name(s.object_id) in ('tst_tbl_indexes','tst_tbl_heap')
go

-- now we get range_scan_count and user_scans both incremented by 1
-- might even suggest that a select * is not always a bad thing :)
-- and those row locks ? didn't change - another why - maybe it was all in index ?
-- we have finished for now - plenty more could be done, but I think
-- we have demonstrated enough that the two views are quite different
-- So, now we can finally clean up those tables and move into our supposition stage.

IF object_id('tst_tbl_heap','U') is not null drop table tst_tbl_heap
IF object_id('tst_tbl_indexes','U') is not null drop table tst_tbl_indexes
go

Now we could go on and start to examine all the various activities like updates and deletes, and seeks, and more. But, I am not going to. Why? Because we set out to show the differences in those DMV views, and to show that they are not restricted to Indexes. Besides, explaining what all those counts actually mean, and answering a few of those "why" questions, will make for a good Articles for next time !

Instead, let us now consider what we have seen and can summarise from the above activity.

SYS.DM_DB_INDEX_USAGE_STATS

It is cleared when service starts / reboots etc. So keep that firmly in mind.

Good to help identify if an index is used - no entry unless it is.

Good to help identify if a table is used - no entry unless it is.

Has dates to help identify when indexes and tables were last used.

Has counts to help identify frequency of use.

If machine has been up for the entire period that encapsulates all usage, we can highlight unused indexes and consider their removal.

Can look at read / write ratios to determine general activity (mainly read, written etc).

SYS.DM_DB_INDEX_OPERATIONAL_STATS

Exists when table / indexes are created

Shows volumes of activity (plan, rows, pages), not just an instance of a statement.

Far more detailed to help identify what type of activity

Shows row_lock counters indicating lock contention

Shows latch counters showing lower level IO contention

Can measure the cost of having an index, or missing one (scans)

Can start to ascertain the cost of an index by looking at volume of inserts over the type of activity (e.g. scans, but lack of read in the usage_stats).

If considering changing indexes, then you really do need to consider long and hard before you do. While some will stand out fairly clearly, such as written, but not read, you do need to consider the overall impact, and should be profiled, run on a test database, run with the actual execution plans, and consider what the Query Optimiser might be doing if the index is not there, or, the cost of an index if you were to add it. Over indexing can be counter productive.

Armed with the above information, looking at DMV's activity, you can soon work out where possible areas of further inspection are.

Now go read books on line for a more detailed understanding of what each column is saying...

http://msdn.microsoft.com/en-us/library/ms188755.aspx

http://msdn.microsoft.com/en-us/library/ms174281.aspx

Also, you are encouraged to go back and run all the queries again. This time, stop and look at the actual execution plan for the Inserts and Select from our indexed table. The actual execution plan will show you the Clustered Seek, Scans etc. The numbers will make more sense to you. Additionally, there are various postings through the Internet which delight in their technical dexterity as to what each column means.

Making Sense of Index DMV's in SQL Server

Comments (0)