sysindexes - individual index size

anushahanna
anushahanna used Ask the Experts™
on
what exactly is it about the old sysindexes that makes it hard to derive the individual index' sizes? (not the size of all the indexes in a table, but rather each index)

thanks
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Awarded 2008
Awarded 2008
Commented:
afaik, all data for sysindexes I'm 2000 only relates to the size of the clustered index

Author

Commented:
was it by design, or was some info the info that is in sys.indexes not available in sysindexes?
Mark WillsTopic Advisor, Page Editor
Distinguished Expert 2018
Commented:
Well, sysindexes is not really used for size. It is more about structure.

It is not just for clustered indexes, there are three different types of information in sysindexes....

In 'indid' we get the ID of the index:
0 = Heap    (though think this is pretty much hidden in pre 2005)
1 = Clustered index
>1 = Nonclustered index
also in 2000 (pre 2005) each table that had at least one text, ntext, or image column also had a row in sysindexes with indid = 255

What we can glean  from sysindexes are things like number of pages, partitions, keys etc.

In terms of size, it is extremely difficult because there are quite a few different factors like fill-factor, dupes, row_overflow and  index / page fragmentation, and variability of data size it self with variable sized fields. What is does do it give an indication with minimum and maximum row sizes and so on.

That is why in 2005 (and later) there was a fair bit of work in the DMV views to retrieve more appropriate / more accurate management views.

But, indexes are usually a reflection of performance and business requirement, and as such, not necessarily measured in terms of size as much as the results they provide. Interesting question...
Acronis in Gartner 2019 MQ for datacenter backup

It is an honor to be featured in Gartner 2019 Magic Quadrant for Datacenter Backup and Recovery Solutions. Gartner’s MQ sets a high standard and earning a place on their grid is a great affirmation that Acronis is delivering on our mission to protect all data, apps, and systems.

Author

Commented:
>>it is extremely difficult because there are quite a few different factors like fill-factor, dupes, row_overflow and  index / page fragmentation, and variability of data size it self with variable sized fields. What is does do it give an indication with minimum and maximum row sizes and so on.

are these all addressed in sys.indexes in the newer version?

thanks for your beautiful observation about the purpose rather than the cost of indexes.
"indexes are usually a reflection of performance and business requirement, and as such, not necessarily measured in terms of size as much as the results they provide"
Topic Advisor, Page Editor
Distinguished Expert 2018
Commented:
Kind of...

But not really and truly. sys.dm_db_index_physical_stats dynamic management function does show things like min row size, max row size, number of pages, number of fragments, and a lot more. Still, it will not show you the absolute disk space used due to the index.

The idea of the index is their efficiency, and so what we do see is more granularity and details around things like sys.dm_db_index_usage_stats for number of seeks, inserts, etc. We use those to help ascertain the effectiveness and overall efficiency of indexes. If they do not perform then they should be dropped and that will save space.

Probably the hardest to measure are the covering indexes where you can "include" columns. Those included columns will be copied to the index page and so use more pages therefore more space consumed / size needed. Now if one of those included columns is a highly volatile column, or if a highly changeable column is used in an index then that is going to lead to new pages (at least new entries on pages) with various pointers to show the new sequence. The old page is still "consumed" so to speak until such time that your release or de-fragment those indexes, and then release some space.  So, what we end up trying to measure is more about page structure with things like : in-row data, LOB data, and row-overflow data.

So, it is all bound in together and really the measure should be performance levels to warrrant the index. We also have DMV's for missing indexes where stats are built up and you can draw various conclusions about the merits (in terms of performance) to include the index or not.

In terms of calculation the actual disk space, then you can probably work it out manually but you will be doing things like averaging the size of varchar's  and making certain assumptions about how fragemented the index might become (mainly due to out of sequence inserts, or, fill factors, or, changing data etc).

However....

There are a couple of other measures that are good indicators of space used and space reserved....

There is obviously the stored procedure which shows total space for indexes : EXEC sp_spaceused <tablename>

But for individual indexes, we then need to start looking at partitions. So we can use another new DMV : sys.dm_db_partition_stats. Now, there is a bit of work to calculate the exact pages. There is the counter of used_page_count but, also need to consider the type of entry - e.g. heap tables will be in the partition view, and clustered indexes as well - and we know they store dta on those pages, so, have to exclude the data elements for those ie subtract : in_row_used_page_count + lob_used_page_count + row_overflow_used_page_count. Then there are things like pages used for LOB data, so, if an index (not heap or clustered which has index_id = 0 and  1 respectively so look for index_id >=2) and those data elements can be calculated as lob_used_page_count + row_overflow_used_page_count.

So, it does get messy, and really what is it all telling you ?

There is also a secret / undocumented Sp and I dont normally recommend them, but because it is you :) that is: EXEC sp_MSIndexSpace <tablename>  but then for things like primary keys, it will exclude actual data, and yet, if clustered, the data will be held on the page. But still probably the best indicator for your purposes.

Might like to have a quick glance (and vote) http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL-Server-2005/A_651-Making-Sense-of-Index-DMV's-in-SQL-Server.html

Author

Commented:
Mark, I can truly say you have a ph.D in index understanding! are you working on SS from SQL 6.5?!!

give me a couple of days more.. i will try to digest a little more of what you are saying..

thanks very much for graciously sharing deep knowledge..

Author

Commented:
Mark, you got my vote. Thanks for that nice small examples comparing the 2 DMVs.

in relation to the above, can you please comment on the code in post # 26408922. Do you think that is accurate?
http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SQL-Server-2005/Q_25086720.html 

Author

Commented:
Mark, I'll close the Q; but if you would like add any comments to it, I will check it later.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial