Solved

Are Filtered Indexes Really that Bad?

Posted on 2014-09-21
14
150 Views
Last Modified: 2014-09-22
I have a what I believe is a strong case for using a filtered index. Based on a simple select statement there would be a 50% savings in the number of rows queried/stored. DBCC SHOW_STATISTICS makes a much stronger case. The table has 5 steps in the histogram and based on EQ_ROWS the savings might be as much as 80%. The statistics are current.

There are a lot of caveats with the SET options I am reading and since the entire code base of the application is not available I feel like the possibility of failed inserts is risky with practically no knowledge of the clients connectivity options.

I am interested in knowing whether there is an easy or even worthwhile method to determine the connection SET options for a particular connection.
0
Comment
Question by:phil916
  • 7
  • 2
  • 2
  • +2
14 Comments
 
LVL 48

Expert Comment

by:Vitor Montalvão
ID: 40336047
Do you want to share with us your case?
Maybe we can point you to another solution or give you a better opinion on what you are trying to do.
0
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 40336259
Hi!

Creating index either a full index og filtered index is far more better than having a query doing a tablescan.
I personally would recommend a full index on a table. Filtered index would just address a subset of the data and as such consume far less storage while full index would address all data of a table and as such increase performance of a query for all data not just subset of it as filtered index would.

Regards,
     Tomas Helgi
0
 

Author Comment

by:phil916
ID: 40336333
Vitor Montalvão

Isn't the where clause unsargeable?

SELECT
Field1
,Field2
,Field3
FROM Table1 AS T1
INNER JOIN Table2 AS T2
ON T1.dbID = T2.dbID -- PK
AND T1.UnitNumID = T2.NumberID -- PK
AND T1.Account = T2.Account -- PK
AND T2.[Type] = 'C'
AND T2.Reverse IS NULL
WHERE TD2.OrderDateTime BETWEEN DATEADD(dd, DATEDIFF(dd, 0, @StartDate), 0)
AND DATEADD(dd, DATEDIFF(dd, 0, @EndDate), 0)

Open in new window


Proposed filtered index: Type predicate reduces number of records by 50-80%. Reverse predicate < 1%

CREATE NONCLUSTERED INDEX ix_MyIndex
ON Table2 (OrderDateTime )
WHERE [type] = 'C'
AND Reverse IS NULL

Open in new window


Tomas Helgi
Creating index either a full index og filtered index is far more better than having a query doing a tablescan.
Agreed and a CI scan is what it is doing without an index.

I personally would recommend a full index on a table. Filtered index would just address a subset of the data and as such consume far less storage while full index would address all data of a table and as such increase performance of a query for all data not just subset of it as filtered index would.

This seems contradictory but I see your point. I can only defend my position in that I have 23 stored procedures and do not anticipate more. 17 of 23 would benefit from the filtered index in some way. The other 6 will not use either index.
0
Back Up Your Microsoft Windows Server®

Back up all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

 
LVL 48

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 100 total points
ID: 40336346
Good, you are searching by date.
Have you thought about using table partition? I think it's better than filtered indexes, specially if your table has millions of rows.
0
 
LVL 143

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 200 total points
ID: 40336348
I would also try to go the filtered index, unless you could partition the table according to the same conditions, hence resulting "those" queries to go into 1 single partition... (and have the index on only that single partition)
http://technet.microsoft.com/en-us/library/ms187526%28v=sql.105%29.aspx
just an idea ...
0
 

Author Comment

by:phil916
ID: 40336421
OK similar advice about the partition I'll have to read up on it.

So even though I have created the index as specified and the example select is representative of the first from /join. There are more joins. The query optimizer is still doing a  clustered index scan. I added some includes suggested by the output list. For that particular batch the missing index is suggesting an index on an entirely different table. On the scan of Table2 there is a big difference in the estimated vs actual rows. 120 K est 15K act. This could make the optimizer pick the scan over the index right?
0
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 40336451
Hi!

If you have large table then partitioning as well as having good indexes (partitioned as well) would be
increase your query performance if you use the partition columns in your where clauses of your queries
as the database would be able to do partition pruning (skip searching partitions that does not meet the
where clause criteria  )

Regards,
     Tomas Helgi
0
 

Author Comment

by:phil916
ID: 40336630
Success!

Who would have thought the best way to troubleshoot why the optimizer does not use an index is to hint it and find out what it thinks are the higher cost operations. It also helps me learn. It was doing a lot of key lookups of columns I would not have guessed it would need. So I added some includes to cover those and then found that it wanted an additional seek predicate.  It wanted my Reverse IS NULL. I thought adding it to the filter would take care of it but it did not. Adding it as an index key work. Logical reads dropped from 138,919 to 410. Without the filter the read remains high at around 134K.

So the original question is still valid. The filtered index has now obvious benefits. What do I do about all of the dire warnings of failed inserts?
0
 
LVL 69

Accepted Solution

by:
Scott Pletcher earned 200 total points
ID: 40336858
I'm not sure who told you that filtered indexes were "bad", but they are deeply mistaken.  FIs are one of the great features of SQL Server.  They are particularly useful when a row has a certain status before being processed, and that value is used as a flag to trigger processing, then the status is reset after processing.

You might have to use the dependencies table to track down any references to the underlying table and make sure the SET options are OK.  For newer versions of SQL Server, those would mostly be the default options anyway.  But for stored procs, you'd want to verify the settings.  If they're not right, you could recreate the proc with appropriate settings specified at proc creation and/or first thing within the proc itself, depending on the specific setting.
0
 

Author Comment

by:phil916
ID: 40337140
ScottPletcher -  They were described as great tools for query performance tuning but oh by the way factors, which in my case beyond my control, can make inserts fail. So to me, without a way to monitor for this they became "bad".

Every proc has SET NOCOUNT ON, SET ANSI_NULLS ON and SET QUOTED_IDENTIFIERS ON explicitly. But those are just the queries I use. There are others that do the inserts update and deletes and I do not know what they have. Are my settings enough? I was led to believe that the problem occurred when another connection with incompatible settings  did the insert and only that connection would received the error.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 40337294
To be safe, you could add the require SETtings at the start of the proc:

CREATE PROCEDURE ...
AS
SET NOCOUNT ON;
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT  ON
SET CONCAT_NULL_YIELDS_NULL ON
SET NUMERIC_ROUNDABORT OFF

...rest of proc...
GO

It would be relatively easy to dynamically add that to the start of each proc, if it didn't already have it.

Those are fairly typical default settings anyway.  I use filtered indexes wherever appropriate, and I didn't go thru procs first and haven't had any errors.
0
 

Author Comment

by:phil916
ID: 40337442
I checked the discrepancies on the table and the stored procedures all have:

SET ANSI_NULLS ON and SET QUOTED_IDENTIFIERS ON

So is this "safe"?
0
 

Author Comment

by:phil916
ID: 40337480
Sorry I did not refresh before I posted. I can't add the others just the two mentioned in my last post.
0
 

Author Closing Comment

by:phil916
ID: 40338080
Thanks gentleman
0

Featured Post

How our DevOps Teams Maximize Uptime

Our Dev teams are like yours. They’re continually cranking out code for new features/bugs fixes, testing, deploying, responding to production monitoring events and more. It’s complex. So, we thought you’d like to see what’s working for us. Read the use case whitepaper.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

If you have heard of RFC822 date formats, they can be quite a challenge in SQL Server. RFC822 is an Internet standard format for email message headers, including all dates within those headers. The RFC822 protocols are available in detail at:   ht…
I have a large data set and a SSIS package. How can I load this file in multi threading?
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.

790 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question