Solved

Are Filtered Indexes Really that Bad?

Posted on 2014-09-21
14
140 Views
Last Modified: 2014-09-22
I have a what I believe is a strong case for using a filtered index. Based on a simple select statement there would be a 50% savings in the number of rows queried/stored. DBCC SHOW_STATISTICS makes a much stronger case. The table has 5 steps in the histogram and based on EQ_ROWS the savings might be as much as 80%. The statistics are current.

There are a lot of caveats with the SET options I am reading and since the entire code base of the application is not available I feel like the possibility of failed inserts is risky with practically no knowledge of the clients connectivity options.

I am interested in knowing whether there is an easy or even worthwhile method to determine the connection SET options for a particular connection.
0
Comment
Question by:phil916
  • 7
  • 2
  • 2
  • +2
14 Comments
 
LVL 45

Expert Comment

by:Vitor Montalvão
Comment Utility
Do you want to share with us your case?
Maybe we can point you to another solution or give you a better opinion on what you are trying to do.
0
 
LVL 24

Expert Comment

by:Tomas Helgi Johannsson
Comment Utility
Hi!

Creating index either a full index og filtered index is far more better than having a query doing a tablescan.
I personally would recommend a full index on a table. Filtered index would just address a subset of the data and as such consume far less storage while full index would address all data of a table and as such increase performance of a query for all data not just subset of it as filtered index would.

Regards,
     Tomas Helgi
0
 

Author Comment

by:phil916
Comment Utility
Vitor Montalvão

Isn't the where clause unsargeable?

SELECT
Field1
,Field2
,Field3
FROM Table1 AS T1
INNER JOIN Table2 AS T2
ON T1.dbID = T2.dbID -- PK
AND T1.UnitNumID = T2.NumberID -- PK
AND T1.Account = T2.Account -- PK
AND T2.[Type] = 'C'
AND T2.Reverse IS NULL
WHERE TD2.OrderDateTime BETWEEN DATEADD(dd, DATEDIFF(dd, 0, @StartDate), 0)
AND DATEADD(dd, DATEDIFF(dd, 0, @EndDate), 0)

Open in new window


Proposed filtered index: Type predicate reduces number of records by 50-80%. Reverse predicate < 1%

CREATE NONCLUSTERED INDEX ix_MyIndex
ON Table2 (OrderDateTime )
WHERE [type] = 'C'
AND Reverse IS NULL

Open in new window


Tomas Helgi
Creating index either a full index og filtered index is far more better than having a query doing a tablescan.
Agreed and a CI scan is what it is doing without an index.

I personally would recommend a full index on a table. Filtered index would just address a subset of the data and as such consume far less storage while full index would address all data of a table and as such increase performance of a query for all data not just subset of it as filtered index would.

This seems contradictory but I see your point. I can only defend my position in that I have 23 stored procedures and do not anticipate more. 17 of 23 would benefit from the filtered index in some way. The other 6 will not use either index.
0
 
LVL 45

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 100 total points
Comment Utility
Good, you are searching by date.
Have you thought about using table partition? I think it's better than filtered indexes, specially if your table has millions of rows.
0
 
LVL 142

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 200 total points
Comment Utility
I would also try to go the filtered index, unless you could partition the table according to the same conditions, hence resulting "those" queries to go into 1 single partition... (and have the index on only that single partition)
http://technet.microsoft.com/en-us/library/ms187526%28v=sql.105%29.aspx
just an idea ...
0
 

Author Comment

by:phil916
Comment Utility
OK similar advice about the partition I'll have to read up on it.

So even though I have created the index as specified and the example select is representative of the first from /join. There are more joins. The query optimizer is still doing a  clustered index scan. I added some includes suggested by the output list. For that particular batch the missing index is suggesting an index on an entirely different table. On the scan of Table2 there is a big difference in the estimated vs actual rows. 120 K est 15K act. This could make the optimizer pick the scan over the index right?
0
 
LVL 24

Expert Comment

by:Tomas Helgi Johannsson
Comment Utility
Hi!

If you have large table then partitioning as well as having good indexes (partitioned as well) would be
increase your query performance if you use the partition columns in your where clauses of your queries
as the database would be able to do partition pruning (skip searching partitions that does not meet the
where clause criteria  )

Regards,
     Tomas Helgi
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 

Author Comment

by:phil916
Comment Utility
Success!

Who would have thought the best way to troubleshoot why the optimizer does not use an index is to hint it and find out what it thinks are the higher cost operations. It also helps me learn. It was doing a lot of key lookups of columns I would not have guessed it would need. So I added some includes to cover those and then found that it wanted an additional seek predicate.  It wanted my Reverse IS NULL. I thought adding it to the filter would take care of it but it did not. Adding it as an index key work. Logical reads dropped from 138,919 to 410. Without the filter the read remains high at around 134K.

So the original question is still valid. The filtered index has now obvious benefits. What do I do about all of the dire warnings of failed inserts?
0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 200 total points
Comment Utility
I'm not sure who told you that filtered indexes were "bad", but they are deeply mistaken.  FIs are one of the great features of SQL Server.  They are particularly useful when a row has a certain status before being processed, and that value is used as a flag to trigger processing, then the status is reset after processing.

You might have to use the dependencies table to track down any references to the underlying table and make sure the SET options are OK.  For newer versions of SQL Server, those would mostly be the default options anyway.  But for stored procs, you'd want to verify the settings.  If they're not right, you could recreate the proc with appropriate settings specified at proc creation and/or first thing within the proc itself, depending on the specific setting.
0
 

Author Comment

by:phil916
Comment Utility
ScottPletcher -  They were described as great tools for query performance tuning but oh by the way factors, which in my case beyond my control, can make inserts fail. So to me, without a way to monitor for this they became "bad".

Every proc has SET NOCOUNT ON, SET ANSI_NULLS ON and SET QUOTED_IDENTIFIERS ON explicitly. But those are just the queries I use. There are others that do the inserts update and deletes and I do not know what they have. Are my settings enough? I was led to believe that the problem occurred when another connection with incompatible settings  did the insert and only that connection would received the error.
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
To be safe, you could add the require SETtings at the start of the proc:

CREATE PROCEDURE ...
AS
SET NOCOUNT ON;
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT  ON
SET CONCAT_NULL_YIELDS_NULL ON
SET NUMERIC_ROUNDABORT OFF

...rest of proc...
GO

It would be relatively easy to dynamically add that to the start of each proc, if it didn't already have it.

Those are fairly typical default settings anyway.  I use filtered indexes wherever appropriate, and I didn't go thru procs first and haven't had any errors.
0
 

Author Comment

by:phil916
Comment Utility
I checked the discrepancies on the table and the stored procedures all have:

SET ANSI_NULLS ON and SET QUOTED_IDENTIFIERS ON

So is this "safe"?
0
 

Author Comment

by:phil916
Comment Utility
Sorry I did not refresh before I posted. I can't add the others just the two mentioned in my last post.
0
 

Author Closing Comment

by:phil916
Comment Utility
Thanks gentleman
0

Featured Post

Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

Join & Write a Comment

Confronted with some SQL you don't know can be a daunting task. It can be even more daunting if that SQL carries some of the old secret codes used in the Ye Olde query syntax, such as: (+)     as used in Oracle;     *=     =*    as used in Sybase …
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

8 Experts available now in Live!

Get 1:1 Help Now