Solved

Are Filtered Indexes Really that Bad?

Posted on 2014-09-21
153 Views
Last Modified: 2014-09-22
I have what I believe is a strong case for using a filtered index. Based on a simple SELECT statement there would be a 50% savings in the number of rows queried/stored. DBCC SHOW_STATISTICS makes an even stronger case: the table has 5 steps in the histogram, and based on EQ_ROWS the savings might be as much as 80%. The statistics are current.
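
(As a hedged aside, not part of the original question: the histogram can be inspected roughly like this; "ix_Existing" is a placeholder for the actual statistics or index name on Table2.)

-- Show the histogram steps for a statistics object on Table2.
-- "ix_Existing" is a hypothetical name; substitute the real statistics/index name.
DBCC SHOW_STATISTICS ('Table2', ix_Existing) WITH HISTOGRAM;
-- EQ_ROWS in the output estimates the rows equal to each RANGE_HI_KEY step,
-- which is where a savings estimate like the 80% above would come from.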

There are a lot of caveats about the required SET options in what I am reading, and since the entire code base of the application is not available to me, the possibility of failed inserts feels risky, especially with practically no knowledge of the clients' connection options.

I am interested in knowing whether there is an easy, or even worthwhile, method of determining the SET options for a particular connection.
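
(Another hedged aside: one way to see the SET options that active connections are actually running with is the sys.dm_exec_sessions DMV, which exposes them as bit columns. A minimal sketch:)

-- 1 = ON, 0 = OFF for each option; filtered indexes care about these settings.
SELECT
    session_id,
    program_name,
    ansi_nulls,
    ansi_padding,
    ansi_warnings,
    arithabort,
    concat_null_yields_null,
    quoted_identifier
FROM sys.dm_exec_sessions
WHERE is_user_process = 1;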
Question by:phil916
14 Comments
 
LVL 50

Expert Comment

by:Vitor Montalvão
ID: 40336047
Do you want to share your case with us?
Maybe we can point you to another solution or give you a better-informed opinion on what you are trying to do.
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 40336259
Hi!

Creating an index, either a full index or a filtered index, is far better than having a query do a table scan.
I personally would recommend a full index on the table. A filtered index addresses only a subset of the data and as such consumes far less storage, while a full index addresses all of the data in the table and so improves query performance for all of the data, not just the subset that a filtered index would cover.

Regards,
     Tomas Helgi
 

Author Comment

by:phil916
ID: 40336333
Vitor Montalvão

Isn't the WHERE clause non-sargable?

SELECT
    Field1
    ,Field2
    ,Field3
FROM Table1 AS T1
INNER JOIN Table2 AS T2
    ON  T1.dbID      = T2.dbID      -- PK
    AND T1.UnitNumID = T2.NumberID  -- PK
    AND T1.Account   = T2.Account   -- PK
    AND T2.[Type] = 'C'
    AND T2.Reverse IS NULL
WHERE T2.OrderDateTime BETWEEN DATEADD(dd, DATEDIFF(dd, 0, @StartDate), 0)
                           AND DATEADD(dd, DATEDIFF(dd, 0, @EndDate), 0)



Proposed filtered index: the Type predicate reduces the number of records by 50-80%; the Reverse predicate by less than 1%.

CREATE NONCLUSTERED INDEX ix_MyIndex
ON Table2 (OrderDateTime)
WHERE [Type] = 'C'
  AND Reverse IS NULL;



Tomas Helgi
Creating an index, either a full index or a filtered index, is far better than having a query do a table scan.
Agreed, and a clustered index (CI) scan is what it is doing without an index.

I personally would recommend a full index on the table. A filtered index addresses only a subset of the data and as such consumes far less storage, while a full index addresses all of the data in the table and so improves query performance for all of the data, not just the subset that a filtered index would cover.

This seems contradictory, but I see your point. I can only defend my position in that I have 23 stored procedures and do not anticipate more. 17 of the 23 would benefit from the filtered index in some way; the other 6 will not use either index.
 
LVL 50

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 100 total points
ID: 40336346
Good, you are searching by date.
Have you thought about using table partitioning? I think it's better than filtered indexes, especially if your table has millions of rows.
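
(A minimal sketch of what date-based partitioning could look like here; the function, scheme, boundary values, and filegroup mapping are all hypothetical and would need to be fitted to the real table.)

-- Hypothetical monthly partition boundaries on OrderDateTime.
CREATE PARTITION FUNCTION pf_OrderDate (datetime)
AS RANGE RIGHT FOR VALUES ('2014-07-01', '2014-08-01', '2014-09-01');

-- Map every partition to PRIMARY for simplicity.
CREATE PARTITION SCHEME ps_OrderDate
AS PARTITION pf_OrderDate ALL TO ([PRIMARY]);

-- The table or an index can then be built on the scheme, e.g.:
-- CREATE NONCLUSTERED INDEX ix_MyIndex ON Table2 (OrderDateTime) ON ps_OrderDate (OrderDateTime);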
 
LVL 143

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 200 total points
ID: 40336348
I would also try to go with the filtered index, unless you could partition the table according to the same conditions, so that "those" queries go to a single partition... (and have the index on only that single partition)
http://technet.microsoft.com/en-us/library/ms187526%28v=sql.105%29.aspx
just an idea ...
 

Author Comment

by:phil916
ID: 40336421
OK, similar advice about partitioning; I'll have to read up on it.

So even though I have created the index as specified (and the example SELECT is representative of the first FROM/JOIN; there are more joins), the query optimizer is still doing a clustered index scan. I added some INCLUDE columns suggested by the output list. For that particular batch, the missing-index suggestion is for an index on an entirely different table. On the scan of Table2 there is a big difference between the estimated and actual rows: 120K estimated vs. 15K actual. This could make the optimizer pick the scan over the index, right?
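
(A hedged sketch of the index at this stage; the INCLUDE column names are placeholders, since the actual output list is not shown in the thread.)

-- Filtered index with INCLUDE columns to cover the output list and avoid key lookups.
-- IncludeCol1/IncludeCol2 are hypothetical stand-ins for the real output columns.
CREATE NONCLUSTERED INDEX ix_MyIndex
ON Table2 (OrderDateTime)
INCLUDE (IncludeCol1, IncludeCol2)
WHERE [Type] = 'C'
  AND Reverse IS NULL
WITH (DROP_EXISTING = ON);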
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 40336451
Hi!

If you have a large table, then partitioning, along with good (also partitioned) indexes, would increase your query performance if you use the partition columns in the WHERE clauses of your queries, as the database would then be able to do partition pruning (skip searching partitions that do not meet the WHERE clause criteria).

Regards,
     Tomas Helgi
 

Author Comment

by:phil916
ID: 40336630
Success!

Who would have thought the best way to troubleshoot why the optimizer does not use an index is to hint it and find out what it thinks the higher-cost operations are. It also helps me learn. It was doing a lot of key lookups on columns I would not have guessed it needed, so I added some INCLUDEs to cover those, and then found that it wanted an additional seek predicate: my Reverse IS NULL. I thought adding it to the filter would take care of it, but it did not. Adding it as an index key worked. Logical reads dropped from 138,919 to 410. Without the filter, the reads remain high at around 134K.

So the original question is still valid. The filtered index now has obvious benefits. What do I do about all of the dire warnings about failed inserts?
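
(A hedged reconstruction of what the final index and the troubleshooting hint might look like; Reverse as a second key column follows the description above, and the INCLUDE list and sample dates remain hypothetical.)

-- Final shape of the filtered index: Reverse promoted to a key column,
-- with INCLUDE columns (hypothetical names) covering the former key lookups.
CREATE NONCLUSTERED INDEX ix_MyIndex
ON Table2 (OrderDateTime, Reverse)
INCLUDE (IncludeCol1, IncludeCol2)
WHERE [Type] = 'C'
  AND Reverse IS NULL
WITH (DROP_EXISTING = ON);

-- Forcing the index while troubleshooting, with logical reads reported:
SET STATISTICS IO ON;

DECLARE @StartDate datetime = '2014-09-01', @EndDate datetime = '2014-09-21'; -- sample values

SELECT Field1, Field2, Field3
FROM Table1 AS T1
INNER JOIN Table2 AS T2 WITH (INDEX (ix_MyIndex))
    ON  T1.dbID      = T2.dbID
    AND T1.UnitNumID = T2.NumberID
    AND T1.Account   = T2.Account
    AND T2.[Type] = 'C'
    AND T2.Reverse IS NULL
WHERE T2.OrderDateTime BETWEEN DATEADD(dd, DATEDIFF(dd, 0, @StartDate), 0)
                           AND DATEADD(dd, DATEDIFF(dd, 0, @EndDate), 0);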
 
LVL 69

Accepted Solution

by:Scott Pletcher
Scott Pletcher earned 200 total points
ID: 40336858
I'm not sure who told you that filtered indexes were "bad", but they are deeply mistaken.  FIs are one of the great features of SQL Server.  They are particularly useful when a row has a certain status before being processed, and that value is used as a flag to trigger processing, and then the status is reset after processing.

You might have to use the dependencies table to track down any references to the underlying table and make sure the SET options are OK.  For newer versions of SQL Server, those would mostly be the default options anyway.  But for stored procs, you'd want to verify the settings.  If they're not right, you could recreate the proc with appropriate settings specified at proc creation and/or first thing within the proc itself, depending on the specific setting.
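
(A hedged aside: for stored procedures, ANSI_NULLS and QUOTED_IDENTIFIER are captured at creation time and can be checked in sys.sql_modules; this query is a sketch of that check.)

-- Procs showing 0 in either column would need to be recreated with the option ON.
SELECT
    OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
    OBJECT_NAME(m.object_id)        AS proc_name,
    m.uses_ansi_nulls,
    m.uses_quoted_identifier
FROM sys.sql_modules AS m
JOIN sys.objects AS o ON o.object_id = m.object_id
WHERE o.[type] = 'P';  -- stored procedures only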
 

Author Comment

by:phil916
ID: 40337140
ScottPletcher - They were described as great tools for query performance tuning, but, oh by the way, factors which in my case are beyond my control can make inserts fail. So to me, without a way to monitor for this, they became "bad".

Every proc has SET NOCOUNT ON, SET ANSI_NULLS ON, and SET QUOTED_IDENTIFIER ON explicitly. But those are just the queries I use. There are others that do the inserts, updates, and deletes, and I do not know what they have. Are my settings enough? I was led to believe that the problem occurs when another connection with incompatible settings does the insert, and that only that connection would receive the error.
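
(Hedged aside: from inside any given connection, the current SET options can be listed with DBCC USEROPTIONS, which is a quick way to spot-check what an application's session is actually running with.)

-- Run from the connection you want to inspect; returns one row per option
-- that is currently set (e.g. ansi_nulls, quoted_identifier, arithabort).
DBCC USEROPTIONS;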
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 40337294
To be safe, you could add the required SET options at the start of the proc:

CREATE PROCEDURE ...
AS
SET NOCOUNT ON;
SET ANSI_PADDING ON;
SET ANSI_WARNINGS ON;
SET ARITHABORT ON;
SET CONCAT_NULL_YIELDS_NULL ON;
SET NUMERIC_ROUNDABORT OFF;

...rest of proc...
GO

It would be relatively easy to dynamically add that to the start of each proc, if it didn't already have it.

Those are fairly typical default settings anyway.  I use filtered indexes wherever appropriate, and I didn't go through the procs first and haven't had any errors.
 

Author Comment

by:phil916
ID: 40337442
I checked for discrepancies on the table, and the stored procedures all have:

SET ANSI_NULLS ON and SET QUOTED_IDENTIFIER ON

So is this "safe"?
 

Author Comment

by:phil916
ID: 40337480
Sorry, I did not refresh before I posted. I can't add the others, just the two mentioned in my last post.
 

Author Closing Comment

by:phil916
ID: 40338080
Thanks, gentlemen.
