Solved

Are Filtered Indexes Really that Bad?

Posted on 2014-09-21
14
143 Views
Last Modified: 2014-09-22
I have a what I believe is a strong case for using a filtered index. Based on a simple select statement there would be a 50% savings in the number of rows queried/stored. DBCC SHOW_STATISTICS makes a much stronger case. The table has 5 steps in the histogram and based on EQ_ROWS the savings might be as much as 80%. The statistics are current.

There are a lot of caveats with the SET options I am reading and since the entire code base of the application is not available I feel like the possibility of failed inserts is risky with practically no knowledge of the clients connectivity options.

I am interested in knowing whether there is an easy or even worthwhile method to determine the connection SET options for a particular connection.
0
Comment
Question by:phil916
  • 7
  • 2
  • 2
  • +2
14 Comments
 
LVL 46

Expert Comment

by:Vitor Montalvão
ID: 40336047
Do you want to share with us your case?
Maybe we can point you to another solution or give you a better opinion on what you are trying to do.
0
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 40336259
Hi!

Creating index either a full index og filtered index is far more better than having a query doing a tablescan.
I personally would recommend a full index on a table. Filtered index would just address a subset of the data and as such consume far less storage while full index would address all data of a table and as such increase performance of a query for all data not just subset of it as filtered index would.

Regards,
     Tomas Helgi
0
 

Author Comment

by:phil916
ID: 40336333
Vitor Montalvão

Isn't the where clause unsargeable?

SELECT
Field1
,Field2
,Field3
FROM Table1 AS T1
INNER JOIN Table2 AS T2
ON T1.dbID = T2.dbID -- PK
AND T1.UnitNumID = T2.NumberID -- PK
AND T1.Account = T2.Account -- PK
AND T2.[Type] = 'C'
AND T2.Reverse IS NULL
WHERE TD2.OrderDateTime BETWEEN DATEADD(dd, DATEDIFF(dd, 0, @StartDate), 0)
AND DATEADD(dd, DATEDIFF(dd, 0, @EndDate), 0)

Open in new window


Proposed filtered index: Type predicate reduces number of records by 50-80%. Reverse predicate < 1%

CREATE NONCLUSTERED INDEX ix_MyIndex
ON Table2 (OrderDateTime )
WHERE [type] = 'C'
AND Reverse IS NULL

Open in new window


Tomas Helgi
Creating index either a full index og filtered index is far more better than having a query doing a tablescan.
Agreed and a CI scan is what it is doing without an index.

I personally would recommend a full index on a table. Filtered index would just address a subset of the data and as such consume far less storage while full index would address all data of a table and as such increase performance of a query for all data not just subset of it as filtered index would.

This seems contradictory but I see your point. I can only defend my position in that I have 23 stored procedures and do not anticipate more. 17 of 23 would benefit from the filtered index in some way. The other 6 will not use either index.
0
 
LVL 46

Assisted Solution

by:Vitor Montalvão
Vitor Montalvão earned 100 total points
ID: 40336346
Good, you are searching by date.
Have you thought about using table partition? I think it's better than filtered indexes, specially if your table has millions of rows.
0
 
LVL 142

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 200 total points
ID: 40336348
I would also try to go the filtered index, unless you could partition the table according to the same conditions, hence resulting "those" queries to go into 1 single partition... (and have the index on only that single partition)
http://technet.microsoft.com/en-us/library/ms187526%28v=sql.105%29.aspx
just an idea ...
0
 

Author Comment

by:phil916
ID: 40336421
OK similar advice about the partition I'll have to read up on it.

So even though I have created the index as specified and the example select is representative of the first from /join. There are more joins. The query optimizer is still doing a  clustered index scan. I added some includes suggested by the output list. For that particular batch the missing index is suggesting an index on an entirely different table. On the scan of Table2 there is a big difference in the estimated vs actual rows. 120 K est 15K act. This could make the optimizer pick the scan over the index right?
0
 
LVL 25

Expert Comment

by:Tomas Helgi Johannsson
ID: 40336451
Hi!

If you have large table then partitioning as well as having good indexes (partitioned as well) would be
increase your query performance if you use the partition columns in your where clauses of your queries
as the database would be able to do partition pruning (skip searching partitions that does not meet the
where clause criteria  )

Regards,
     Tomas Helgi
0
Control application downtime with dependency maps

Visualize the interdependencies between application components better with Applications Manager's automated application discovery and dependency mapping feature. Resolve performance issues faster by quickly isolating problematic components.

 

Author Comment

by:phil916
ID: 40336630
Success!

Who would have thought the best way to troubleshoot why the optimizer does not use an index is to hint it and find out what it thinks are the higher cost operations. It also helps me learn. It was doing a lot of key lookups of columns I would not have guessed it would need. So I added some includes to cover those and then found that it wanted an additional seek predicate.  It wanted my Reverse IS NULL. I thought adding it to the filter would take care of it but it did not. Adding it as an index key work. Logical reads dropped from 138,919 to 410. Without the filter the read remains high at around 134K.

So the original question is still valid. The filtered index has now obvious benefits. What do I do about all of the dire warnings of failed inserts?
0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 200 total points
ID: 40336858
I'm not sure who told you that filtered indexes were "bad", but they are deeply mistaken.  FIs are one of the great features of SQL Server.  They are particularly useful when a row has a certain status before being processed, and that value is used as a flag to trigger processing, then the status is reset after processing.

You might have to use the dependencies table to track down any references to the underlying table and make sure the SET options are OK.  For newer versions of SQL Server, those would mostly be the default options anyway.  But for stored procs, you'd want to verify the settings.  If they're not right, you could recreate the proc with appropriate settings specified at proc creation and/or first thing within the proc itself, depending on the specific setting.
0
 

Author Comment

by:phil916
ID: 40337140
ScottPletcher -  They were described as great tools for query performance tuning but oh by the way factors, which in my case beyond my control, can make inserts fail. So to me, without a way to monitor for this they became "bad".

Every proc has SET NOCOUNT ON, SET ANSI_NULLS ON and SET QUOTED_IDENTIFIERS ON explicitly. But those are just the queries I use. There are others that do the inserts update and deletes and I do not know what they have. Are my settings enough? I was led to believe that the problem occurred when another connection with incompatible settings  did the insert and only that connection would received the error.
0
 
LVL 69

Expert Comment

by:ScottPletcher
ID: 40337294
To be safe, you could add the require SETtings at the start of the proc:

CREATE PROCEDURE ...
AS
SET NOCOUNT ON;
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET ARITHABORT  ON
SET CONCAT_NULL_YIELDS_NULL ON
SET NUMERIC_ROUNDABORT OFF

...rest of proc...
GO

It would be relatively easy to dynamically add that to the start of each proc, if it didn't already have it.

Those are fairly typical default settings anyway.  I use filtered indexes wherever appropriate, and I didn't go thru procs first and haven't had any errors.
0
 

Author Comment

by:phil916
ID: 40337442
I checked the discrepancies on the table and the stored procedures all have:

SET ANSI_NULLS ON and SET QUOTED_IDENTIFIERS ON

So is this "safe"?
0
 

Author Comment

by:phil916
ID: 40337480
Sorry I did not refresh before I posted. I can't add the others just the two mentioned in my last post.
0
 

Author Closing Comment

by:phil916
ID: 40338080
Thanks gentleman
0

Featured Post

Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

I'm trying, I really am. But I've seen so many wrong approaches involving date(time) boundaries I despair about my inability to explain it. I've seen quite a few recently that define a non-leap year as 364 days, or 366 days and the list goes on. …
Composite queries are used to retrieve the results from joining multiple queries after applying any filters. UNION, INTERSECT, MINUS, and UNION ALL are some of the operators used to get certain desired results.​
Familiarize people with the process of utilizing SQL Server functions from within Microsoft Access. Microsoft Access is a very powerful client/server development tool. One of the SQL Server objects that you can interact with from within Microsoft Ac…
Viewers will learn how the fundamental information of how to create a table.

911 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

25 Experts available now in Live!

Get 1:1 Help Now