Link to home
Start Free TrialLog in
Avatar of knightEknight
knightEknightFlag for United States of America

asked on

indexed views not being used by optimizer

Please run the code below in a scratch database.  On the last query, check the execution plan.  

Why does it do a full table scan instead of doing an index seek on the covering index of the view?

The goal is to do a case-insensitive search on the LastName column, which in my real base table is collation Latin1_General_BIN (the server default here), by way of an index seek, not a scan.  I thought I could do this by creating a view of the table and setting the collation on that field to SQL_Latin1_General_CP1_CI_AS in the view, and index this version of the column on the view.

Thanks
create table dbo.tTest(id int identity(1,1), LastName varchar(80) COLLATE Latin1_General_BIN, FirstName varchar(80) COLLATE Latin1_General_BIN) 
GO

create unique clustered index tx0 on dbo.tTest(id)
GO

insert into dbo.tTest select 'Doe','John'
insert into dbo.tTest select 'Doe','Jane'
insert into dbo.tTest select 'Bach','Hellen'
insert into dbo.tTest select 'Smith','Jane'
insert into dbo.tTest select 'Smith','Mike'
insert into dbo.tTest select 'Jones','Mike'
GO

create view dbo.vTest  WITH SCHEMABINDING  as
select id, (LastName COLLATE SQL_Latin1_General_CP1_CI_AS) as LastName, (FirstName COLLATE SQL_Latin1_General_CP1_CI_AS) as FirstName
from dbo.tTest
GO

create unique clustered index vx0 on dbo.vTest(id) 
GO

create unique index vx1 on dbo.vTest(LastName,FirstName,id)
GO

select id, FirstName, LastName
from dbo.vTest -- with (index=vx1)
where LastName  = 'BACH'  
GO

--  drop view dbo.vTest
--  truncate table dbo.tTest
--  drop table dbo.tTest
GO

-- END OF SCRIPT --

--------------------------------------------------------------------------------------------
/** FYI - I can achieve the overall goal by creating calculated fields on the base table instead of a view (see below), but I would rather get this to work in a view (above) **/
--------------------------------------------------------------------------------------------


create table dbo.tTest(
                        id int identity(1,1), 
                        LastName  varchar(80) COLLATE Latin1_General_BIN, 
                        FirstName varchar(80) COLLATE Latin1_General_BIN,
                        cLastName  as (LastName  COLLATE SQL_Latin1_General_CP1_CI_AS),
                        cFirstName as (FirstName COLLATE SQL_Latin1_General_CP1_CI_AS)
                      ) 
GO

create unique clustered index tx0 on dbo.tTest(id)
GO

insert into dbo.tTest select 'Doe','John'
insert into dbo.tTest select 'Doe','Jane'
insert into dbo.tTest select 'Bach','Hellen'
insert into dbo.tTest select 'Smith','Jane'
insert into dbo.tTest select 'Smith','Mike'
insert into dbo.tTest select 'Jones','Mike'
GO

create unique index tx1 on dbo.tTest(cLastName,cFirstName,id)
GO

select id, cFirstName, cLastName
from dbo.tTest
where cLastName  = 'BACH'

--  drop view dbo.vTest
--  truncate table dbo.tTest
--  drop table dbo.tTest
GO

Open in new window

Avatar of Dave Baldwin
Dave Baldwin
Flag of United States of America image

I'm not sure and I can't run it right now but I would think that creating the view and changing the collation AND creating the compound index both Require the table to be completely scanned when they are created.  And your WHERE using on the LastName doesn't correspond to any index you have created.  You don't have an index on only LastName so it has to scan the table to find only that value.
Avatar of knightEknight

ASKER

The index vx1  is on LastName, FirstName, id, which is sufficient for the WHERE clause containing only LastName.  Further, an index on only LastName would require the engine to do the additional step of going back to the base table for the other selected columns - that is why it is better (in cases like this) to include all selected columns in the index.  Also, see the second part of the script above, which makes use of calculated fields on the base table.  The fact that this works as desired is evidence that in principle this can work.  I just don't understand why the optimizer won't make use of the view index -- even with a query hint telling it to do so!  Any help here?
ASKER CERTIFIED SOLUTION
Avatar of lludden
lludden
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks, good article:  http://msdn.microsoft.com/en-us/library/dd171921%28v=sql.100%29.aspx

In this case, the NOEXPAND hint did the trick:

select id, FirstName, LastName
from dbo.vTest with (NOEXPAND)
where LastName  = 'BACH'  
Very sorry, I accepted the wrong solution.  I had it already working from lluden's suggestion.  Sorry Dave, I'll have to fix this...
... Dave posted just as I accepted, and I just went to the last post in the thread, thinking it was lluden's.  I've requested to have this thread re-opened so I can award the points appropriately.
That's ok, at least you got it working.
I haven't looked over your articles yet Dave, but I agree with the premise you posted - make the collation case insensitive from the beginning.  However, in this case we did not have that option.  I appreciate your contribution and I will read them over.
Thanks to both of you.
You're welcome.
FYI - another good article on this exact subject:  http://www.eggheadcafe.com/software/aspnet/33165884/indexed-views-optimizer-hints-and-noexpand.aspx

The basic conclusion is that if the optimizer decides it has an execution plan that is "good enough" for this query, it runs with it.  Since in my example there are only six rows (1 page) in the table, one plan is probably as good as the next.  It may be the case that if the table has a few million rows in it, the optimizer will choose to use the index of the View (which in this case would be optimal) rather than any index on the base table.

That makes sense to me.  The extra disk activity for the index may swamp the time to just scan the table with only 6 rows.