Solved

Which indexes do i need to create on an sql table in order for a certain query to run fast in al cases?

Posted on 2014-07-26
12
100 Views
Last Modified: 2014-07-31
I have the following query:

SELECT TOP 250 doc_Main.Id,doc_Main.Gid,doc_Main.Name,doc_Main.TimeEnter,doc_Main.Prio,doc_Main.Direction,
doc_Main.Status,doc_Main.CreaterId,doc_Main.Subject,doc_Main.BodyType,doc_Main.Attach,doc_Main.FromAddr,doc_Main.ToAddr
,doc_Main.Gid_LineSet,doc_Main.XML

 
 FROM doc_Main
 

 WHERE
(doc_Main.Id_RoleSet IN (1,2,....) AND
((CONTAINS((doc_Main.Subject, doc_Main.Body), '"bulk"'))

 ORDER BY doc_Main.TimeEnter DESC


doc_main table is a big one with over 6 million records.

I have tried different combinations of clustered indexes and nonclustered indexes after reading on the internet but still i can't find the right combination.

The problem is that if i define some indexes or forced to use some it will work fine if the word searched in the full text catalog is very often present in the table but not when it has few appearances. If i change those indexes or define others it starts to work for few appearances but not for many.

Please help me to find the right approach to find out which indexes and statistics have to be created on this table to work fast in all cases.
0
Comment
Question by:Nordicit
12 Comments
 
LVL 18

Expert Comment

by:Peter Hutchison
Comment Utility
A database can only only have one clustered index (which contains data) and upto 249 non-clustered indexes (contains pointers to data). Having any index will always speed up queries as it doesn't have to trawl through the entire table.
The best way to optimize indexes is to use the DAtabase Engine Tuning Advisor to find out what indexes you need.
0
 

Author Comment

by:Nordicit
Comment Utility
i have tried different combination of clustered and  non clustered indexes. Also have tried to use Database Engine Tuning Advisor. It doesn't show any improvements that can be made.

I will try to explain in more details:

If are many matches done by the full text catalog than the optimal execution would be to scan through the clustered index which has the first column Time enter and match the result of full text catalaog.

If are few matches done by the full text catalog than the optimal execution would be to seek through a non clustered index which has the first column Id(second Id_RoleSet and third TimeEnter desc) and match the results of full text catalog.

If I force the queries using with(index()) than the queries perform optimal. But for me is impossible to know in general if a word is often present in the table.

Another approach that i have tried is to use parameter sniffing. So when the query is executed the sql would include the search keyword in the execution plan  decision algorithm.

Thanks for your reply and hope that the problem is more clear now
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
How many items do you expect in the IN clause?
0
 

Author Comment

by:Nordicit
Comment Utility
Maximum 6300 but in average 3000. That part is quite optimal in all cases.

In the execution plan is always high the cost of the full text catalog search in conjunction with one of the indexes.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
Maximum 6300 but in average 3000. That part is quite optimal in all cases.
I would look into saving that many items into a table, so that you can JOIN against it instead of using IN.  This way you can build a non-dynamic query (preferably in a Stored Procedure) that will always give you consistent and above par results as it can re-use the same plan over and over again.
0
 

Author Comment

by:Nordicit
Comment Utility
We are doing that already i just didn't include that part so it looks more clear the problematic part.

I think I found a solution. I will post after additional testing if it proves to be the correct one.

Thanks again for your help.
0
Zoho SalesIQ

Hassle-free live chat software re-imagined for business growth. 2 users, always free.

 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
For this specific query, you'd want the clustered index to be on:
Id_RoleSet

To speed up the text searching, you could also consider "pre-searching" common phrases.   I suggest using bit masks in an int, or smallint, column for that.  

And/or you could cache previous search results using a separate table, in case the same text searches are done multiple times.
0
 

Author Comment

by:Nordicit
Comment Utility
This is the query that works in most cases optimal now

create table #docs (Id  int)

insert into #docs
select doc_main.Id
from doc_Main
where Id_RoleSet in (1,2,..........)
      and
      (
      CONTAINS((doc_Main.Subject, doc_Main.Body), '"regards"')  
      )

create nonclustered index temp_docs on #docs(Id)

SELECT TOP 250 doc_Main.Id,doc_Main.Gid,doc_Main.Name,doc_Main.TimeEnter,doc_Main.Prio,doc_Main.Direction,doc_Main.Status,doc_Main.CreaterId,doc_Main.Subject, doc_Main.Body,
doc_Main.BodyType,doc_Main.Attach,doc_Main.FromAddr,doc_Main.ToAddr,doc_Main.Gid_LineSet,doc_Main.XML, COALESCE(doc_Attrib.IsRead,0) as DocRead ,
(SELECT Ref FROM doc_Ref WHERE doc_Ref.Id_Doc=doc_Main.Id And doc_Ref.RefType= 2) as ExtraField2  

FROM doc_Main
      LEFT JOIN doc_Attrib ON doc_Main.Id = doc_Attrib.Id_Doc AND doc_Attrib.Gid_User = '{c6358127-8944-48dc-a8df-7a6da3826cee}'       
WHERE
       doc_Main.Id in (select Id
      from #docs
      
       )
ORDER BY doc_Main.TimeEnter DESC
--option(recompile)

drop table #docs


It looks extremely ugly.

I have tried with Id_RoleSet as clustered index but it does not help a lot.

Another limitation set for all the queries in the system is that there is a timeout of 30 seconds. Which i would like to keep. So if it would take more than 30 seconds anyway there is no way to cache because the query is canceled. If is under 30 than no need.

Thanks for your advice.
0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 500 total points
Comment Utility
I would make the following changes:
CREATE TABLE #docs (Id int NOT NULL)

INSERT  INTO #docs (Id)
        SELECT  DISTINCT 
		m.Id
        FROM    doc_Main
        WHERE   Id_RoleSet IN (1, 2, ...)
                AND CONTAINS ((m.[Subject], m.Body), '"regards"')

CREATE UNIQUE CLUSTERED INDEX temp_docs ON #docs(Id)

SELECT TOP 250
        m.Id,
        m.Gid,
        m.Name,
        m.TimeEnter,
        m.Prio,
        m.Direction,
        m.[Status],
        m.CreaterId,
        m.Subject,
        m.Body,
        m.BodyType,
        m.[Attach],
        m.FromAddr,
        m.ToAddr,
        m.Gid_LineSet,
        m.[XML],
        COALESCE(a.IsRead, 0) AS DocRead,
        Ref AS ExtraField2
FROM    doc_Main m
	INNER JOIN #docs d ON m.Id = d.Id
        LEFT JOIN doc_Attrib a ON m.Id = a.Id_Doc
                                AND a.Gid_User = '{c6358127-8944-48dc-a8df-7a6da3826cee}'
	-- If possible, change the following to an INNER JOIN  
	-- and put the r.RefType = 2 the condition in the WHERE clause.
        LEFT JOIN doc_Ref r ON m.Id = r.Id_Doc AND r.RefType = 2
ORDER BY m.TimeEnter DESC 

DROP TABLE #docs

Open in new window

0
 

Author Comment

by:Nordicit
Comment Utility
It works really nice. Thank you. I will do some additional tests tomorrow.
CREATE UNIQUE CLUSTERED INDEX temp_docs ON #docs(Id)
I have replaced it with
CREATE UNIQUE NONCLUSTERED INDEX temp_docs ON #docs(Id)

it looks that the clustered index on temp table is much more expensive and is not needed in this cases as far as i can see.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
it looks that the clustered index on temp table is much more expensive and is not needed in this cases as far as i can see.
It is fundamental to achieve best performance.  But you are the best judge of that.
0
 

Author Closing Comment

by:Nordicit
Comment Utility
Thanks for your help. It works well in most of the cases.
0

Featured Post

Get up to 2TB FREE CLOUD per backup license!

An exclusive Black Friday offer just for Expert Exchange audience! Buy any of our top-rated backup solutions & get up to 2TB free cloud per system! Perform local & cloud backup in the same step, and restore instantly—anytime, anywhere. Grab this deal now before it disappears!

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
Server 2008 Cluster Fail-over Errors 5 39
c# code 19 56
Recurring Excel Timelime for Veeam 2 28
SQL Server 2012 Row Selection 2 28
Everyone has problem when going to load data into Data warehouse (EDW). They all need to confirm that data quality is good but they don't no how to proceed. Microsoft has provided new task within SSIS 2008 called "Data Profiler Task". It solve th…
JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

7 Experts available now in Live!

Get 1:1 Help Now