Solved

SQL Index

Posted on 2011-09-12
Last Modified: 2012-06-22
I have the following query, which runs against a table with 300,000,000 records. It takes about 2 hours to run, which is much too long. The DTH_CallRecordMaster table already has a non-unique, non-clustered index on CustomerID, and the execution plan shows that the index seek on the CustomerID index accounts for 95% of the work.

What are my options to improve performance of this query?

Thanks


      
 
select CustomerID, count(CustomerID), sum(CallCost) from DTH_CallRecordMaster (nolock)
		where CallType in (5, 7, 10, 11, 12, 13, 22, 23) and cast(StartTime as date) between '2011-08-01' and '2011-08-31' 
		  and CustomerID in (select CustomerID from DTH_CustomerMaster (nolock) where CycleID = 'MONTHLY-15')
			group by CustomerID

Question by:dthansen
9 Comments
 
LVL 18

Expert Comment

by:x-men
ID: 36524814
Have you run the Database Engine Tuning Advisor?
 
LVL 23

Expert Comment

by:Racim BOUDJAKDJI
ID: 36524855
Please post the showplan.
 
LVL 12

Assisted Solution

by:jagssidurala
jagssidurala earned 100 total points
ID: 36524929
You can rewrite the query as follows:

select  A.CustomerID,
        count(A.CustomerID),
        sum(A.CallCost)
from    DTH_CallRecordMaster as A with (nolock)
        inner join DTH_CustomerMaster as B on B.CustomerID = A.CustomerID
where   A.CallType in (5, 7, 10, 11, 12, 13, 22, 23)
        and cast(A.StartTime as date) between '2011-08-01' and '2011-08-31'
        and B.CycleID = 'MONTHLY-15'
group by A.CustomerID

Also create an index on CallType, StartTime, CustomerID, CallCost for table A, and one on CustomerID, CycleID for table B.

Author Comment

by:dthansen
ID: 36525586
The lookup against DTH_CustomerMaster for CycleID is 0% of the execution plan. I'm not worried about that.

Should I create a new index with those columns or simply add those columns to the existing index that is on CustomerID?

Thanks.
 
LVL 75

Accepted Solution

by:Anthony Perkins
Anthony Perkins earned 300 total points
ID: 36526681
Make sure you have the following indexes, and try the query as written below:
DTH_CallRecordMaster: CustomerID, CallType, StartTime
DTH_CustomerMaster: CustomerID, CycleID

If that does not make any difference, buy a faster server.
SELECT  r.CustomerID,
        COUNT(*),
        SUM(r.CallCost)
FROM    DTH_CallRecordMaster r (NOLOCK)
        INNER JOIN DTH_CustomerMaster c (NOLOCK) ON r.CustomerID = c.CustomerID
WHERE   r.CallType IN (5, 7, 10, 11, 12, 13, 22, 23)
        AND r.StartTime >= '2011-08-01'
        AND r.StartTime < '2011-08-31'
        AND c.CycleID = 'MONTHLY-15'
GROUP BY r.CustomerID
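
For reference, the two indexes listed above could be created along the following lines. This is only a sketch: the dbo schema and both index names are assumptions (neither appears in the thread), and building an index on a 300,000,000-row table is a long-running operation that is worth trying on a copy first.

```sql
-- Hypothetical index names; the dbo schema is assumed.
-- Key order follows the list above: CustomerID, CallType, StartTime.
CREATE NONCLUSTERED INDEX IX_CallRecordMaster_Customer_CallType_StartTime
    ON dbo.DTH_CallRecordMaster (CustomerID, CallType, StartTime);

-- Small lookup table: CustomerID, CycleID.
CREATE NONCLUSTERED INDEX IX_CustomerMaster_Customer_Cycle
    ON dbo.DTH_CustomerMaster (CustomerID, CycleID);
```

Because the query also sums CallCost, the first index does not cover it as written; whether to add CallCost as an included column is taken up in the follow-up comments.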

 

Author Comment

by:dthansen
ID: 36527077
1. The CustomerMaster has only 40 rows while the CallRecordMaster has 300,000,000. Isn't joining those tables more expensive than a simple IN against CustomerMaster?

2. Should I add CallType and StartTime to the existing index on CustomerID, or create a new index?

3. Why not have 'CallCost' in the index as an include so it is a covering index?

4. Why did you choose to replace the 'between' with >= and <? Is that more efficient?

Thanks.
 
LVL 75

Assisted Solution

by:Anthony Perkins
Anthony Perkins earned 300 total points
ID: 36527163
1. What does the Execution Plan tell you in each case?  Why don't you test it?  Use the one that best fits your requirements.
2. I would define them as Keys in a single index.  But test it out.  Inspect the Execution Plan.
3. Yes.
4. I suspect cast(A.StartTime as date) between '2011-08-01' and '2011-08-31' is not a SARGable predicate, so I replaced it with something that can take advantage of an index on StartTime. If you want to keep the cast, then don't bother adding StartTime to the index: the optimizer may not use it, hence the lousy performance. Again, test and see for yourself.
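
One way to run the test suggested in point 4 without waiting two hours is to request estimated plans only. The sketch below assumes SSMS or sqlcmd (GO is a client-side batch separator, not T-SQL) and reuses the table and column names from the question. While SET SHOWPLAN_XML is ON, statements are compiled but not executed. SQL Server can in some cases still seek on a datetime column wrapped in CAST(... AS date), so the plan itself, rather than the rule of thumb, is the thing to check.

```sql
SET SHOWPLAN_XML ON;   -- must be the only statement in its batch
GO

-- Form A: StartTime is wrapped in CAST, as in the original question.
SELECT  r.CustomerID, COUNT(*), SUM(r.CallCost)
FROM    DTH_CallRecordMaster r
        INNER JOIN DTH_CustomerMaster c ON r.CustomerID = c.CustomerID
WHERE   r.CallType IN (5, 7, 10, 11, 12, 13, 22, 23)
        AND CAST(r.StartTime AS date) BETWEEN '2011-08-01' AND '2011-08-31'
        AND c.CycleID = 'MONTHLY-15'
GROUP BY r.CustomerID;
GO

-- Form B: the bare column is compared against a half-open range.
SELECT  r.CustomerID, COUNT(*), SUM(r.CallCost)
FROM    DTH_CallRecordMaster r
        INNER JOIN DTH_CustomerMaster c ON r.CustomerID = c.CustomerID
WHERE   r.CallType IN (5, 7, 10, 11, 12, 13, 22, 23)
        AND r.StartTime >= '2011-08-01'
        AND r.StartTime <  '2011-09-01'
        AND c.CycleID = 'MONTHLY-15'
GROUP BY r.CustomerID;
GO

SET SHOWPLAN_XML OFF;
GO
```

In each returned plan, look at whether DTH_CallRecordMaster is reached by a seek or a scan, and whether StartTime appears among the seek predicates or only as a residual predicate.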
 
LVL 15

Assisted Solution

by:Anuj
Anuj earned 100 total points
ID: 36527258
Make sure that you have proper indexes with these keys:
CustomerID, CallType, StartTime Include(CallCost) on DTH_CallRecordMaster
CustomerID, CycleID on DTH_CustomerMaster

Also, check that your indexes are properly defragmented and that the statistics are up to date.

1. I assume a join between a small and a large table will use a Hash Join, so both the IN and the inner join forms will use a Hash Join.
2. CustomerID, CycleID on DTH_CustomerMaster.
3. Including CallCost makes it a covering index and removes the lookup.
4. SQL Server usually converts BETWEEN to >= and <=, so this will not make any difference. As @Acperkins suggested, these predicates may be non-SARGable, so check that the data type of the StartTime column matches the data type of the value it is compared with.
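
The fragmentation and statistics check mentioned above could look roughly like this. It is a sketch that assumes the table lives in the dbo schema and that you are connected to the database that holds it; the thresholds at which to rebuild or reorganize are left to your own judgment.

```sql
-- Fragmentation per index on the call table. LIMITED mode avoids reading
-- the leaf pages, which keeps the check cheap on a very large table.
SELECT  i.name AS IndexName,
        ps.index_type_desc,
        ps.avg_fragmentation_in_percent,
        ps.page_count
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'dbo.DTH_CallRecordMaster'),
            NULL, NULL, 'LIMITED') AS ps
        INNER JOIN sys.indexes AS i
            ON  i.object_id = ps.object_id
            AND i.index_id  = ps.index_id;

-- When each statistics object on the table was last updated.
SELECT  s.name AS StatisticsName,
        STATS_DATE(s.object_id, s.stats_id) AS LastUpdated
FROM    sys.stats AS s
WHERE   s.object_id = OBJECT_ID(N'dbo.DTH_CallRecordMaster');

-- Refresh all statistics on the table using the default sampling rate.
UPDATE STATISTICS dbo.DTH_CallRecordMaster;
```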
 
LVL 75

Assisted Solution

by:Anthony Perkins
Anthony Perkins earned 300 total points
ID: 36528785
>>SQL Server usually converts BETWEEN to >= and <=, so this will not make any difference.<<
Which reminds me that the condition I posted is not correct:
Instead of:
cast(StartTime as date) between '2011-08-01' and '2011-08-31'
Use:
AND r.StartTime >= '2011-08-01'
AND r.StartTime < '2011-09-01'

That is assuming that the following is not SARGable:
cast(StartTime as date) between '2011-08-01' and '2011-08-31'
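
The difference between the three date conditions in this thread can be seen on a handful of rows. The sketch below uses a throwaway table variable (@Calls is a made-up name, not part of the original schema) and unseparated YYYYMMDD literals, which parse the same way regardless of the session's DATEFORMAT setting.

```sql
-- Five sample timestamps around the edges of August 2011.
DECLARE @Calls TABLE (StartTime datetime NOT NULL);

INSERT INTO @Calls (StartTime) VALUES
    ('2011-07-31T23:59:59'),   -- just before August
    ('2011-08-01T00:00:00'),   -- first instant of August
    ('2011-08-31T00:00:00'),   -- midnight at the start of August 31
    ('2011-08-31T14:30:00'),   -- afternoon of August 31
    ('2011-09-01T00:00:00');   -- first instant of September

SELECT
    -- Original condition from the question: whole days, both ends inclusive.
    SUM(CASE WHEN CAST(StartTime AS date) BETWEEN '20110801' AND '20110831'
             THEN 1 ELSE 0 END) AS CastBetween,
    -- First rewrite: the upper bound cuts off all of August 31.
    SUM(CASE WHEN StartTime >= '20110801' AND StartTime < '20110831'
             THEN 1 ELSE 0 END) AS UpperBoundAug31,
    -- Corrected rewrite: half-open range up to, but excluding, September 1.
    SUM(CASE WHEN StartTime >= '20110801' AND StartTime < '20110901'
             THEN 1 ELSE 0 END) AS HalfOpen;
-- Result: CastBetween = 3, UpperBoundAug31 = 1, HalfOpen = 3
```

The half-open form matches the original condition row for row while leaving StartTime bare, which is why the corrected upper bound is '2011-09-01'.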
