?
Solved

Speed up query on Case Statement

Posted on 2006-05-31
18
Medium Priority
?
466 Views
Last Modified: 2008-02-01
I have a query that is taking 8+ minutes.  I have to pass it through an access DB, so it has to be a view and not a SP.  The table has almost 2 million records.  I have an index on each of the fields I am quering.  Any ideas to speed this up?

SELECT distinct a.caseno, a.ca_status, ca_status_date

case
      when a.ca_status in ('14 ','19 ') then  'PRE-APP'
      when a.ca_status in ('20 ','21' ,'23', '25') then 'APPLICATION'
              when a.ca_status in  ('30 ','31' ,'33', '35') then 'RECON'
       when a.ca_status in ('40 ','41' ,'42', '43', '44', '45', '46', '47', '49', '62') then 'HEARING'
      when a.ca_status in ('50 ','51' ,'52', '56', '53') then 'AC'
      when a.ca_status in ('60 ','61' ,'64' ) then 'FC'
      when a.ca_status in ('24 ','34' ,'48', '54', '63', '66', '67', '68', '16') then 'AWARDED'
      when a.ca_status in ('AA ','AR' ,'AH', 'AC', 'AF') then 'ALT REP ASSIST'
      when a.ca_status in ('12 ','15' ,'36', '37') then 'CDR'
      when a.ca_status= '03' then 'TRACKING' END
as Phase
      
                     
FROM         audit_case a

where
a.ca_close_date is null
and
a.ca_status in ('14 ','19 ', '20 ','21' ,'23', '25', '30 ','31' ,'33', '35','40 ','41' ,'42', '43', '44', '45', '46', '47', '49',
'62','50 ','51' ,'52', '56', '53','60 ','61' ,'64' ,'24 ','34' ,'48', '54', '63', '66', '67', '68', '16','AA ','AR' ,'AH', 'AC', 'AF','12 ','15' ,'36', '37','03' )

0
Comment
Question by:137021
  • 7
  • 3
  • 3
  • +2
16 Comments
 
LVL 13

Expert Comment

by:MikeWalsh
ID: 16800820
Have a look at the query plan and see what is happening.

Is it slow when you run it locally in Query Analyzer, orj ust slow when passing through Access?
0
 
LVL 4

Expert Comment

by:ch70357
ID: 16800821
It seems as if it would be easier to Create a definition table containing the Status Number, then a text field that tells what it is (Hearing, etc) and join based on StatusNumber.
0
 
LVL 13

Expert Comment

by:MikeWalsh
ID: 16800827
How many rows is it returning? How many distinct status codes exist?
0
Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

 
LVL 13

Expert Comment

by:MikeWalsh
ID: 16800896
And yes the reference table may be a good idea, though I am not sure if a CASE statement is so inefficient as to be the only slowdown cause here, especially if your status code is indexed.

What kind of index is on status? Is it an index just on status or does it have other columns also?
0
 
LVL 4

Expert Comment

by:ch70357
ID: 16800909
I believe the slowdown is cause by the IN statement, it is very inefficient.
0
 
LVL 13

Expert Comment

by:MikeWalsh
ID: 16800920
Yes that definitely can be part of the problem. It also could be that it sounds like it is being run through access. That doesn't help a SQL query run any faster either :)
0
 
LVL 13

Expert Comment

by:MikeWalsh
ID: 16800998
Are there a lot more status codes than you are selecting? if there are less you are not selecting you could do a NOT IN..

you could create a reference table

or you could rewrite your query to not worry about selecting the status. have an else in your CASE to set anything not found equal to null, then eliminate the nulls in the where clause.
0
 

Author Comment

by:137021
ID: 16801950
Sorry, I should have been clearer.  This is running through SQL query analyzer first that is how it is coming up with 8 + minutes(8 + minutes wouldn't even fly in Access).  Yes, there are a lot more status codes than the ones that are being selected, so 'not in' isn't an option.  Indexes are on a.caseno, a.ca_status, ca_status_date and just on those fields.  When I run this query without the 'in' clause it only saves me 1 minute.  There is 100 distinct status'.  Also, I was incorrect in the number of records in the Audit_Case table, it is 200 million records not 2 million.  Adding a reference table even took me longer (+1 minute)  The number of records returned are 12,000.

Any other suggestions?
0
 
LVL 4

Accepted Solution

by:
ch70357 earned 500 total points
ID: 16802080
In the temp table, did you make the Status Number the Primary Key, if not this will signifigantly help the performance. Also, maybe you could query only the recods with the status codes that you need into another temp table, making the final query faster.
0
 
LVL 13

Assisted Solution

by:MikeWalsh
MikeWalsh earned 500 total points
ID: 16802189
Look at the query plans. I wonder if a Table Scan is occurring because of your statistics saying it is not worth index seek for your where clause..

Run the query with just the where clause and not the case statement, just get the status code. Does that change the time significantly? My guess is not.

Have you updated your statistics recently? (SP_UPDATESTATS).. They may be out of date. Have you checked for Index fragmentation?
0
 
LVL 70

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 500 total points
ID: 16802995
I, too, think a matching / lookup table would help a lot.

Also, do you *really* need the DISTINCT?  Is caseno not unique already?


Maybe something like this:


CREATE TABLE #statuses (
    ca_status CHAR(3),
    description VARCHAR(20),
    UNIQUE CLUSTERED (ca_status)  --<<-- *important*
    )
INSERT INTO #statuses
SELECT '14 ' AS ca_status, 'PRE-APP' AS description
UNION ALL
SELECT '19 ', 'PRE-APP'
UNION ALL
SELECT '20 ', 'APPLICATION'
--...other values...
UNION ALL
SELECT '37 ', 'CDR'
UNION ALL
SELECT '03 ', 'TRACKING'
ORDER BY 1  --<<-- *important* -- need to load table in order



SELECT a.caseno, a.ca_status, a.ca_status_date,
    s.description as Phase    
                     
FROM       audit_case a

INNER JOIN #statuses s ON s.ca_stauts = a.ca_status

where
a.ca_close_date is null
0
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 16803080
Here's complete code to load the statuses (I think -- naturally verify that all statuses are loaded and that all descs are accurate):


IF OBJECT_ID('tempdb.dbo.#statuses') IS NOT NULL
    DROP TABLE #statuses
CREATE TABLE #statuses (
    ca_status CHAR(3),
    description VARCHAR(20)
    , UNIQUE CLUSTERED (ca_status)
    )
INSERT INTO #statuses
SELECT '14 ' AS ca_status, 'PRE-APP' AS description
UNION ALL
SELECT '19 ', 'PRE-APP'
UNION ALL
SELECT '20 ', 'APPLICATION'
UNION ALL
SELECT '21 ', 'APPLICATION'
UNION ALL
SELECT '23 ', 'APPLICATION'
UNION ALL
SELECT '25 ', 'APPLICATION'
UNION ALL
SELECT '30 ', 'RECON'
UNION ALL
SELECT '31 ', 'RECON'
UNION ALL
SELECT '33 ', 'RECON'
UNION ALL
SELECT '35 ', 'RECON'
UNION ALL
SELECT '40 ', 'HEARING'
UNION ALL
SELECT '41 ', 'HEARING'
UNION ALL
SELECT '42 ', 'HEARING'
UNION ALL
SELECT '43 ', 'HEARING'
UNION ALL
SELECT '44 ', 'HEARING'
UNION ALL
SELECT '45 ', 'HEARING'
UNION ALL
SELECT '46 ', 'HEARING'
UNION ALL
SELECT '47 ', 'HEARING'
UNION ALL
SELECT '49 ', 'HEARING'
UNION ALL
SELECT '62 ', 'HEARING'
UNION ALL
SELECT '50 ', 'AC'
UNION ALL
SELECT '51 ', 'AC'
UNION ALL
SELECT '52 ', 'AC'
UNION ALL
SELECT '56 ', 'AC'
UNION ALL
SELECT '53 ', 'AC'
UNION ALL
SELECT '60 ', 'FC'
UNION ALL
SELECT '61 ', 'FC'
UNION ALL
SELECT '64 ', 'FC'
UNION ALL
SELECT '24 ', 'AWARDED'
UNION ALL
SELECT '34 ', 'AWARDED'
UNION ALL
SELECT '48 ', 'AWARDED'
UNION ALL
SELECT '54 ', 'AWARDED'
UNION ALL
SELECT '63 ', 'AWARDED'
UNION ALL
SELECT '66 ', 'AWARDED'
UNION ALL
SELECT '67 ', 'AWARDED'
UNION ALL
SELECT '68 ', 'AWARDED'
UNION ALL
SELECT '16 ', 'AWARDED'
UNION ALL
SELECT 'AA ', 'ALT REP ASSIST'
UNION ALL
SELECT 'AR ', 'ALT REP ASSIST'
UNION ALL
SELECT 'AH ', 'ALT REP ASSIST'
UNION ALL
SELECT 'AC ', 'ALT REP ASSIST'
UNION ALL
SELECT 'AF ', 'ALT REP ASSIST'
UNION ALL
SELECT '12 ', 'CDR'
UNION ALL
SELECT '15 ', 'CDR'
UNION ALL
SELECT '36 ', 'CDR'
UNION ALL
SELECT '37 ', 'CDR'
UNION ALL
SELECT '03 ', 'TRACKING'
ORDER BY 1
0
 
LVL 70

Expert Comment

by:Scott Pletcher
ID: 16803180
Some limited testing seems to indicate that getting rid of the DISTINCT could be a big payback in performance.
0
 
LVL 13

Expert Comment

by:MikeWalsh
ID: 16803271
I am also curious about the I/O speed and performance. You are scanning (or hopefully seeking) through a large set of records and I/O will be a bottleneck.. I am with scott on the distinct, I repeat the thought of a reference/lookup table and I continue that you should also look at your fragmentation and statistics.
0
 
LVL 10

Assisted Solution

by:StephenCairns
StephenCairns earned 500 total points
ID: 16804535
>Adding a reference table even took me longer (+1 minute)
did you remove your in clause at the same time?
if your look up table contains only the ca_status you want to see your inner join will mean you dont need your in clause.
Oh and make sure your lookup table is indexed too

200 million records can take a long time  - what is the user going to do with this 200 million records?
read it ? (Damn, where was i up to. Just have to reread it again)
print it? (could you order me another truck up paper - I make that 8000 reams of A4 paper at 12 point printing)
have you thought about breaking it down into usable chunks?
0
 
LVL 10

Expert Comment

by:StephenCairns
ID: 16804608
What I'm trying to say there is sometimes as a programmer you are asked to do things that dont make sense, sometimes you have to push back and ask if they have really thought this through.

Biggest print job I ever did was by accident I pressed print without specifying a print range on a mainframe spreadsheet, the printout came back in 2, 5 ream boxes
if you  hit print by accident here it would come back in 400 and thats printing 2 pages to a sheet, double sided assuming each record fitted on a single line.

possibly the break down the results by date?
or date and status.
if you really want 200,000,000 records i would suggest an  access DB application realy is not your tool of choice.

sorry if I sound like I'm ranting here.
0

Featured Post

New feature and membership benefit!

New feature! Upgrade and increase expert visibility of your issues with Priority Questions.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A Stored Procedure in Microsoft SQL Server is a powerful feature that it can be used to execute the Data Manipulation Language (DML) or Data Definition Language (DDL). Depending on business requirements, a single Stored Procedure can return differe…
One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…
Suggested Courses

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question