Solved

-----** Improving Query Performance **-----

Posted on 2004-09-07
24
799 Views
Last Modified: 2007-12-19
Hi,
The following query runs really slow.
I have tried different solutions, like using #temp tables and not using the DISTINCT keyword, but the performance did not improve by much.  What can I do to improve the performance of this query?


Thanks

QUERY:

select GETDATE()
 

        SELECT DISTINCT
                BATCH_ID
        FROM    
                DETAIL
        WHERE
                ENTITY IN ('BL', 'CA', 'CC', 'RC')
        AND
                BATCH_ID NOT IN
                        (
                           SELECT DISTINCT
                                   BATCH_ID
                           FROM
                                   SUMMARY
                        )
          ORDER BY
           SUBSTRING(BATCH_ID,3,14) DESC


select GETDATE()
Question by:danidak
24 Comments
 
LVL 19

Accepted Solution

by:
grant300 earned 220 total points
Well, you did not give us much to go on.  No table or index definitions, no cardinality, and no run times.  As such, we can only speak generically about the type of query you have here.  Let's start with the bad news:
1) you have a NOT IN subquery that probably covers a substantial percentage of the domain of BATCH_ID
2) you are doing a DISTINCT operation
3) you are performing an ORDER BY on a function that is also descending.

I presume you have a bunch of records and not just a few hundred; otherwise you would have some fundamental problem with your database/server tuning.  Just out of curiosity, what is the cardinality of these tables and what indexes do you have?

In general, you are going to be better off rewriting the query as an outer join between DETAIL and SUMMARY and looking for the records where SUMMARY.BATCH_ID IS NULL.  This is almost always better than a NOT IN, which is a known pig.

The second thing I would do is change from the DISTINCT to a GROUP BY.  Although the two operations would seem on the surface to be the same, GROUP BY, for whatever strange reason, seems to perform better than DISTINCT.
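For instance, the DISTINCT over DETAIL can be rewritten as an equivalent GROUP BY; just a sketch against the table names in your post:

```sql
-- Equivalent to: SELECT DISTINCT BATCH_ID FROM DETAIL WHERE ...
SELECT BATCH_ID
  FROM DETAIL
 WHERE ENTITY IN ('BL', 'CA', 'CC', 'RC')
 GROUP BY BATCH_ID
```

Both return one row per distinct BATCH_ID; only the plan the optimizer picks tends to differ.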

Make sure you have well formed unique indexes covering BATCH_ID in both the SUMMARY and DETAIL tables.  The index on the DETAIL table will undoubtedly have more fields in it than just BATCH_ID in order to have a unique key.  The BATCH_ID should be the first field in the key definition.

I am guessing there are some other fields in the query that you removed for clarity.  The best situation is where all of the fields returned by the query are in the index.  When you go out of your way to add an extra field or two to the end of an index just to support a particular query (or queries), that is called a coverage index.  The query optimizer will use only the index and never touch the table itself during execution.

You should become familiar with the SET SHOWPLAN ON command and the results you get when you run a query that way.  There are also SET STATISTICS IO and SET STATISTICS TIME that you can use to further decipher what the database is doing under the covers.
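For example, to get the plan plus the I/O and timing numbers for your query, wrap it like this (a sketch; add SET NOEXEC ON as well if you want just the plan without waiting for the query to run):

```sql
SET SHOWPLAN ON
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

SELECT DISTINCT BATCH_ID
  FROM DETAIL
 WHERE ENTITY IN ('BL', 'CA', 'CC', 'RC')
GO

SET SHOWPLAN OFF
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO
```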

You can also control how the optimizer does things, to some extent, by hinting.  Index hints are given in the FROM clause of a SQL statement; something like
... FROM DETAIL (INDEX BTCH_DTL_IDX), SUMMARY
If you can't quite convince the optimizer to use the index you want it to, this is usually enough to push it over the top.

There is also a SET FORCEPLAN ON (and OFF) command.  When ON, the optimizer defers to the order the tables appear in the FROM clause to decide how to fulfill the query.

Your ORDER BY clause is a problem.  You probably can't do much about the performance of it unless you break the query into two pieces.  The first does the outer join and group by and dumps the results, plus the sort column (SUBSTRING(BATCH_ID,3,14)), into a temp table.  The second does a simple SELECT from the temp table with the ORDER BY DESC on the sort column.

Hope this helps.

Bill
 

Author Comment

by:danidak
well,
I have two NONCLUSTERED INDEXES, one on BATCH_ID and one on ENTITY.

Run time: 11min
Table size: the DETAIL table has over 13 million records.
 
LVL 19

Expert Comment

by:grant300
Drop those two indexes (I presume they are on the DETAIL table) and create one on (ENTITY, BATCH_ID).  This becomes a coverage index.  At the very least, you should be able to get the query to run out of just the index.  You may have to reverse the order of the fields in the index depending on what the optimizer wants to do with the rest of the query.  You may also find it advantageous to add one more field to the index that makes it unique; something like TRANS_DT or whatever you have that will work.  The optimizer has much better luck with unique indexes than highly non-unique ones.
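Something along these lines; the index names here are placeholders, so substitute whatever yours are actually called:

```sql
DROP INDEX DETAIL.X_DETAIL_BATCH_ID
DROP INDEX DETAIL.X_DETAIL_ENTITY
GO
CREATE INDEX X_DETAIL_ENT_BATCH ON DETAIL (ENTITY, BATCH_ID)
GO
```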

BTW, do you have any indexes on the SUMMARY table?

Also, do you run UPDATE STATISTICS and SP_RECOMPILE on these tables prior to running this query?  The optimizer will do some pretty dumb things if you tell it lies about what is going on in the tables and indexes.

Bill
 

Author Comment

by:danidak
I do have the BATCH_ID index on the SUMMARY table, and I also have another index on the DETAIL table (DTL_CK), a unique key.
I think I have tried the coverage index before (BATCH_ID, ENTITY) but did not gain much.

What about a new coverage index (BATCH_ID, ENTITY, DTL_CK) ???

I'm not running UPDATE STATISTICS and SP_RECOMPILE every time, only when I add/remove/update indexes; the reason being, they take some time to run.

regards,
Dan
 
LVL 19

Expert Comment

by:grant300
That sounds like it is worth a shot.  Again, you may want to try (ENTITY, BATCH_ID, DTL_CK).  Without the benefit of the SHOWPLAN output, it is pretty hard to know exactly what you need to do.

Try a couple of these variations with SHOWPLAN turned on, capture the output and post it.  I will be glad to help you interpret it.

UPDATE STATISTICS should be run after there is a substantial change in the number of rows in a table or a substantial change in the distribution along an index.  The statistics DO NOT need to be manually updated when you add/remove indexes, because they get updated at index build time!  You should, however, always run SP_RECOMPILE on the table after you have changed indexes around so that any stored procedures can form new query plans.
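The maintenance commands themselves are one-liners; for example, after a big load into DETAIL (a sketch, using the table name from your post):

```sql
UPDATE STATISTICS DETAIL
GO
EXEC SP_RECOMPILE DETAIL
GO
```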

Bill.

 
LVL 6

Expert Comment

by:ChrisKing
Try the following:

SELECT DISTINCT BATCH_ID
INTO #check
FROM DETAIL
WHERE ENTITY IN ('BL', 'CA', 'CC', 'RC')

SELECT BATCH_ID
FROM #check c
WHERE NOT EXISTS ( SELECT * FROM SUMMARY s WHERE s.BATCH_ID = c.BATCH_ID )
ORDER BY SUBSTRING(BATCH_ID,3,14) DESC

DROP TABLE #check

the ideal indexes for this query would be:
    DETAIL:      ENTITY, BATCH_ID       (just ENTITY should be enough but adding BATCH_ID would be a little better)
    SUMMARY:  BATCH_ID
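i.e., create statements along these lines (the index names are just examples):

```sql
CREATE INDEX ix_detail_entity_batch ON DETAIL (ENTITY, BATCH_ID)
CREATE INDEX ix_summary_batch ON SUMMARY (BATCH_ID)
GO
```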


 
LVL 2

Expert Comment

by:mansoor_a_khan
Change it to this:

select GETDATE()
 

        SELECT DISTINCT
                SUBSTRING(BATCH_ID,3,14), -- NEW
                BATCH_ID
        FROM    
                DETAIL
        WHERE
                ENTITY IN ('BL', 'CA', 'CC', 'RC')
        AND
                BATCH_ID NOT IN
                        (
                           SELECT DISTINCT
                                   BATCH_ID
                           FROM
                                   SUMMARY
                        )
          ORDER BY
          SUBSTRING(BATCH_ID,3,14) DESC


select GETDATE()



Should do the trick.
 
LVL 19

Assisted Solution

by:grant300
grant300 earned 220 total points
I don't think a query with two DISTINCT operations and a NOT IN is ever going to be optimal.

I will write the outer join query for you:

SELECT D.BATCH_ID
  FROM SUMMARY S RIGHT JOIN
           (SELECT BATCH_ID, SUBSTRING(BATCH_ID,3,14)
              FROM DETAIL
            WHERE ENTITY IN ('BL','CA','CC','RC')
            GROUP BY BATCH_ID, SUBSTRING(BATCH_ID,3,14) PSUEDO_SORT) D
      ON S.BATCH_ID = D.BATCH_ID
WHERE S.BATCH_ID IS NULL
ORDER BY  PSUEDO_SORT DESC

I think you will find this runs well and is easier to optimise than a NOT IN or NOT EXISTS formulation.

Try it and let us know how it runs.

Bill
 

Author Comment

by:danidak
Hi,

I really appreciate all of your efforts.

I did some more testing but without much success:

-- ChrisKing - speed wise, no improvement
-- mansoor_a_khan - same here,


-- grant300 -
           it looked promising, but I get syntax errors (marked with --> below):
           
- Incorrect syntax near the keyword SELECT           ErrorNumber(156)
- Incorrect syntax near the keyword PSUEDO_SORT) D   ErrorNumber(102)

SELECT D.BATCH_ID
  FROM SUMMARY S RIGHT JOIN
   -->        (SELECT BATCH_ID, SUBSTRING(BATCH_ID,3,14)
              FROM DETAIL
            WHERE ENTITY IN ('BL','CA','CC','RC')
            GROUP BY BATCH_ID, SUBSTRING(BATCH_ID,3,14) --> PSUEDO_SORT) D
      ON S.BATCH_ID = D.BATCH_ID
WHERE S.BATCH_ID IS NULL
--> ORDER BY  PSUEDO_SORT DESC

I tried to make it work but with no success.
Any clues on how I could make this work?

Thanks,
Dan

 
LVL 19

Expert Comment

by:grant300
The correlation name, PSUEDO_SORT, was in the wrong spot.

SELECT D.BATCH_ID
  FROM SUMMARY S RIGHT JOIN
           (SELECT BATCH_ID, SUBSTRING(BATCH_ID,3,14) PSUEDO_SORT
              FROM DETAIL
            WHERE ENTITY IN ('BL','CA','CC','RC')
            GROUP BY BATCH_ID, SUBSTRING(BATCH_ID,3,14)) D
      ON S.BATCH_ID = D.BATCH_ID
WHERE S.BATCH_ID IS NULL
ORDER BY  PSUEDO_SORT DESC

This works fine with my test setup, as follows:

CREATE TABLE DETAIL (BATCH_ID VARCHAR(14), ENTITY VARCHAR(2), COL3 INT)
GO
CREATE TABLE SUMMARY (BATCH_ID VARCHAR(14), COL4 INT)
GO
INSERT INTO DETAIL VALUES ('12345678910114','BL',2)
INSERT INTO DETAIL VALUES ('12345678910114','CA',2)
INSERT INTO DETAIL VALUES ('12345678910114','XX',2)
INSERT INTO DETAIL VALUES ('21345678910117','BL',2)
INSERT INTO DETAIL VALUES ('40345678910119','BL',2)
INSERT INTO DETAIL VALUES ('40345678910119','CA',2)
GO
INSERT INTO SUMMARY VALUES ('15345678913333',1)
INSERT INTO SUMMARY VALUES ('21345678910117',2)
INSERT INTO SUMMARY VALUES ('33456783424556',3)
GO

Give it a shot and let us know how it worked.

Bill
 

Author Comment

by:danidak
Bill,

I still get the same errors as before, I'm testing this on Sybase 12.0

Thanks,
Daniel
 
LVL 19

Assisted Solution

by:grant300
grant300 earned 220 total points
Oops.  Inline views (or derived tables) are a 12.5.1+ feature.  That's O.K.; you can use a temp table to get the same result.  It might even be the way the optimizer was going to do it anyway.  Try this:

BEGIN
  SELECT BATCH_ID, SUBSTRING(BATCH_ID,3,14) PSUEDO_SORT
     INTO #D
    FROM DETAIL
  WHERE ENTITY IN ('BL','CA','CC','RC')
  GROUP BY BATCH_ID, SUBSTRING(BATCH_ID,3,14)

  SELECT D.BATCH_ID
    FROM SUMMARY S RIGHT JOIN #D D
        ON S.BATCH_ID = D.BATCH_ID
  WHERE S.BATCH_ID IS NULL
  ORDER BY  PSUEDO_SORT DESC
END

This gave me the same results.

Bill

 

Author Comment

by:danidak
Well, this worked, but the run time did not improve much: around 10 sec faster compared to the original.
So I really don't know what to do next. I have tried many different solutions but the times just don't improve by much. If I can, I would like to cut the run time at least in half.
If you have any other suggestions, I would really appreciate it.

Thanks Much,
Dan
 
LVL 19

Expert Comment

by:grant300
Yes.  Before the query block I sent you, issue:
1> SET SHOWPLAN ON
2> SET NOEXEC ON
3> GO

Then run the query block, capture the results, and post them.

Also post the DDL for any indexes you have.

This way we can see what the optimizer is going to do without having to wait for the query to run each time.

When we get closer to where we want to be, we will turn NOEXEC off and turn on STATISTICS IO and STATISTICS TIME.

Bill
 
LVL 6

Expert Comment

by:ChrisKing
what indexes are on the 2 tables ?
 

Author Comment

by:danidak
Here you go:

Thanks,
Dan

-----------
For following query:

      select GETDATE()

      SELECT GL_BATCH_ID, SUBSTRING(GL_BATCH_ID,3,14) PSUEDO_SORT
        INTO #D
        FROM GL_DETAIL
       WHERE GL_ENTITY IN ('BILL','CPAY','CCAL','RCPT')
       GROUP BY GL_BATCH_ID, SUBSTRING(GL_BATCH_ID,3,14)

        SELECT D.GL_BATCH_ID
             FROM GL_SUMMARY S
             RIGHT JOIN
                  #D D
             ON S.GL_BATCH_ID = D.GL_BATCH_ID
         WHERE S.GL_BATCH_ID IS NULL
        ORDER BY  PSUEDO_SORT DESC

      select GETDATE()



------------------------------------------
--------------- 1 ------------------------
------------------------------------------

      SET SHOWPLAN ON
      SET NOEXEC ON
      GO

-----RESULT--------------
 
QUERY PLAN FOR STATEMENT 1 (at line 2).
STEP 1
The type of query is SELECT.
QUERY PLAN FOR STATEMENT 2 (at line 4).
STEP 1
The type of query is CREATE TABLE.
STEP 2
The type of query is SELECT (into Worktable1).
GROUP BY
Evaluate Grouped COUNT AGGREGATE.
FROM TABLE
GL_DETAIL
Nested iteration.
Using 4 Matching Index Scans
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Using I/O Size 16 Kbytes for data pages.
With MRU Buffer Replacement Strategy for data pages.
TO TABLE
Worktable1.
STEP 3
The type of query is INSERT.
The update mode is direct.
FROM TABLE
Worktable1.
Nested iteration.
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With MRU Buffer Replacement Strategy for data pages.
TO TABLE
#D
Using I/O Size 4 Kbytes for data pages.
QUERY PLAN FOR STATEMENT 3 (at line 10).
STEP 1
The type of query is INSERT.
The update mode is direct.
Worktable1 created, in allpages locking mode, for ORDER BY.
FROM TABLE
#D
D
Nested iteration.
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
FROM TABLE
GL_SUMMARY
S
Nested iteration.
Index : X_GL_SUMMARY_BATCH_ID
Forward scan.
Positioning by key.
Index contains all needed columns. Base table will not be read.
Keys are:
GL_BATCH_ID  ASC
Using I/O Size 2 Kbytes for index leaf pages.
With LRU Buffer Replacement Strategy for index leaf pages.
TO TABLE
Worktable1.
STEP 2
The type of query is SELECT.
This step involves sorting.
FROM TABLE
Worktable1.
Using GETSORTED
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With MRU Buffer Replacement Strategy for data pages.
QUERY PLAN FOR STATEMENT 4 (at line 17).
STEP 1
The type of query is SELECT.

 
LVL 6

Expert Comment

by:ChrisKing
> what indexes are on the 2 tables ?
still coming ?
 
LVL 19

Assisted Solution

by:grant300
grant300 earned 220 total points
Try a couple of things:
1)  Build an index on GL_DETAIL (GL_ENTITY, GL_BATCH_ID).  This is a coverage index and may help things somewhat.
2)  Add a step in the block between the two queries that creates an index on #D (PSUEDO_SORT DESC).

Also, how many rows are in each table, and in #D?

What type of hardware are you running on?  What is the disk configuration?  How much memory on the machine and how much have you allocated in Sybase?

Bill
 

Author Comment

by:danidak
Well,

Index:
 DETAIL:
     GL_ENTITY
     GL_BATCH_ID
SUMMARY:
     GL_BATCH_ID
I have tried running it with the index (GL_ENTITY, GL_BATCH_ID) but I still get the same time.

ROWS:
DETAIL: 17 million
SUMMARY: 5000

Hardware/memory:
I'm not sure, but I know we have more than a couple of gigs.
Hardware: top of the line, 8 months old.


-----------------------------------------------------
Some more test results
-----------------------------------------------------
------------------------------------------
--------------- 2 ------------------------
------------------------------------------

SET SHOWPLAN ON
SET NOEXEC OFF
SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-----RESULT--------------

Total writes for this command: 0
Execution Time 0.
SQL Server cpu time: 0 ms.  SQL Server elapsed time: 0 ms.
 
QUERY PLAN FOR STATEMENT 1 (at line 4).
STEP 1
The type of query is SELECT.
QUERY PLAN FOR STATEMENT 2 (at line 6).
STEP 1
The type of query is CREATE TABLE.
STEP 2
The type of query is SELECT (into Worktable1).
GROUP BY
Evaluate Grouped COUNT AGGREGATE.
FROM TABLE
GL_DETAIL
Nested iteration.
Using 4 Matching Index Scans
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Index : X_GL_DETAIL_ENTITY
Forward scan.
Positioning by key.
Keys are:
GL_ENTITY  ASC
Using I/O Size 16 Kbytes for data pages.
With MRU Buffer Replacement Strategy for data pages.
TO TABLE
Worktable1.
STEP 3
The type of query is INSERT.
The update mode is direct.
FROM TABLE
Worktable1.
Nested iteration.
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With MRU Buffer Replacement Strategy for data pages.
TO TABLE
#D
Using I/O Size 4 Kbytes for data pages.
QUERY PLAN FOR STATEMENT 3 (at line 12).
STEP 1
The type of query is INSERT.
The update mode is direct.
Worktable1 created, in allpages locking mode, for ORDER BY.
FROM TABLE
#D
D
Nested iteration.
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With LRU Buffer Replacement Strategy for data pages.
FROM TABLE
X_GL_SUMMARY
S
Nested iteration.
Index : X_GL_SUMMARY_BATCH_ID
Forward scan.
Positioning by key.
Index contains all needed columns. Base table will not be read.
Keys are:
GL_BATCH_ID  ASC
Using I/O Size 2 Kbytes for index leaf pages.
With LRU Buffer Replacement Strategy for index leaf pages.
TO TABLE
Worktable1.
STEP 2
The type of query is SELECT.
This step involves sorting.
FROM TABLE
Worktable1.
Using GETSORTED
Table Scan.
Forward scan.
Positioning at start of table.
Using I/O Size 4 Kbytes for data pages.
With MRU Buffer Replacement Strategy for data pages.
QUERY PLAN FOR STATEMENT 4 (at line 19).
STEP 1
The type of query is SELECT.
Parse and Compile Time 1.
SQL Server cpu time: 100 ms.

------------------

9/15/2004 4:07:22.240 PM
Total writes for this command: 0
Execution Time 0.
SQL Server cpu time: 0 ms.  SQL Server elapsed time: 0 ms.
Table: X_GL_DETAIL scan count 4, logical reads: (regular=1648256 apf=100 total=1648356), physical reads: (regular=13276 apf=187648 total=200924), apf IOs used=186925
Table: #D___________00000470017412255 scan count 0, logical reads: (regular=1099 apf=0 total=1099), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: Worktable1  scan count 1, logical reads: (regular=2392 apf=0 total=2392), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total writes for this command: 3
Execution Time 1019.
SQL Server cpu time: 101900 ms.  SQL Server elapsed time: 1912236 ms.
The sort for Worktable1 is done in Serial  
1088 row(s) affected.

------------------

WW10000337110500
Table: X_GL_SUMMARY scan count 1088, logical reads: (regular=3501 apf=0 total=3501), physical reads: (regular=14 apf=214 total=228), apf IOs used=213
Table: #D___________00000470017412255 scan count 1, logical reads: (regular=22 apf=0 total=22), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Table: Worktable1  scan count 0, logical reads: (regular=9 apf=0 total=9), physical reads: (regular=0 apf=0 total=0), apf IOs used=0
Total writes for this command: 0
Execution Time 0.
SQL Server cpu time: 0 ms.  SQL Server elapsed time: 666 ms.


 

Author Comment

by:danidak
Adding the index on the temp table did not help:

 select GETDATE()

     SELECT GL_BATCH_ID, SUBSTRING(GL_BATCH_ID,3,14) PSUEDO_SORT
       INTO #D
       FROM GL_DETAIL
      WHERE GL_ENTITY IN ('BILL','CPAY','CCAL','RCPT')
      GROUP BY GL_BATCH_ID, SUBSTRING(GL_BATCH_ID,3,14)

 CREATE INDEX X_PSUEDO_SORT
    ON #D(PSUEDO_SORT)

       SELECT D.GL_BATCH_ID
           FROM GL_SUMMARY S
           RIGHT JOIN
               #D D
           ON S.GL_BATCH_ID = D.GL_BATCH_ID
        WHERE S.GL_BATCH_ID IS NULL
       ORDER BY  PSUEDO_SORT DESC

     select GETDATE()
 
 
LVL 19

Expert Comment

by:grant300
How many records are in #D?

How much memory is Sybase configured to use?

Try the index on #D on GL_BATCH_ID instead.  I forgot it has to do the outer join before it can do the ORDER BY.

Bill
 
LVL 6

Expert Comment

by:ChrisKing
Use another batch - i.e., put a GO after the CREATE INDEX - that should help.
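The reason is that the whole block is compiled as one batch, so the final SELECT gets its plan before the index exists. Splitting the batch lets the optimizer see the new index; a sketch of the tail of your block (use whatever index name you created):

```sql
CREATE INDEX X_PSUEDO_SORT ON #D (PSUEDO_SORT)
GO

SELECT D.GL_BATCH_ID
  FROM GL_SUMMARY S RIGHT JOIN #D D
    ON S.GL_BATCH_ID = D.GL_BATCH_ID
 WHERE S.GL_BATCH_ID IS NULL
 ORDER BY PSUEDO_SORT DESC
GO
```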
 
LVL 3

Expert Comment

by:knel1234
Hi danidak,

Have any of the suggestions helped you?  If not, could you please post the structure of the tables used in the query (feel free to alter column names if needed)?  Also, can you give some idea of the number of rows in each table (I know there are some references to row counts, but a summary at this point might help), the cardinality of the data, and the current indexes for each table?  Finally, can you provide buffer pool information as well as the sizing of tempdb?

thanks
knel
 
LVL 2

Assisted Solution

by:TadHawkins
TadHawkins earned 30 total points
/*************

 If you really are just after the distinct values then a piecemeal approach
 may work faster as any single SQL statement is going to read through a lot
 of duplicates -- evidenced by the statistics io output that shows over 1 million
 page reads on the DETAIL table.

 Perhaps something like what follows would run a whole lot faster.

(This script runs as is and seems to produce what you are looking for from the
rather small test tables it creates.)

 *************/

 create table DETAIL (ENTITY char(2), BATCH_ID int, D int)
 create unique clustered index a on DETAIL (ENTITY, BATCH_ID, D)
 go

 create table SUMMARY (BATCH_ID int)
 create unique clustered index b on SUMMARY (BATCH_ID)
 go

 create table #batch_id (BATCH_ID int)
 create unique clustered index x on #batch_id (BATCH_ID)
 go

 set nocount on
 go

 insert DETAIL values ("BL", 1, 9)
 insert DETAIL values ("BL", 1, 8)
 insert DETAIL values ("BL", 1, 7)
 insert DETAIL values ("BL", 1, 6)
 insert DETAIL values ("BL", 3, 1)
 insert DETAIL values ("BL", 3, 2)
 insert DETAIL values ("BL", 7, 1)

 insert DETAIL values ("CA", 1, 1)
 insert DETAIL values ("CA", 2, 1)
 insert DETAIL values ("CA", 2, 2)
 insert DETAIL values ("CA", 7, 1)

 insert SUMMARY values (2)
 go
                                                                                                       
/************                                                                                          
                                                                                                       
Loop through entity values 'BL', 'CA', 'CC', 'RC' one at a time:                                        
                                                                                                       
*************/                                                                                          
                                                                                                       
declare @E int                                                                                          
select  @E = 1                                                                                          
                                                                                                       
declare @ENTITY char(2)                                                                                
                                                                                                       
 while (@E <= 4)
begin                                                                                                  
                                                                                                       
  select @ENTITY = case when @E = 1 then "BL"                                                          
                        when @E = 2 then "CA"                                                          
                        when @E = 3 then "CC"                                                          
                        when @E = 4 then "RC"                                                          
                   end                                                                                  
                                                                                                       
  declare @BATCH_ID      int                                                                            
  declare @PREV_BATCH_ID int                                                                            
  select  @PREV_BATCH_ID = -1                                                                          
                                                                                                       
  declare @rc int                                                                                      
                                                                                                       
/************                                                                                          
                                                                                                       
  Loop through BATCH_ID values:                                                                        
                                                                                                       
  The main select should be supported by the ENTITY, BATCH_ID index and should only need to access      
  a single row and therefore just a few pages for each invocation. Your statistics IO output seems      
  to indicate that you have on the order of 1088 or so rows to return. If that's about right then      
  this method may reduce your page IOs by a couple magnitudes.                                          
                                                                                                       
  You may want to try it with the two not exists subqueries in the other order. Could make a difference depending on which one is more likely to eliminate a row.
                                                                                                       
  You want to check the showplan output to see that the order by does not result in any actual sorting.
  The index should support the order by clause.                                                                
                                                                                                       
*************/                                                                                          
                                                                                                       
  while (1 = 1)                                                                                        
  begin                                                                                                
                                                                                                       
    set rowcount 1  
                                                                                                   
    select @BATCH_ID = BATCH_ID                                                                    
    from   DETAIL                                                                                  
    where  ENTITY = @ENTITY                                                                        
    and    BATCH_ID > @PREV_BATCH_ID                                                              
    and    not exists                                                                              
           (                                                                                      
             select *                                                                              
             from   SUMMARY                                                                        
             where  DETAIL.BATCH_ID = SUMMARY.BATCH_ID                                            
           )                                                                                      
    and    not exists                                                                              
           (                                                                                      
             select *                                                                              
             from   #batch_id                                                                      
             where  DETAIL.BATCH_ID = #batch_id.BATCH_ID                                          
           )                                                                                      
    order by ENTITY, BATCH_ID                                                                      
                                                                                                   
    select  @rc = @@rowcount                                                                      
                                                                                                   
    set rowcount 0                                                                                
                                                                                                   
    if (@rc = 0) break                                                                            
                                                                                                   
    insert #batch_id (BATCH_ID) values (@BATCH_ID)                                                
                                                                                                   
    select @PREV_BATCH_ID = @BATCH_ID                                                              
                                                                                                   
  end -- of BATCH_ID loop ...                                                                      
                                                                                                   
  SELECT @E = @E + 1                                                                              
                                                                                                   
end -- of ENTITY loop ...                                                                          
                                                                                                   
select * from #batch_id -- I made BATCH_ID an int for simplicity. You would put your order by here.
                                                                                                   
go                                                                                                                                                                                    
