Solved

NOT IN Query

Posted on 2004-10-02
Last Modified: 2008-01-09
Hi,

I have the following query:

CREATE TABLE NEW_TABLE
AS
SELECT Q1.*,
       round(MONTHS_BETWEEN(SYSDATE, prsn_birth_date)/12) AS CURR_AGE,
       Q4.BUS_ID,
       Q4.PLCMNT_BUS_STRT_DATE,
       Q4.PLCMNT_TYPE_CODE
  FROM (SELECT PLCMNT_ID,
               PRSN_ID
          FROM (SELECT PLCMNT_ID,
                       PRSN_ID,
                       PLCMNT_BUS_STRT_DATE,
                       BUS_ID,
                       LAG(BUS_ID, 1, 0) OVER(PARTITION BY PLCMNT_ID ORDER BY PLCMNT_BUS_STRT_DATE DESC) AS PREVIOUS_BUS
                  FROM PERM_EVENTS T
                 ORDER BY T.PLCMNT_ID,
                          T.PLCMNT_BUS_STRT_DATE,
                          T.BUS_ID) XL
          WHERE XL.PREVIOUS_BUS <> BUS_ID
          GROUP BY PLCMNT_ID, PRSN_ID
          HAVING COUNT(PLCMNT_BUS_STRT_DATE) > 2) Q1,
       (SELECT tt.*
          FROM PLCMNT_BUS tt,
               (SELECT PB.PLCMNT_ID,
                       MAX(PB.PLCMNT_BUS_STRT_DATE) st_date
                  FROM PLCMNT_BUS PB
                 WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')
                 GROUP BY PB.PLCMNT_ID) t2
         WHERE tt.plcmnt_id = t2.plcmnt_id
           AND tt.plcmnt_bus_strt_date = t2.st_date) q4,
       PERSON P
 WHERE q1.plcmnt_id = q4.plcmnt_id
   AND Q1.PRSN_ID = P.PRSN_ID

Essentially, we are trying to find episodes (by plcmnt_id from the perm_events table) that have 3 or more records and that do not have the same bus_id twice in a row, along with the current age of the person. We also want the max start_date for the bus_id of the previous record if the last record has a plcmnt_type_code of 'HO' or 'OR'. We figured that making a table that pulled the correct bus_id, plcmnt_id, and start_date for the plcmnt_id's that met the criteria would suffice. (This is the 2nd part of the process; we create the perm_events table first with the events we want to see.)

The "NOT IN" clause has caused the make-table query to go from 3 minutes to 3 hours. We tried using the 'NOT EXISTS' clause, but it eliminated the whole group of records. See examples:

TABLES:

PERM_EVENTS
PLCMNT_ID   BUS_ID   PLCMNT_STRT_DATE   END_DATE   PLCMNT_TYPE_CODE
10          25       1/1/04             1/15/04    RP
10          24       1/16/04            1/30/04    FH
10          90       2/1/04                        HO
20          25       1/1/04             1/15/04    FH
20          15       1/16/04            1/30/04    RP
20          27       2/1/04                        FH
30          20       2/1/04             2/2/04     FH
30          20       2/2/04             2/3/04     RP
30          25       2/3/04                        FH
40          20       2/1/04             2/3/04     FH
40          25       2/3/04                        FH
   
We would want the following results:

For plcmnt_id = 10 we would want bus_id 24 with a start date of 1/16/04 and a type of RP (since the last one has a type of 'HO' - which is one of the ones to keep out if it is the last in the series, we take the previous type, bus_id and start date)

For plcmnt_id = 20 we would want bus_id = 27 with a start date of 2/1/04 and a type of FH (keep as the type is not in the elimination list)

We don't want plcmnt_id = 30 because it has 2 bus_id's that are the same (in a row) and we don't want plcmnt_id = 40 because it only has 2 records

When we used the NOT EXISTS clause, plcmnt_id = 10 was eliminated altogether.

Any ideas would be greatly appreciated.

thanks
Question by:dreamerw7
Expert Comment

by:seazodiac
ID: 12211471
Try changing the NOT IN clause as follows.

Change from

       WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')

to

       WHERE PLCMNT_TYPE_CODE <> 'HO' AND PLCMNT_TYPE_CODE <> 'OR'
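
As a sanity check on this suggestion: for a literal list, NOT IN and a chain of <> comparisons joined by AND are logically equivalent, so the two forms below should return the same rows. This is only a sketch that reuses the inner query from the question. Both forms also drop rows whose PLCMNT_TYPE_CODE is NULL, because a comparison with NULL is never true. Whether the optimizer plans them differently is something only the explain plan can confirm.

```sql
-- Form 1: literal NOT IN list (as in the question).
SELECT PB.PLCMNT_ID,
       MAX(PB.PLCMNT_BUS_STRT_DATE) AS ST_DATE
  FROM PLCMNT_BUS PB
 WHERE PB.PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')
 GROUP BY PB.PLCMNT_ID;

-- Form 2: the same filter written as <> comparisons joined by AND.
SELECT PB.PLCMNT_ID,
       MAX(PB.PLCMNT_BUS_STRT_DATE) AS ST_DATE
  FROM PLCMNT_BUS PB
 WHERE PB.PLCMNT_TYPE_CODE <> 'HO'
   AND PB.PLCMNT_TYPE_CODE <> 'OR'
 GROUP BY PB.PLCMNT_ID;
```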

Accepted Solution

by:
morphman earned 500 total points
ID: 12211934
Can you rewrite the query so that IN is used instead? That is, is there a fixed number of PLCMNT_TYPE_CODEs that you could list in an IN clause to avoid the NOT IN?

NOT IN is one of the worst-performing query clauses, because in most cases it causes a full scan. Since your query uses the NOT IN within a correlated subquery, it is doubly bad: the scan is required for each record in t2.

Paste your explain plan here, and we might be able to place an appropriate hint to help execution...
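
For anyone unsure how to produce the plan being asked for, one common way in Oracle is EXPLAIN PLAN followed by DBMS_XPLAN.DISPLAY. The sketch below assumes a PLAN_TABLE already exists in your schema and that your release ships DBMS_XPLAN (9i Release 2 or later); on older releases you would query PLAN_TABLE directly. It uses only the inner query from the question; the same wrapper should work around the full SELECT.

```sql
-- Store the optimizer's plan for the statement in PLAN_TABLE
-- (the statement itself is not executed).
EXPLAIN PLAN FOR
SELECT PB.PLCMNT_ID,
       MAX(PB.PLCMNT_BUS_STRT_DATE) AS ST_DATE
  FROM PLCMNT_BUS PB
 WHERE PB.PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')
 GROUP BY PB.PLCMNT_ID;

-- Format and print the plan that was just stored.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```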

Author Comment

by:dreamerw7
ID: 12214577
Question: Would an "IN" clause with MANY values still be faster than the NOT IN? I think there are 17.

Expert Comment

by:morphman
ID: 12215098
Yes, it may well be faster, as inlist operators work quicker than full scans...
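
Separately from the IN versus NOT IN choice, the join of PLCMNT_BUS back to its own MAX(start date) could be replaced by an analytic function, so the table is read once. Nobody in this thread proposed this; it is a sketch of a different technique, assuming PLCMNT_BUS has the four columns named in the question's query. RANK is used rather than ROW_NUMBER so that ties on the latest start date are all kept, as they would be with the MAX join.

```sql
-- Latest row per placement whose type is not 'HO' or 'OR', in one pass.
SELECT PLCMNT_ID,
       BUS_ID,
       PLCMNT_BUS_STRT_DATE,
       PLCMNT_TYPE_CODE
  FROM (SELECT PB.PLCMNT_ID,
               PB.BUS_ID,
               PB.PLCMNT_BUS_STRT_DATE,
               PB.PLCMNT_TYPE_CODE,
               -- 1 = most recent qualifying start date within the placement
               RANK() OVER (PARTITION BY PB.PLCMNT_ID
                            ORDER BY PB.PLCMNT_BUS_STRT_DATE DESC) AS RNK
          FROM PLCMNT_BUS PB
         WHERE PB.PLCMNT_TYPE_CODE NOT IN ('HO', 'OR'))
 WHERE RNK = 1;
```

This inline view could stand in for q4 in the original query; compare the explain plans before relying on it.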

Expert Comment

by:oleggold
ID: 12215509
What can surely help is replacing all "IN" operators with "EXISTS", which works much faster (binary way). Change all the multiple "IN" operators to "EXISTS", or at least try using multiple "NOT EXISTS" instead of "NOT IN" if that is less work for you.
Really hope it works,
OLEG
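
A caution when applying this to the question's query: the asker's NOT EXISTS attempt was not posted, so the following is only a guess at why it removed plcmnt_id = 10 entirely. If the subquery is correlated on PLCMNT_ID alone, it tests the whole placement rather than the single row, so any placement that has even one 'HO' or 'OR' row loses all of its rows. The sketch assumes PLCMNT_BUS holds rows like the sample data in the question.

```sql
-- Placement-level test: every row of a placement is dropped as soon as
-- the placement has any 'HO' or 'OR' row (e.g. plcmnt_id = 10).
SELECT PB.PLCMNT_ID,
       MAX(PB.PLCMNT_BUS_STRT_DATE) AS ST_DATE
  FROM PLCMNT_BUS PB
 WHERE NOT EXISTS (SELECT 1
                     FROM PLCMNT_BUS X
                    WHERE X.PLCMNT_ID = PB.PLCMNT_ID
                      AND X.PLCMNT_TYPE_CODE IN ('HO', 'OR'))
 GROUP BY PB.PLCMNT_ID;
```

The requirement in the question is a row-level filter (skip the 'HO'/'OR' rows, keep the rest of the placement), which a plain predicate on PLCMNT_TYPE_CODE already expresses.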

Expert Comment

by:konektor
ID: 12216217
The difference between "not in" and "not exists" is that the subquery in a "not in" clause is executed for each row in the master select and compared to the left side of the condition. You can limit the amount of data processed in a "not exists" clause:

select * from master_table where not exists (select 1 from detail_table where detail_table.some_column = master_table.some_column)
is much better than
select * from master_table where some_column not in (select some_column from detail_table)

There should be an index on some_column in detail_table ...
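
One thing to check before swapping one form for the other: the two queries above are not interchangeable when some_column can be NULL. The sketch below restates them with the difference spelled out in comments; the table and column names come from the example above, and the index name is hypothetical.

```sql
-- NOT IN: if detail_table.some_column contains even one NULL, the predicate
-- is never true and the query returns no rows at all.
SELECT m.*
  FROM master_table m
 WHERE m.some_column NOT IN (SELECT d.some_column FROM detail_table d);

-- NOT EXISTS: NULLs in detail_table never match, so they are ignored.
-- Master rows whose some_column is NULL are returned as well.
SELECT m.*
  FROM master_table m
 WHERE NOT EXISTS (SELECT 1
                     FROM detail_table d
                    WHERE d.some_column = m.some_column);

-- Hypothetical index name; supports the correlated lookup above.
CREATE INDEX detail_table_some_column_ix ON detail_table (some_column);
```

If both columns are declared NOT NULL, the two forms return the same rows and the choice is purely about the execution plan.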