Solved

NOT IN Query

Posted on 2004-10-02
Last Modified: 2008-01-09
Hi,

I have the following query:

CREATE TABLE NEW_TABLE
AS
SELECT Q1.*,
       round(MONTHS_BETWEEN(SYSDATE, prsn_birth_date)/12) AS CURR_AGE,
       Q4.BUS_ID,
       Q4.PLCMNT_BUS_STRT_DATE,
       Q4.PLCMNT_TYPE_CODE
  FROM (SELECT PLCMNT_ID,
               PRSN_ID
          FROM (SELECT PLCMNT_ID,
                       PRSN_ID,
                       PLCMNT_BUS_STRT_DATE,
                       BUS_ID,
                       LAG(BUS_ID, 1, 0) OVER(PARTITION BY PLCMNT_ID ORDER BY PLCMNT_BUS_STRT_DATE DESC) AS PREVIOUS_BUS
                  FROM PERM_EVENTS T
                 ORDER BY T.PLCMNT_ID,
                          T.PLCMNT_BUS_STRT_DATE,
                          T.BUS_ID) XL
          WHERE XL.PREVIOUS_BUS <> BUS_ID
          GROUP BY PLCMNT_ID, PRSN_ID
          HAVING COUNT(PLCMNT_BUS_STRT_DATE) > 2) Q1,
       (SELECT tt.*
          FROM PLCMNT_BUS tt,
               (SELECT PB.PLCMNT_ID,
                       MAX(PB.PLCMNT_BUS_STRT_DATE) st_date
                  FROM PLCMNT_BUS PB
                 WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')
                 GROUP BY PB.PLCMNT_ID) t2
         WHERE tt.plcmnt_id = t2.plcmnt_id
           AND tt.plcmnt_bus_strt_date = t2.st_date) Q4,
       PERSON P
 WHERE Q1.plcmnt_id = Q4.plcmnt_id
   AND Q1.PRSN_ID = P.PRSN_ID

Essentially, we are trying to find episodes (identified by plcmnt_id in the perm_events table) that have 3 or more records and never have the same bus_id twice in a row, along with the current age of the person. If the last record of an episode has a plcmnt_type_code of 'HO' or 'OR', we want the bus_id and max start_date of the previous record instead. We figured that making a table with the correct bus_id, plcmnt_id and start_date for the plcmnt_ids that meet the criteria would suffice. (This is the 2nd part of the process; we create the perm_events table first with the events we want to see.)

The NOT IN clause has caused the make-table query to go from 3 minutes to 3 hours. We tried using a NOT EXISTS clause instead, but it eliminated the whole group of records. See examples:

TABLES:

PERM_EVENTS
PLCMNT_ID  BUS_ID  PLCMNT_STRT_DATE  END_DATE  PLCMNT_TYPE_CODE
10         25      1/1/04            1/15/04   RP
10         24      1/16/04           1/30/04   FH
10         90      2/1/04                      HO
20         25      1/1/04            1/15/04   FH
20         15      1/16/04           1/30/04   RP
20         27      2/1/04                      FH
30         20      2/1/04            2/2/04    FH
30         20      2/2/04            2/3/04    RP
30         25      2/3/04                      FH
40         20      2/1/04            2/3/04    FH
40         25      2/3/04                      FH
   
We would want the following results:

For plcmnt_id = 10 we would want bus_id 24 with a start date of 1/16/04 and a type of FH (the last record has a type of 'HO', which is one of the types to exclude when it is last in the series, so we take the previous record's type, bus_id and start date)

For plcmnt_id = 20 we would want bus_id = 27 with a start date of 2/1/04 and a type of FH (keep as the type is not in the elimination list)

We don't want plcmnt_id = 30 because it has 2 bus_id's that are the same (in a row) and we don't want plcmnt_id = 40 because it only has 2 records
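The "latest record whose type is not 'HO' or 'OR'" part of these results can also be sketched with an analytic function, which avoids joining PLCMNT_BUS back to itself. This is only an illustrative alternative, not something proposed in the thread; it assumes the PLCMNT_BUS table and column names used in the query above. Unlike the MAX-and-join version, ROW_NUMBER keeps exactly one row per plcmnt_id even when two rows share the latest start date.

```sql
-- Sketch: latest row per placement whose type is not 'HO'/'OR', in one pass.
-- Assumes the PLCMNT_BUS columns referenced in the original query.
SELECT plcmnt_id,
       bus_id,
       plcmnt_bus_strt_date,
       plcmnt_type_code
  FROM (SELECT pb.plcmnt_id,
               pb.bus_id,
               pb.plcmnt_bus_strt_date,
               pb.plcmnt_type_code,
               -- rn = 1 marks the most recent remaining row of each placement
               ROW_NUMBER() OVER (PARTITION BY pb.plcmnt_id
                                  ORDER BY pb.plcmnt_bus_strt_date DESC) AS rn
          FROM plcmnt_bus pb
         WHERE pb.plcmnt_type_code NOT IN ('HO', 'OR'))
 WHERE rn = 1;
```

Placements 30 and 40 would still appear in this result; in the full query they are removed by the join to Q1.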

When we used the NOT EXISTS clause, plcmnt_id = 10 was eliminated altogether.
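The attempted NOT EXISTS query is not shown, so the following is only a guess at what happened. If the subquery was correlated on plcmnt_id alone, then a single 'HO' or 'OR' row anywhere in a placement rejects every row of that placement, which would explain why plcmnt_id = 10 disappeared. A row-level filter skips only the offending rows.

```sql
-- Assumed form of the attempt (hypothetical, not from the post):
-- any 'HO'/'OR' row in the placement removes the whole placement.
SELECT pb.plcmnt_id,
       MAX(pb.plcmnt_bus_strt_date) AS st_date
  FROM plcmnt_bus pb
 WHERE NOT EXISTS (SELECT 1
                     FROM plcmnt_bus x
                    WHERE x.plcmnt_id = pb.plcmnt_id
                      AND x.plcmnt_type_code IN ('HO', 'OR'))
 GROUP BY pb.plcmnt_id;

-- Row-level filter: only the 'HO'/'OR' rows themselves are skipped,
-- so placement 10 survives and MAX() falls back to an earlier row.
SELECT pb.plcmnt_id,
       MAX(pb.plcmnt_bus_strt_date) AS st_date
  FROM plcmnt_bus pb
 WHERE pb.plcmnt_type_code NOT IN ('HO', 'OR')
 GROUP BY pb.plcmnt_id;
```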

Any ideas would be greatly appreciated.

thanks
Question by:dreamerw7
6 Comments

Expert Comment

by:seazodiac
Try changing the NOT IN clause as follows.

Change from:

       WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')

to:

       WHERE PLCMNT_TYPE_CODE <> 'HO' AND PLCMNT_TYPE_CODE <> 'OR'
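By SQL semantics, NOT IN over a list of literals and the matching chain of <> comparisons select the same rows, so this rewrite may well leave the execution plan unchanged. A quick way to confirm the equivalence on your own data is to compare counts; the sketch below assumes only the PLCMNT_BUS table from the question.

```sql
-- Both filters should report the same count.
SELECT COUNT(*)
  FROM plcmnt_bus
 WHERE plcmnt_type_code NOT IN ('HO', 'OR');

SELECT COUNT(*)
  FROM plcmnt_bus
 WHERE plcmnt_type_code <> 'HO'
   AND plcmnt_type_code <> 'OR';

-- Rows with a NULL type code are excluded by either form,
-- because a comparison with NULL is never true.
SELECT COUNT(*)
  FROM plcmnt_bus
 WHERE plcmnt_type_code IS NULL;
```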

Accepted Solution

by:
morphman earned 500 total points
Can you rewrite the query so that IN is used instead? That is, is there a fixed number of PLCMNT_TYPE_CODEs that you could list in an IN clause to avoid the NOT IN?

NOT IN is one of the worst-performing query clauses, as in most cases it causes a full scan. Because your query uses the NOT IN within a correlated subquery, this is doubly bad, as the scan is required for each record in t2.

Paste your explain plan here, and we might be able to suggest an appropriate hint to help execution...
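One way to produce the plan being asked for is sketched below. It assumes Oracle 9i Release 2 or later (for DBMS_XPLAN) and that a PLAN_TABLE already exists in your schema. Only the t2 inline view is explained here to keep the example short; the same EXPLAIN PLAN FOR prefix can be placed in front of the full SELECT.

```sql
-- Store the optimizer's plan for the statement in PLAN_TABLE.
EXPLAIN PLAN FOR
SELECT pb.plcmnt_id,
       MAX(pb.plcmnt_bus_strt_date) AS st_date
  FROM plcmnt_bus pb
 WHERE pb.plcmnt_type_code NOT IN ('HO', 'OR')
 GROUP BY pb.plcmnt_id;

-- Format and display the most recently explained plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the plan of the 3-minute version with the plan of the 3-hour version should show which step changed.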

Author Comment

by:dreamerw7
Question: Would an IN clause with MANY values still be faster than the NOT IN? I think there are 17.

Expert Comment

by:morphman
Yes, it may well be faster, as IN-list operators work quicker than full scans...
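A sketch of the positive-list rewrite is below. Only 'RP' and 'FH' appear in the sample data, so the list is deliberately incomplete and the remaining allowed codes would have to be filled in. Whether the optimizer actually iterates over the list instead of scanning depends on an index on PLCMNT_TYPE_CODE existing, which is an assumption here, and on how selective the listed codes are.

```sql
-- Positive list in place of NOT IN ('HO', 'OR').
-- Incomplete on purpose: add the remaining allowed type codes.
SELECT pb.plcmnt_id,
       MAX(pb.plcmnt_bus_strt_date) AS st_date
  FROM plcmnt_bus pb
 WHERE pb.plcmnt_type_code IN ('RP', 'FH')
 GROUP BY pb.plcmnt_id;
```

One design trade-off: a positive list silently drops any type code added to the data later, whereas NOT IN would keep it.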

Expert Comment

by:oleggold
What can surely help you is replacing all the IN operators with EXISTS, which works much faster (binary way). Replace all of the multiple IN operators with EXISTS, or at least try multiple NOT EXISTS in place of NOT IN if that is less work for you.
Really hope it works,
OLEG

Expert Comment

by:konektor
The difference between NOT IN and NOT EXISTS is that the subquery in a NOT IN clause is executed for each row in the master select and compared to the left side of the condition. With a NOT EXISTS clause you can limit the amount of data that is processed:

select * from master_table where not exists (select 1 from detail_table where detail_table.some_column = master_table.some_column)
is much better than
select * from master_table where some_column not in (select some_column from detail_table)

There should be an index on some_column in detail_table ...
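A sketch of the index mentioned above, plus one caveat worth checking before swapping the two forms: they are not equivalent when NULLs are involved. The index name is hypothetical; master_table and detail_table are the placeholder names from the example.

```sql
-- Hypothetical index name; supports the correlated lookup in NOT EXISTS.
CREATE INDEX detail_table_some_column_ix
    ON detail_table (some_column);

-- If detail_table.some_column contains even one NULL, the plain NOT IN
-- form returns no rows at all, while NOT EXISTS ignores the NULL.
-- Guarding the subquery makes NOT IN match NOT EXISTS for every master
-- row whose own some_column is not NULL.
SELECT *
  FROM master_table m
 WHERE m.some_column NOT IN (SELECT d.some_column
                               FROM detail_table d
                              WHERE d.some_column IS NOT NULL);
```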