Solved

NOT IN Query

Posted on 2004-10-02
Medium Priority
1,039 Views
Last Modified: 2008-01-09
Hi,

I have the following query:

CREATE TABLE NEW_TABLE
AS
SELECT Q1.*,
       round(MONTHS_BETWEEN(SYSDATE, prsn_birth_date)/12) AS CURR_AGE,
       Q4.BUS_ID,
       Q4.PLCMNT_BUS_STRT_DATE,
       Q4.PLCMNT_TYPE_CODE
  FROM (SELECT PLCMNT_ID,
               PRSN_ID
          FROM (SELECT PLCMNT_ID,
                       PRSN_ID,
                       PLCMNT_BUS_STRT_DATE,
                       BUS_ID,
                       LAG(BUS_ID, 1, 0) OVER(PARTITION BY PLCMNT_ID ORDER BY PLCMNT_BUS_STRT_DATE DESC) AS PREVIOUS_BUS
                  FROM PERM_EVENTS T
                 ORDER BY T.PLCMNT_ID,
                          T.PLCMNT_BUS_STRT_DATE,
                          T.BUS_ID) XL
          WHERE XL.PREVIOUS_BUS <> BUS_ID
          GROUP BY PLCMNT_ID, PRSN_ID
          HAVING COUNT(PLCMNT_BUS_STRT_DATE) > 2) Q1,
       (SELECT tt.*
          FROM PLCMNT_BUS tt,
               (SELECT PB.PLCMNT_ID,
                       MAX(PB.PLCMNT_BUS_STRT_DATE) st_date
                  FROM PLCMNT_BUS PB
                 WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')
                 GROUP BY PB.PLCMNT_ID) t2
         WHERE tt.plcmnt_id = t2.plcmnt_id
           AND tt.plcmnt_bus_strt_date = t2.st_date) Q4,
       PERSON P
 WHERE Q1.PLCMNT_ID = Q4.PLCMNT_ID
   AND Q1.PRSN_ID = P.PRSN_ID

Essentially, we are trying to find episodes (by plcmnt_id from the perm_events table) that have 3 or more records and that do not have the same bus_id twice in a row, along with the current age of the person. We also want the max start_date for the bus_id of the previous record if the last record has a plcmnt_type_code of 'HO' or 'OR'. (We figured that making a table that pulled the correct bus_id, plcmnt_id and start_date for plcmnt_ids that met the criteria would suffice. This is the 2nd part of the process - we create the perm_events table first with the events we want to see.) The "NOT IN" clause has caused the make-table query to go from 3 minutes to 3 hours. We tried using a 'NOT EXISTS' clause, but it eliminated the whole group of records. See examples:

TABLES:

PERM_EVENTS
PLCMNT_ID  BUS_ID  PLCMNT_STRT_DATE  END_DATE  PLCMNT_TYPE_CODE
10         25      1/1/04            1/15/04   RP
10         24      1/16/04           1/30/04   FH
10         90      2/1/04                      HO
20         25      1/1/04            1/15/04   FH
20         15      1/16/04           1/30/04   RP
20         27      2/1/04                      FH
30         20      2/1/04            2/2/04    FH
30         20      2/2/04            2/3/04    RP
30         25      2/3/04                      FH
40         20      2/1/04            2/3/04    FH
40         25      2/3/04                      FH
   
We would want the following results:

For plcmnt_id = 10 we would want bus_id 24 with a start date of 1/16/04 and a type of RP (since the last one has a type of 'HO' - which is one of the ones to keep out if it is the last in the series, we take the previous type, bus_id and start date)

For plcmnt_id = 20 we would want bus_id = 27 with a start date of 2/1/04 and a type of FH (keep as the type is not in the elimination list)

We don't want plcmnt_id = 30 because it has 2 bus_id's that are the same (in a row) and we don't want plcmnt_id = 40 because it only has 2 records

If we used the NOT EXISTS clause plcmnt_id = 10 was eliminated altogether.
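
One possible explanation for placement 10 disappearing (an assumption, since the NOT EXISTS version of the query was not posted): a NOT EXISTS correlated only on plcmnt_id asks "does this placement have any HO/OR row at all?", so it discards every row of that placement, whereas the NOT IN filter is applied row by row. The sketch below contrasts the two on the sample rows for placements 10 and 20, inlined from the PERM_EVENTS listing above. The short column names (strt_date, type_code) and the use of ROW_NUMBER in place of the MAX self-join are illustrative choices, not part of the original query.

```sql
-- Sample rows for placements 10 and 20, inlined so the sketch is self-contained.
WITH ev AS (
  SELECT 10 AS plcmnt_id, 25 AS bus_id, DATE '2004-01-01' AS strt_date, 'RP' AS type_code FROM dual UNION ALL
  SELECT 10, 24, DATE '2004-01-16', 'FH' FROM dual UNION ALL
  SELECT 10, 90, DATE '2004-02-01', 'HO' FROM dual UNION ALL
  SELECT 20, 25, DATE '2004-01-01', 'FH' FROM dual UNION ALL
  SELECT 20, 15, DATE '2004-01-16', 'RP' FROM dual UNION ALL
  SELECT 20, 27, DATE '2004-02-01', 'FH' FROM dual
),
-- Row-level filter: drop only the HO/OR rows, then rank what is left per placement.
row_level AS (
  SELECT e.plcmnt_id, e.bus_id,
         ROW_NUMBER() OVER (PARTITION BY e.plcmnt_id ORDER BY e.strt_date DESC) AS rn
    FROM ev e
   WHERE e.type_code NOT IN ('HO', 'OR')
),
-- Group-level filter: drop every row of a placement that has any HO/OR row.
group_level AS (
  SELECT e.plcmnt_id, e.bus_id,
         ROW_NUMBER() OVER (PARTITION BY e.plcmnt_id ORDER BY e.strt_date DESC) AS rn
    FROM ev e
   WHERE NOT EXISTS (SELECT 1
                       FROM ev x
                      WHERE x.plcmnt_id = e.plcmnt_id
                        AND x.type_code IN ('HO', 'OR'))
)
SELECT 'row-level NOT IN' AS approach, plcmnt_id, bus_id FROM row_level WHERE rn = 1
UNION ALL
SELECT 'group-level NOT EXISTS', plcmnt_id, bus_id FROM group_level WHERE rn = 1;
```

On these rows the row-level branch keeps bus_id 24 for placement 10 and bus_id 27 for placement 20, while the group-level branch returns only placement 20. Ranking with ROW_NUMBER also reads the data once instead of joining PLCMNT_BUS back to itself; whether that helps on the real tables would need to be checked against the actual plan.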

Any ideas would be greatly appreciated.

thanks
Question by:dreamerw7
6 Comments
 
LVL 23

Expert Comment

by:seazodiac
ID: 12211471
Try changing the NOT IN clause.

Change from:

       WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')

to:

       WHERE PLCMNT_TYPE_CODE <> 'HO' AND PLCMNT_TYPE_CODE <> 'OR'
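
A quick way to check that the two forms select the same rows, including how they treat a NULL code, is to count what each one keeps over a small inlined set. The `codes` set below is made up for illustration. Because the two predicates are logically equivalent, it would not be surprising if the rewrite alone left the execution plan unchanged; that is an expectation to verify with an explain plan, not a guarantee.

```sql
-- Hypothetical set of codes, including a NULL, to compare the two predicates.
WITH codes AS (
  SELECT 'HO' AS plcmnt_type_code FROM dual UNION ALL
  SELECT 'OR' FROM dual UNION ALL
  SELECT 'FH' FROM dual UNION ALL
  SELECT 'RP' FROM dual UNION ALL
  SELECT CAST(NULL AS VARCHAR2(2)) FROM dual
)
SELECT COUNT(CASE WHEN plcmnt_type_code NOT IN ('HO', 'OR') THEN 1 END) AS kept_by_not_in,
       COUNT(CASE WHEN plcmnt_type_code <> 'HO'
                   AND plcmnt_type_code <> 'OR' THEN 1 END)            AS kept_by_inequalities
  FROM codes;
```

Both counts come out the same (FH and RP are kept; HO, OR and the NULL row are not), because a comparison against NULL is never true under either form.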
 
LVL 6

Accepted Solution

by:
morphman earned 2000 total points
ID: 12211934
Can you rewrite the query so that IN is used instead (i.e., is there a fixed number of PLCMNT_TYPE_CODEs that you could list in an IN clause to avoid the NOT IN)?

NOT IN is one of the worst-performing query clauses, as in most cases it causes a full scan. As your query is using the NOT IN within a correlated subquery, this is doubly bad, as the scan is required for each record in t2.

Paste your explain plan here, and we might be able to place an appropriate hint to help execution...
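
For reference, one common way to produce an explain plan in Oracle is EXPLAIN PLAN followed by DBMS_XPLAN.DISPLAY. The sketch below runs it on the t2 subquery from the question only; it assumes a PLAN_TABLE already exists in the schema and that the DBMS_XPLAN package is available in the database version in use.

```sql
-- Store the plan for the aggregation subquery from the question.
EXPLAIN PLAN FOR
SELECT PB.PLCMNT_ID,
       MAX(PB.PLCMNT_BUS_STRT_DATE) AS st_date
  FROM PLCMNT_BUS PB
 WHERE PB.PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')
 GROUP BY PB.PLCMNT_ID;

-- Show the most recently explained statement in readable form.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Running the same two statements on the full CREATE TABLE ... AS SELECT query, with and without the NOT IN filter, would show where the two plans diverge.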
 

Author Comment

by:dreamerw7
ID: 12214577
Question: Would an "IN" clause with MANY values still be faster than the NOT IN? I think there are 17.

 
LVL 6

Expert Comment

by:morphman
ID: 12215098
Yes, it may well be faster, as IN-list operators work more quickly than full scans...
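
As a sketch of what the positive-list version of the t2 subquery might look like: only 'FH' and 'RP' appear in the sample data from the question, so those are the only codes listed here, and the remaining allowed codes are deliberately left as a placeholder comment rather than guessed.

```sql
-- Positive IN list instead of NOT IN ('HO', 'OR').
-- Only FH and RP are known from the sample data; the other allowed codes must be added.
SELECT PB.PLCMNT_ID,
       MAX(PB.PLCMNT_BUS_STRT_DATE) AS st_date
  FROM PLCMNT_BUS PB
 WHERE PB.PLCMNT_TYPE_CODE IN ('FH', 'RP' /* , remaining allowed codes go here */)
 GROUP BY PB.PLCMNT_ID;
```

Two caveats apply to this form. The list has to be maintained whenever a new placement type is introduced. And if the listed codes cover most of the rows in PLCMNT_BUS, the optimizer may still choose a full scan, so the gain should be confirmed against the explain plan.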
 
LVL 21

Expert Comment

by:oleggold
ID: 12215509
What can surely help is replacing all "IN" operators with "EXISTS", which works much faster (binary way). Replace all the multiple "IN" operators with "EXISTS", or at least try using multiple "NOT EXISTS" instead of "NOT IN" if that is less work for you.
Really hope it works,
OLEG
 
LVL 9

Expert Comment

by:konektor
ID: 12216217
The difference between "not in" and "not exists" is that the subquery in a "not in" clause is executed for each row in the master select and compared to the left side of the condition. You can limit the amount of data processed in a "not exists" clause:

select * from master_table where not exists (select 1 from detail_table where detail_table.some_column = master_table.some_column)
is much better than
select * from master_table where some_column not in (select some_column from detail_table)

There should be an index on some_column in detail_table ...
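
Beyond performance, the two forms are not interchangeable when the subquery column can be NULL: NOT IN returns no rows at all if the subquery yields a NULL, while NOT EXISTS is unaffected. The sketch below shows this with made-up contents for master_table and detail_table, inlined as subquery factoring clauses so it can be run on its own.

```sql
-- Hypothetical data: detail_table contains a NULL in some_column.
WITH master_table AS (
  SELECT 1 AS some_column FROM dual UNION ALL
  SELECT 2 FROM dual UNION ALL
  SELECT 3 FROM dual
),
detail_table AS (
  SELECT 1 AS some_column FROM dual UNION ALL
  SELECT CAST(NULL AS NUMBER) FROM dual
)
SELECT 'NOT EXISTS' AS test_form, COUNT(*) AS rows_returned
  FROM master_table m
 WHERE NOT EXISTS (SELECT 1
                     FROM detail_table d
                    WHERE d.some_column = m.some_column)
UNION ALL
SELECT 'NOT IN', COUNT(*)
  FROM master_table m
 WHERE m.some_column NOT IN (SELECT d.some_column FROM detail_table d);
```

Here NOT EXISTS returns 2 rows (values 2 and 3) and NOT IN returns 0. On real tables, the index mentioned above could be created with something like `CREATE INDEX detail_table_some_column_ix ON detail_table (some_column);`, where the index name is only an example.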