Solved

NOT IN Query

Posted on 2004-10-02
Last Modified: 2008-01-09
Hi,

I have the following query:

CREATE TABLE NEW_TABLE
AS
SELECT Q1.*,
       round(MONTHS_BETWEEN(SYSDATE, prsn_birth_date)/12) AS CURR_AGE,
       Q4.BUS_ID,
       Q4.PLCMNT_BUS_STRT_DATE,
       Q4.PLCMNT_TYPE_CODE
  FROM (SELECT PLCMNT_ID,
               PRSN_ID
          FROM (SELECT PLCMNT_ID,
                       PRSN_ID,
                       PLCMNT_BUS_STRT_DATE,
                       BUS_ID,
                       LAG(BUS_ID, 1, 0) OVER(PARTITION BY PLCMNT_ID ORDER BY PLCMNT_BUS_STRT_DATE DESC) AS PREVIOUS_BUS
                  FROM PERM_EVENTS T
                 ORDER BY T.PLCMNT_ID,
                          T.PLCMNT_BUS_STRT_DATE,
                          T.BUS_ID) XL
          WHERE XL.PREVIOUS_BUS <> BUS_ID
          GROUP BY PLCMNT_ID, PRSN_ID
          HAVING COUNT(PLCMNT_BUS_STRT_DATE) > 2) Q1,
       (SELECT tt.*
          FROM PLCMNT_BUS tt,
               (SELECT PB.PLCMNT_ID,
                       MAX(PB.PLCMNT_BUS_STRT_DATE) st_date
                  FROM PLCMNT_BUS PB
                 WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')
                 GROUP BY PB.PLCMNT_ID) t2
         WHERE tt.plcmnt_id = t2.plcmnt_id
           AND tt.plcmnt_bus_strt_date = t2.st_date) q4,
       PERSON P
 WHERE q1.plcmnt_id = q4.plcmnt_id
   AND Q1.PRSN_ID = P.PRSN_ID

Essentially, we are trying to find episodes (by plcmnt_id from the perm_events table) that have 3 or more records and that do not have the same bus_id twice in a row, along with the current age of the person. We also want the max start date; if the last record has a plcmnt_type_code of 'HO' or 'OR', we want the bus_id and start date of the previous record instead. We figured that making a table that pulled the correct bus_id, plcmnt_id and start_date for the plcmnt_ids that met the criteria would suffice. (This is the 2nd part of the process: we create the perm_events table first with the events we want to see.) The "NOT IN" clause has caused the make-table query to go from 3 minutes to 3 hours. We tried using a 'NOT EXISTS' clause, but it eliminated the whole group of records. See examples:

TABLES:

PERM_EVENTS
PLCMNT_ID  BUS_ID  PLCMNT_STRT_DATE  END_DATE  PLCMNT_TYPE_CODE
10         25      1/1/04            1/15/04   RP
10         24      1/16/04           1/30/04   FH
10         90      2/1/04                      HO
20         25      1/1/04            1/15/04   FH
20         15      1/16/04           1/30/04   RP
20         27      2/1/04                      FH
30         20      2/1/04            2/2/04    FH
30         20      2/2/04            2/3/04    RP
30         25      2/3/04                      FH
40         20      2/1/04            2/3/04    FH
40         25      2/3/04                      FH
   
We would want the following results:

For plcmnt_id = 10 we would want bus_id 24 with a start date of 1/16/04 and a type of RP. The last record has a type of 'HO', which is one of the types to leave out when it is the last in the series, so we take the previous type, bus_id and start date.

For plcmnt_id = 20 we would want bus_id = 27 with a start date of 2/1/04 and a type of FH (keep as the type is not in the elimination list)

We don't want plcmnt_id = 30 because it has 2 bus_id's that are the same (in a row) and we don't want plcmnt_id = 40 because it only has 2 records

If we used the NOT EXISTS clause plcmnt_id = 10 was eliminated altogether.
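
The difference between the two behaviours described above can be reproduced on a small scale. The sketch below is only an illustration, not the poster's actual queries: it uses SQLite through Python's sqlite3 module as a stand-in for Oracle, shortened column names (strt_date, type_code) that are assumptions, ISO-formatted dates, and only the rows for plcmnt_id 10 and 20. The NOT EXISTS form shown is one plausible way such a rewrite could end up filtering whole placements instead of single rows; the NOT EXISTS query that was actually tried is not shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plcmnt_bus ("
    "plcmnt_id INTEGER, bus_id INTEGER, strt_date TEXT, type_code TEXT)"
)
# Sample rows for plcmnt_id 10 and 20, with dates rewritten as ISO strings
# so that MAX() orders them chronologically.
conn.executemany(
    "INSERT INTO plcmnt_bus VALUES (?, ?, ?, ?)",
    [
        (10, 25, "2004-01-01", "RP"),
        (10, 24, "2004-01-16", "FH"),
        (10, 90, "2004-02-01", "HO"),
        (20, 25, "2004-01-01", "FH"),
        (20, 15, "2004-01-16", "RP"),
        (20, 27, "2004-02-01", "FH"),
    ],
)

# Row-level filter: drop only the 'HO'/'OR' rows, then take the latest
# remaining row per placement.
ROW_FILTER = """
SELECT tt.plcmnt_id, tt.bus_id, tt.strt_date
  FROM plcmnt_bus tt
  JOIN (SELECT pb.plcmnt_id, MAX(pb.strt_date) AS st_date
          FROM plcmnt_bus pb
         WHERE pb.type_code NOT IN ('HO', 'OR')
         GROUP BY pb.plcmnt_id) t2
    ON tt.plcmnt_id = t2.plcmnt_id
   AND tt.strt_date = t2.st_date
 ORDER BY tt.plcmnt_id
"""

# Placement-level filter: a NOT EXISTS correlated only on plcmnt_id removes
# every row of any placement that contains an 'HO'/'OR' row.
GROUP_FILTER = """
SELECT tt.plcmnt_id, tt.bus_id, tt.strt_date
  FROM plcmnt_bus tt
  JOIN (SELECT pb.plcmnt_id, MAX(pb.strt_date) AS st_date
          FROM plcmnt_bus pb
         WHERE NOT EXISTS (SELECT 1
                             FROM plcmnt_bus x
                            WHERE x.plcmnt_id = pb.plcmnt_id
                              AND x.type_code IN ('HO', 'OR'))
         GROUP BY pb.plcmnt_id) t2
    ON tt.plcmnt_id = t2.plcmnt_id
   AND tt.strt_date = t2.st_date
 ORDER BY tt.plcmnt_id
"""

row_level = conn.execute(ROW_FILTER).fetchall()
group_level = conn.execute(GROUP_FILTER).fetchall()
print(row_level)    # placement 10 falls back to bus_id 24
print(group_level)  # placement 10 is gone altogether
```

With the row-level filter, plcmnt_id 10 falls back to its previous record; with the placement-level NOT EXISTS, it disappears from the result.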

Any ideas would be greatly appreciated.

thanks
Question by:dreamerw7
6 Comments
 
LVL 23

Expert Comment

by:seazodiac
ID: 12211471
Try changing the NOT IN clause from

       WHERE PLCMNT_TYPE_CODE NOT IN ('HO', 'OR')

to

       WHERE PLCMNT_TYPE_CODE <> 'HO' AND PLCMNT_TYPE_CODE <> 'OR'
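
For reference, the two predicates select the same rows, including in how they treat NULLs. The sketch below checks this with SQLite through Python's sqlite3 module as a stand-in for Oracle; the table t and its rows are made up for the illustration. Because the forms are logically equivalent, whether the rewrite changes the execution plan is something only the explain plan can confirm.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type_code TEXT)")
# Hypothetical rows, including one with a NULL type code.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "HO"), (2, "OR"), (3, "FH"), (4, "RP"), (5, None)],
)

not_in = conn.execute(
    "SELECT id FROM t WHERE type_code NOT IN ('HO', 'OR') ORDER BY id"
).fetchall()

not_equal = conn.execute(
    "SELECT id FROM t "
    "WHERE type_code <> 'HO' AND type_code <> 'OR' ORDER BY id"
).fetchall()

# Both forms keep ids 3 and 4; the NULL row (id 5) is dropped by both,
# because a comparison with NULL is never true.
print(not_in, not_equal)
```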
 
LVL 6

Accepted Solution

by:
morphman earned 500 total points
ID: 12211934
Can you rewrite the query so that IN is used? That is, is there a fixed number of PLCMNT_TYPE_CODEs that you could put in an IN list to avoid the NOT IN?

NOT IN is one of the worst-performing query clauses, because in most cases it causes a full scan. As your query uses the NOT IN within a correlated subquery, this is doubly bad, since the scan is required for each record in t2.

Paste your explain plan here, and we might be able to place an appropriate hint to help execution...
 

Author Comment

by:dreamerw7
ID: 12214577
Question: Would an "IN" clause with MANY values still be faster than the NOT IN? I think there are 17.
 
LVL 6

Expert Comment

by:morphman
ID: 12215098
Yes, it may well be faster, as in-list operators work quicker than full scans...
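
One thing to keep in mind with this rewrite: an IN list of the remaining codes matches the NOT IN only while the list of codes is complete. The sketch below illustrates that with SQLite through Python's sqlite3 module as a stand-in for Oracle; the table t, the four codes used, and the later code 'GH' are all assumptions made up for the example (the real list of 17 codes is not given in this thread).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type_code TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "HO"), (2, "OR"), (3, "FH"), (4, "RP")],
)

NOT_IN_SQL = "SELECT id FROM t WHERE type_code NOT IN ('HO', 'OR') ORDER BY id"
# Complement list: every code that is currently allowed (hypothetical here).
IN_SQL = "SELECT id FROM t WHERE type_code IN ('FH', 'RP') ORDER BY id"

before_not_in = conn.execute(NOT_IN_SQL).fetchall()
before_in = conn.execute(IN_SQL).fetchall()

# A new, previously unknown code appears in the data later on.
conn.execute("INSERT INTO t VALUES (5, 'GH')")

after_not_in = conn.execute(NOT_IN_SQL).fetchall()
after_in = conn.execute(IN_SQL).fetchall()

print(before_not_in, before_in)  # identical while the code list is complete
print(after_not_in, after_in)    # the IN form silently misses the new code
```

So the IN form needs to be maintained whenever a new type code is introduced.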
 
LVL 21

Expert Comment

by:oleggold
ID: 12215509
What can surely help you is replacing the "IN" operators with "EXISTS", which works much faster (binary way). Change all the multiple "IN" operators to "EXISTS", or at least try to use multiple "NOT EXISTS" instead of "NOT IN" if that is less work for you.
Really hope it works,
OLEG
 
LVL 9

Expert Comment

by:konektor
ID: 12216217
The difference between "not in" and "not exists" is that the subquery in a "not in" clause is executed for each row in the master select and compared to the left side of the condition. You can limit the amount of data processed in a "not exists" clause:

select * from master_table where not exists (select 1 from detail_table where detail_table.some_column = master_table.some_column)
is much better than
select * from master_table where some_column not in (select some_column from detail_table)

There should be an index on some_column in detail_table.
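
Apart from performance, the two forms are not always interchangeable: if the subquery can return a NULL, NOT IN returns no rows at all, while NOT EXISTS still returns the unmatched rows. The sketch below shows this with SQLite through Python's sqlite3 module as a stand-in for Oracle; the table and column names come from the queries above, and the data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master_table (some_column INTEGER)")
conn.execute("CREATE TABLE detail_table (some_column INTEGER)")
conn.executemany("INSERT INTO master_table VALUES (?)", [(1,), (2,), (3,)])
# The detail table holds one matching value and one NULL.
conn.executemany("INSERT INTO detail_table VALUES (?)", [(1,), (None,)])

not_exists_rows = conn.execute(
    "SELECT some_column FROM master_table m "
    "WHERE NOT EXISTS (SELECT 1 FROM detail_table d "
    "                   WHERE d.some_column = m.some_column) "
    "ORDER BY some_column"
).fetchall()

not_in_rows = conn.execute(
    "SELECT some_column FROM master_table "
    "WHERE some_column NOT IN (SELECT some_column FROM detail_table) "
    "ORDER BY some_column"
).fetchall()

print(not_exists_rows)  # rows 2 and 3 have no match in detail_table
print(not_in_rows)      # empty: the NULL makes every NOT IN test unknown
```

When swapping one form for the other, it is worth checking whether the compared column is nullable.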