Link to home
Start Free TrialLog in
Avatar of BCRobert
BCRobert

asked on

Convert NOT IN Sub-Select to LEFT JOIN

When I run this query:

select distinct idQuestion from Question
where parentid is null AND idQuestion not in (select distinct parentid from Question where parentid is not null)
order by idQuestion

Open in new window


The query never finishes after waiting several minutes. I've researched the issue and the conclusion is to use a LEFT JOIN instead of a sub-select. However, I'm not sure how to properly construct the query with these limitations.
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

please try this:

select q.idQuestion 
from Question q
where q.parentid is null 
AND NOT EXISTS ( select NULL from Question o where o.parentid = q.idQuestion )
order by idQuestion 

Open in new window


and please ensure you have an index on parentid
Hi,

I mean you have little nonsense in your query, becouse you compare idQuestion with parentid

The correct query should only be:
select distinct idQuestion
from Question
where parentid is not null
order by idQuestion

Maybe, you need get something else. So try to explain what shloud be result of this query.
Avatar of BCRobert
BCRobert

ASKER

@angellll: parentid isn't indexed because it's not unique. This query is taking just as long.

@eridanix: That query won't work. Here's what I'm trying to find:

Find the question IDs that are not a parent ID of another question and the parentid is null.

idQuestion    |    data     |    parentid
123               |    abc      |       NULL
321               |    def       |        123
444               |    qqq       |    NULL

Open in new window


The query should return just '444' as its parentid is null AND is not a parentid of another question.
select distinct idQuestion
from Question
where parentid is null and idQuestion NOT IN (select idQuestion from Question where parentid is not null)
order by idQuestion
ASKER CERTIFIED SOLUTION
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
edit: you might not know the a primary key is a UNIQUE index, under the hood, but you can created non-unique indexes.
actually, a index is non-unique by default, unless you specify it to add the restriction that each value can only be present once.
@eridanix: That will not return the results I need. I need to make sure the the idQuestion IS NOT the parentid of another question.
@angellll: The parentid is already an index, apparently (I didn't create the database/tables):

parentIdIdx      BTREE      No      No      parentId      41002      A      YES      
select idQuestion
from dbo.Questions
where idQuestion IN (select idQuestion from Question where parentid is null) AND idQuestion NOT IN  (select parentid from Question where parentid is not null)
@angellll: It seems that re-creating the index helped tremendously. Not sure why, but the query finished after about 45 seconds, which is still slow, but it's not a query we'll be running often.
Creating/recreating the index helped the initial query finished. Converting it to LEFT JOIN/NOT IN isn't needed.
please try to remove the 2 DISTINCT in the query.

IN ( SELECT DISTINCT  ... ) will be the same, but slower normally, as
IN ( SELECT ... )

the NOT EXISTS () version I posted should be fastest ...