• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 1776
  • Last Modified:

inner join vs. select where in

Just curious:

Is there a difference in the way Access handles inner joins and nested queries?
If so, does this difference influence performance?

Example:

Inner Join:

SELECT tbl_Product.Product_ID FROM tlu_ProductType INNER JOIN tbl_Product ON tlu_ProductType.ProductType_ID = tbl_Product.ProductTypeID WHERE tlu_ProductType.ProductType="Game";

Nested query:

SELECT tbl_Product.Product_ID FROM  tbl_Product WHERE tbl_Product.Product_ID IN (SELECT tlu_ProductType.ProductType_ID FROM tlu_ProductType WHERE tlu_ProductType.ProductType="Game")
0
pietjepuk
Asked:
pietjepuk
  • 4
  • 3
  • 2
  • +3
1 Solution
 
simonbennettCommented:
My petty conjecture...

A nested query explicits requires 2 SQL processes on the DB. I would be suprised if the Join syntax simply masked this - more likely it will only need to complete 1 pass.

Therefore I vote Join is faster than Nested.

Simon
0
 
paaskyCommented:
Hello pietjepuk,

I usually use EXISTS instead of JOIN / IN (reason might be my Oracle background):

SELECT P.Product_ID
FROM   tbl_Product AS P
WHERE EXISTS
  (SELECT 'x'
   FROM tbl_ProductType AS T
   WHERE T.ProductType = "Game"
   AND   T.ProductTypeID = P.ProductTypeID);

This is very efficient method of getting data when the table in subquery is large and it's indexed properly.

Regards,
Paasky
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
INNER JOIN will generally be faster, for individual queries you might determine the Execution Plan to check if it is true (only if the query runs slow)
IN will do the subselect once (if there is no correlation) and for every row (when the subquery is correlated), which may result in faster execution if subquery returns very few rows
EXISTS is similar to an correlated IN, but a little bit faster, as the subquery stops whenever a row is found)
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
NOTE:
if you want to use NOT (either for exists, in ..) then you may choose the following option:

Original QUERY:
SELECT * FROM CLIENTS
WHERE ClientID NOT IN ( SELECT ClientID FROM ORDERS )
New Query Batch (use a stored procedure):
CREATE TABLE #tmp ( CLIENTID INT )
INSERT #tmp SELECT CLIENTID FROM ORDERS
DELETE #tmp
FROM #tmp JOIN ORDERS ON ORDERS.ClientID = #tmp.ClientID
SELECT CLIENTS.*
FROM CLIENTS JOIN #tmp ON CLIENTS.ClientID = #tmp.ClientID
DROP TABLE #tmp

These are more statements, and you may still add some slight improvements, but basically this is the best option WHEN NEEDING TO DO  "NOT IN"
0
 
BelieverCommented:
In my experience, what you refer to as a nested query is tons slower than the join.  It wouldn't surprise me, though, if my past experience was due to lack of proper indexing, as paasky suggests.  I, too, vote for the inner join.

(Hey everybody, help me out here... I'd call pietjepuk's 2nd example a "subselect."  To me a nested query is a (saved) query within a query.)
0
 
paaskyCommented:
One more addition to my comment.

My suggestion for using EXISTS is for cases where there's more than one ProductTypeName = "Game" record in tbl_ProductType and you want to select records from tbl_Product which belong in any "Game" type product group.

For eg. I might have these records in tbl_product type:

ProductTypeID   ProductTypeName  ProductTypeClass
1               Game             Adventure
2               Game             RGB
3               Game             Simulation
....

If there's only one "Game" record in product type table, then - of course - I would use JOIN because it's faster in that case and certainly not IN or EXISTS.

For eg.

SELECT P.Product_ID
FROM   tbl_Product AS P,
       tbl_ProductType AS T
WHERE  T.ProductType = "Game"
AND    T.ProductTypeID = P.ProductTypeID;

Paasky
0
 
pietjepukAuthor Commented:
Thanks for the quick responses.

The subquery returns 1 record from tlu_ProductType. tlu_ProductType is a small table and tbl_Product is a (relatively) large table. Small and large refers to both the number of fields and the amount of data.

angelIII and paasky, if I understand you both correctly your comments claim the opposite. According to angelIII's first comment SELECT IN might perform better than INNER JOIN if the subquery returns few rows, and EXISTS performs better than a correlated SELECT IN. This means that in this specific situation EXISTS might prove to be my best option? Paasky advises to use EXISTS only if the subquery returns many (more than one) rows. A bit confusing...

Believer, I think a nested query is a query that contains subqueries.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
I give the reason why a SELECT IN performs better when the subquery returns fewer rows and is not correlated:
It executes only once and the rows from the main select are compared in a very fast way agains this values.
The JOIN however will produce the full join (as the name says), which will add (maybe) temporary data to all the rows. This data has to be handled, which cost more time that comparing values.
EXISTS should only be used when checking agains correlated subqueries.

BTW: SELECT IN and JOIN may produce a different result! SELECT IN will produce 1 result row per row in the main table, whereas JOIN produces the rows depending on the JOIN clause.
0
 
BrianWrenCommented:
Paasky,

using

 . . . WHERE EXISTS
  (SELECT 'x' . . .

Do the records in the parens show up in the datasheet, and can this query/or those fields be edited and saved?

Brian
   
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
BrianWren: Yes
0
 
paaskyCommented:
angelIII, I agree with you.

small resultset -> IN has better performance
big resultset -> EXISTS has better performance

I can tell one example case about an SQL script I made for my customer which had many SELECT - IN clauses. It worked very fast during first year, but after that the performance was decreasing dramatically because the table in subquery had more and more records. In the end of last year the execution was taking hours and I noticed that by mistake when I was checking database Y2K compatibily. I changed all SELECT - IN to SELECT - EXISTS and the it took just couple of seconds to run whole script.

Regards,
paasky
0
 
pietjepukAuthor Commented:
OK, thank you all.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Introducing Cloud Class® training courses

Tech changes fast. You can learn faster. That’s why we’re bringing professional training courses to Experts Exchange. With a subscription, you can access all the Cloud Class® courses to expand your education, prep for certifications, and get top-notch instructions.

  • 4
  • 3
  • 2
  • +3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now