Solved

inner join vs. select where in

Posted on 2000-03-17
12
1,744 Views
Last Modified: 2008-03-10
Just curious:

Is there a difference in the way Access handles inner joins and nested queries?
If so, does this difference influence performance?

Example:

Inner Join:

SELECT tbl_Product.Product_ID FROM tlu_ProductType INNER JOIN tbl_Product ON tlu_ProductType.ProductType_ID = tbl_Product.ProductTypeID WHERE tlu_ProductType.ProductType="Game";

Nested query:

SELECT tbl_Product.Product_ID FROM  tbl_Product WHERE tbl_Product.Product_ID IN (SELECT tlu_ProductType.ProductType_ID FROM tlu_ProductType WHERE tlu_ProductType.ProductType="Game")
0
Comment
Question by:pietjepuk
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
  • 2
  • +3
12 Comments
 
LVL 6

Expert Comment

by:simonbennett
ID: 2627255
My petty conjecture...

A nested query explicits requires 2 SQL processes on the DB. I would be suprised if the Join syntax simply masked this - more likely it will only need to complete 1 pass.

Therefore I vote Join is faster than Nested.

Simon
0
 
LVL 10

Expert Comment

by:paasky
ID: 2627374
Hello pietjepuk,

I usually use EXISTS instead of JOIN / IN (reason might be my Oracle background):

SELECT P.Product_ID
FROM   tbl_Product AS P
WHERE EXISTS
  (SELECT 'x'
   FROM tbl_ProductType AS T
   WHERE T.ProductType = "Game"
   AND   T.ProductTypeID = P.ProductTypeID);

This is very efficient method of getting data when the table in subquery is large and it's indexed properly.

Regards,
Paasky
0
 
LVL 143

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 100 total points
ID: 2627544
INNER JOIN will generally be faster, for individual queries you might determine the Execution Plan to check if it is true (only if the query runs slow)
IN will do the subselect once (if there is no correlation) and for every row (when the subquery is correlated), which may result in faster execution if subquery returns very few rows
EXISTS is similar to an correlated IN, but a little bit faster, as the subquery stops whenever a row is found)
0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 2627557
NOTE:
if you want to use NOT (either for exists, in ..) then you may choose the following option:

Original QUERY:
SELECT * FROM CLIENTS
WHERE ClientID NOT IN ( SELECT ClientID FROM ORDERS )
New Query Batch (use a stored procedure):
CREATE TABLE #tmp ( CLIENTID INT )
INSERT #tmp SELECT CLIENTID FROM ORDERS
DELETE #tmp
FROM #tmp JOIN ORDERS ON ORDERS.ClientID = #tmp.ClientID
SELECT CLIENTS.*
FROM CLIENTS JOIN #tmp ON CLIENTS.ClientID = #tmp.ClientID
DROP TABLE #tmp

These are more statements, and you may still add some slight improvements, but basically this is the best option WHEN NEEDING TO DO  "NOT IN"
0
 
LVL 7

Expert Comment

by:Believer
ID: 2627789
In my experience, what you refer to as a nested query is tons slower than the join.  It wouldn't surprise me, though, if my past experience was due to lack of proper indexing, as paasky suggests.  I, too, vote for the inner join.

(Hey everybody, help me out here... I'd call pietjepuk's 2nd example a "subselect."  To me a nested query is a (saved) query within a query.)
0
 
LVL 10

Expert Comment

by:paasky
ID: 2627907
One more addition to my comment.

My suggestion for using EXISTS is for cases where there's more than one ProductTypeName = "Game" record in tbl_ProductType and you want to select records from tbl_Product which belong in any "Game" type product group.

For eg. I might have these records in tbl_product type:

ProductTypeID   ProductTypeName  ProductTypeClass
1               Game             Adventure
2               Game             RGB
3               Game             Simulation
....

If there's only one "Game" record in product type table, then - of course - I would use JOIN because it's faster in that case and certainly not IN or EXISTS.

For eg.

SELECT P.Product_ID
FROM   tbl_Product AS P,
       tbl_ProductType AS T
WHERE  T.ProductType = "Game"
AND    T.ProductTypeID = P.ProductTypeID;

Paasky
0
 
LVL 1

Author Comment

by:pietjepuk
ID: 2628056
Thanks for the quick responses.

The subquery returns 1 record from tlu_ProductType. tlu_ProductType is a small table and tbl_Product is a (relatively) large table. Small and large refers to both the number of fields and the amount of data.

angelIII and paasky, if I understand you both correctly your comments claim the opposite. According to angelIII's first comment SELECT IN might perform better than INNER JOIN if the subquery returns few rows, and EXISTS performs better than a correlated SELECT IN. This means that in this specific situation EXISTS might prove to be my best option? Paasky advises to use EXISTS only if the subquery returns many (more than one) rows. A bit confusing...

Believer, I think a nested query is a query that contains subqueries.
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 2628216
I give the reason why a SELECT IN performs better when the subquery returns fewer rows and is not correlated:
It executes only once and the rows from the main select are compared in a very fast way agains this values.
The JOIN however will produce the full join (as the name says), which will add (maybe) temporary data to all the rows. This data has to be handled, which cost more time that comparing values.
EXISTS should only be used when checking agains correlated subqueries.

BTW: SELECT IN and JOIN may produce a different result! SELECT IN will produce 1 result row per row in the main table, whereas JOIN produces the rows depending on the JOIN clause.
0
 
LVL 9

Expert Comment

by:BrianWren
ID: 2628496
Paasky,

using

 . . . WHERE EXISTS
  (SELECT 'x' . . .

Do the records in the parens show up in the datasheet, and can this query/or those fields be edited and saved?

Brian
   
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 2628528
BrianWren: Yes
0
 
LVL 10

Expert Comment

by:paasky
ID: 2628662
angelIII, I agree with you.

small resultset -> IN has better performance
big resultset -> EXISTS has better performance

I can tell one example case about an SQL script I made for my customer which had many SELECT - IN clauses. It worked very fast during first year, but after that the performance was decreasing dramatically because the table in subquery had more and more records. In the end of last year the execution was taking hours and I noticed that by mistake when I was checking database Y2K compatibily. I changed all SELECT - IN to SELECT - EXISTS and the it took just couple of seconds to run whole script.

Regards,
paasky
0
 
LVL 1

Author Comment

by:pietjepuk
ID: 2635231
OK, thank you all.
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Traditionally, the method to display pictures in Access forms and reports is to first download them from URLs to a folder, record the path in a table and then let the form or report pull the pictures from that folder. But why not let Windows retr…
This article shows how to get a list of available printers for display in a drop-down list, and then to use the selected printer to print an Access report or a Word document filled with Access data, using different syntax as needed for working with …
In Microsoft Access, learn how to “cascade” or have the displayed data of one combo control depend upon what’s entered in another. Base the dependent combo on a query for its row source: Add a reference to the first combo on the form as criteria i…
Using Microsoft Access, learn some simple rules for how to construct tables in a relational database. Split up all multi-value fields into single values: Split up fields that belong to other things into separate tables: Make sure that all record…

624 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question