Solved

inner join vs. select where in

Posted on 2000-03-17
12
1,738 Views
Last Modified: 2008-03-10
Just curious:

Is there a difference in the way Access handles inner joins and nested queries?
If so, does this difference influence performance?

Example:

Inner Join:

SELECT tbl_Product.Product_ID FROM tlu_ProductType INNER JOIN tbl_Product ON tlu_ProductType.ProductType_ID = tbl_Product.ProductTypeID WHERE tlu_ProductType.ProductType="Game";

Nested query:

SELECT tbl_Product.Product_ID FROM  tbl_Product WHERE tbl_Product.Product_ID IN (SELECT tlu_ProductType.ProductType_ID FROM tlu_ProductType WHERE tlu_ProductType.ProductType="Game")
0
Comment
Question by:pietjepuk
  • 4
  • 3
  • 2
  • +3
12 Comments
 
LVL 6

Expert Comment

by:simonbennett
ID: 2627255
My petty conjecture...

A nested query explicits requires 2 SQL processes on the DB. I would be suprised if the Join syntax simply masked this - more likely it will only need to complete 1 pass.

Therefore I vote Join is faster than Nested.

Simon
0
 
LVL 10

Expert Comment

by:paasky
ID: 2627374
Hello pietjepuk,

I usually use EXISTS instead of JOIN / IN (reason might be my Oracle background):

SELECT P.Product_ID
FROM   tbl_Product AS P
WHERE EXISTS
  (SELECT 'x'
   FROM tbl_ProductType AS T
   WHERE T.ProductType = "Game"
   AND   T.ProductTypeID = P.ProductTypeID);

This is very efficient method of getting data when the table in subquery is large and it's indexed properly.

Regards,
Paasky
0
 
LVL 142

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 100 total points
ID: 2627544
INNER JOIN will generally be faster, for individual queries you might determine the Execution Plan to check if it is true (only if the query runs slow)
IN will do the subselect once (if there is no correlation) and for every row (when the subquery is correlated), which may result in faster execution if subquery returns very few rows
EXISTS is similar to an correlated IN, but a little bit faster, as the subquery stops whenever a row is found)
0
Ransomware: The New Cyber Threat & How to Stop It

This infographic explains ransomware, type of malware that blocks access to your files or your systems and holds them hostage until a ransom is paid. It also examines the different types of ransomware and explains what you can do to thwart this sinister online threat.  

 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 2627557
NOTE:
if you want to use NOT (either for exists, in ..) then you may choose the following option:

Original QUERY:
SELECT * FROM CLIENTS
WHERE ClientID NOT IN ( SELECT ClientID FROM ORDERS )
New Query Batch (use a stored procedure):
CREATE TABLE #tmp ( CLIENTID INT )
INSERT #tmp SELECT CLIENTID FROM ORDERS
DELETE #tmp
FROM #tmp JOIN ORDERS ON ORDERS.ClientID = #tmp.ClientID
SELECT CLIENTS.*
FROM CLIENTS JOIN #tmp ON CLIENTS.ClientID = #tmp.ClientID
DROP TABLE #tmp

These are more statements, and you may still add some slight improvements, but basically this is the best option WHEN NEEDING TO DO  "NOT IN"
0
 
LVL 7

Expert Comment

by:Believer
ID: 2627789
In my experience, what you refer to as a nested query is tons slower than the join.  It wouldn't surprise me, though, if my past experience was due to lack of proper indexing, as paasky suggests.  I, too, vote for the inner join.

(Hey everybody, help me out here... I'd call pietjepuk's 2nd example a "subselect."  To me a nested query is a (saved) query within a query.)
0
 
LVL 10

Expert Comment

by:paasky
ID: 2627907
One more addition to my comment.

My suggestion for using EXISTS is for cases where there's more than one ProductTypeName = "Game" record in tbl_ProductType and you want to select records from tbl_Product which belong in any "Game" type product group.

For eg. I might have these records in tbl_product type:

ProductTypeID   ProductTypeName  ProductTypeClass
1               Game             Adventure
2               Game             RGB
3               Game             Simulation
....

If there's only one "Game" record in product type table, then - of course - I would use JOIN because it's faster in that case and certainly not IN or EXISTS.

For eg.

SELECT P.Product_ID
FROM   tbl_Product AS P,
       tbl_ProductType AS T
WHERE  T.ProductType = "Game"
AND    T.ProductTypeID = P.ProductTypeID;

Paasky
0
 
LVL 1

Author Comment

by:pietjepuk
ID: 2628056
Thanks for the quick responses.

The subquery returns 1 record from tlu_ProductType. tlu_ProductType is a small table and tbl_Product is a (relatively) large table. Small and large refers to both the number of fields and the amount of data.

angelIII and paasky, if I understand you both correctly your comments claim the opposite. According to angelIII's first comment SELECT IN might perform better than INNER JOIN if the subquery returns few rows, and EXISTS performs better than a correlated SELECT IN. This means that in this specific situation EXISTS might prove to be my best option? Paasky advises to use EXISTS only if the subquery returns many (more than one) rows. A bit confusing...

Believer, I think a nested query is a query that contains subqueries.
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 2628216
I give the reason why a SELECT IN performs better when the subquery returns fewer rows and is not correlated:
It executes only once and the rows from the main select are compared in a very fast way agains this values.
The JOIN however will produce the full join (as the name says), which will add (maybe) temporary data to all the rows. This data has to be handled, which cost more time that comparing values.
EXISTS should only be used when checking agains correlated subqueries.

BTW: SELECT IN and JOIN may produce a different result! SELECT IN will produce 1 result row per row in the main table, whereas JOIN produces the rows depending on the JOIN clause.
0
 
LVL 9

Expert Comment

by:BrianWren
ID: 2628496
Paasky,

using

 . . . WHERE EXISTS
  (SELECT 'x' . . .

Do the records in the parens show up in the datasheet, and can this query/or those fields be edited and saved?

Brian
   
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 2628528
BrianWren: Yes
0
 
LVL 10

Expert Comment

by:paasky
ID: 2628662
angelIII, I agree with you.

small resultset -> IN has better performance
big resultset -> EXISTS has better performance

I can tell one example case about an SQL script I made for my customer which had many SELECT - IN clauses. It worked very fast during first year, but after that the performance was decreasing dramatically because the table in subquery had more and more records. In the end of last year the execution was taking hours and I noticed that by mistake when I was checking database Y2K compatibily. I changed all SELECT - IN to SELECT - EXISTS and the it took just couple of seconds to run whole script.

Regards,
paasky
0
 
LVL 1

Author Comment

by:pietjepuk
ID: 2635231
OK, thank you all.
0

Featured Post

The Eight Noble Truths of Backup and Recovery

How can IT departments tackle the challenges of a Big Data world? This white paper provides a roadmap to success and helps companies ensure that all their data is safe and secure, no matter if it resides on-premise with physical or virtual machines or in the cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Introduction When developing Access applications, often we need to know whether an object exists.  This article presents a quick and reliable routine to determine if an object exists without that object being opened. If you wanted to inspect/ite…
Introduction The Visual Basic for Applications (VBA) language is at the heart of every application that you write. It is your key to taking Access beyond the world of wizards into a world where anything is possible. This article introduces you to…
In Microsoft Access, learn how to “cascade” or have the displayed data of one combo control depend upon what’s entered in another. Base the dependent combo on a query for its row source: Add a reference to the first combo on the form as criteria i…
With Microsoft Access, learn how to start a database in different ways and produce different start-up actions allowing you to use a single database to perform multiple tasks. Specify a start-up form through options: Specify an Autoexec macro: Us…

831 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question