Solved

understanding single sweep multi join

Posted on 2009-05-07
17
797 Views
Last Modified: 2012-05-06
Hi

The MySQL manual says the following about the singl sweep multi join method.

"MySQL resolves all joins using a single-sweep multi-join method. This means that MySQL reads a row from the first table, and then finds a matching row in the second table, the third table, and so on. When all tables are processed, MySQL outputs the selected columns and backtracks through the table list until a table is found for which there are more matching rows. The next row is read from this table and the process continues with the next table."

I would like to check I am understanding this. Take 3 tables, Table A, B and C all of which have 2 columns ID and col1. Then take the following SQL statement

Select * from table A join tableB on tableA.col1 = TableB.Col1 join tableC on tableb.col1 = tableC.col1

The tables have this data

Table A               Table B                    Table C
id   col1               id     col1                 id     col1
1   10                 1     10                      1      10
2   10                 2     10                      2      10

Does the following happen

a)mysql finds row 1 in table a and joins it to row 1 in table b AND then joins to row 1 in column C. I will describe this row as A(1), B(1), C(1)
B) this row is output
c) mysql looks in table c for anything else that will join onto the already joined rows from table A and B
D) row 2 in table c will join
e) row is output. I will describe this row as A(1), B(1), C(2)
f) no more rows in table C will join so mysql backtracks to table b
g) row 2 in table b will join row 1 in table a
h) mysql moves to table c and joins row1 - A(1), B(2), C(1) and then row 2 - A(1), B(2), C(2)
i) no more rows in table B will join so mysql backtrack to table A and moves to row 2
j) the next rows output are A(2), B(1), C(1) - A(2), B(1), C(2) - A(2), B(2), C(1)  - A(2), B(2), C(2)

thanks in advance
andrea
0
Comment
Question by:andieje
  • 10
  • 7
17 Comments
 
LVL 20

Expert Comment

by:ChristoferDutz
Comment Utility
It seems that MySQL seems to evaluate from the outside in ... so the joins are done exaclty the other way around.

I used this query just out of interest:
Select A.id, B.id, C.id from A join B on A.col1 = B.col1 join C on B.col1 = C.col1

And the result was the following:
1, 1, 1
2, 1, 1
1, 2, 1
2, 2, 1
1, 1, 2
2, 1, 2
1, 2, 2
2, 2, 2

0
 

Author Comment

by:andieje
Comment Utility
I don't understand then. How does the result you obtained match with this statement

MySQL resolves all joins using a single-sweep multi-join method. This means that MySQL reads a row from the first table, and then finds a matching row in the second table, the third table, and so on. When all tables are processed, MySQL outputs the selected columns and backtracks through the table list until a table is found for which there are more matching rows. The next row is read from this table and the process continues with the next table."

I must be interpreting this statement incorrectly.
0
 

Author Comment

by:andieje
Comment Utility
I think I understand what is happening but to me that statement from the mysql manual doesnt describe what is happening intuitively.

This is what i think happens then

a) database finds first row in table a that can join to first row in table b that can then join to first row in table c. This is 1,1,1
b)Now mysql keeps the value of 1 from table C and table b and tries to find any more values from table a that can join. This gives 2,1,1
c) No more rows in table A will join to x,1,1 so mysql keeps the value from table C and moves down to the next one in table b. It tries to find rows in table a that match giving 1,2,1 and 2,2,1.
d) no mores rows in table a match. there are also no more rows in table b so mysql moves down to the next row in table c which gives x,x,2, Mysql tries to find a row in table b that matches, Row 1 matches to gives x,1,2. Mysql mow looks through all of table a to find rows that match gives 1,1,2 and and 2,1,2.
e) now no more rows in table a match so mysql moves down to the next row in table b giving x,2,2. Mysql now looks all through table a to find any matching rows to give 1,2,2 and 2,2,2

Like you said it seems to match outside in

0
 

Author Comment

by:andieje
Comment Utility
Me again

I'm really confused now because the order or results returned in your query does not seem to tally with how joins are carried out in this article

http://hackmysql.com/case4
0
 
LVL 20

Expert Comment

by:ChristoferDutz
Comment Utility
Well you assume the "first" table is the one comming first in the sql query string. From the result above, I'd assume that MySQL processes from the outside to the inside.
0
 

Author Comment

by:andieje
Comment Utility
i don't really understand and its really important for me to understand as I have to do some query optimisation. Do you mind if I post another questions now I know a little bit more. I have found other people don't tend to answer questions if they already have a couple of comments even if they are unsolved.
many thank
0
 
LVL 20

Accepted Solution

by:
ChristoferDutz earned 500 total points
Comment Utility
Hey .. I think we were on the right track. I just let MySQL explain what it was doing:
EXPLAIN Select A.id, B.id, C.id from A join B on A.col1 = B.col1 join C on B.col1 = C.col1
The result was:
1, 'SIMPLE', 'A', 'ALL', '', '', '', '', 2, ''
1, 'SIMPLE', 'B', 'ALL', '', '', '', '', 2, 'Using where; Using join buffer'
1, 'SIMPLE', 'C', 'ALL', '', '', '', '', 2, 'Using where; Using join buffer'
So it seems MySQL was doing the following:
- taking A completely and putting it into the join buffer
- taking the join buffer and selecting from B using a where statement (our ON fragment) and putting this into the join buffer.
- taking the join buffer and selecting from C using a where statement

Seems exactly the way it is described in the documentation and it fits to the results.
0
 
LVL 20

Expert Comment

by:ChristoferDutz
Comment Utility
So just to make it clearer:

SELECT * FROM A
1
2

SELECT * FROM A LEFT JOIN B ON A.col1 = B.col1
1 1
2 1
1 2
2 2
Which is Join every entry in A with every entry in B (Think we can omit the col-thing)

SELECT A.id, B.id, C.id FROM A JOIN B ON A.col1 = B.col1 JOIN C ON B.col1 = C.col1
1 1 1
2 1 1
1 2 1
2 2 1
1 1 2
2 1 2
1 2 2
2 2 2

Which matches the result in my first post.
0
Zoho SalesIQ

Hassle-free live chat software re-imagined for business growth. 2 users, always free.

 

Author Comment

by:andieje
Comment Utility
Hi

I'm still confused. Sorry!

I get the first bit, select all of A and put it in the join buffer. Then the next bit

"taking the join buffer and selecting from B using a where statement (our ON fragment) and putting this into the join buffer."

How does this bit actually happen. It looks to me as if mysql is doing this:

a) get the first row from table B
b) go through the rows in the join buffer one by one to see if they match
c) go to the second row in table B
d) go through the rows in the join buffer one by one to see if they match.

This would give the output you listed.

Is this what happens?
0
 
LVL 20

Expert Comment

by:ChristoferDutz
Comment Utility
Well I'd think of it like this (just mindjogging)

- Take A and put it in a temporary table.
- Use this temporary table and join this with B and put the result in a new temporary table.
- Use this second temporary table and join this with C.

Now just call this temporary table "join buffer" ... ther you are.
This does respect the MySQL documentation and produce the output listed above.
0
 

Author Comment

by:andieje
Comment Utility
Hi

Thanks for your last answer. However I need to understand the 'join' bit in more detail. I understand that you take table A and put it in the join buffer or a temporary table and then join this to table B. But how is the join actually carried out. Specifically i want to know how mysql finds rows in table A that match rows in table B. As i said in my last comment, does it take the first row in table B and then look through all of table A for matches and then move on to the second row. Or it could do this the opposite way round (i.e. take the rows in table A one by one). Or it could do something completely different.

Many thanks
0
 
LVL 20

Expert Comment

by:ChristoferDutz
Comment Utility
Well if it works the way the MySQL documentation describes it, it takes a row from the join-buffer (in the first step containing only A) and checks it there are any rows in B where the "ON" part of the join matches. It then does a join with these rows and outputs them into a new joinbuffer. Then it takes the next row of the joinbuffer and does the same for this.

(Usually I you reduce the amount of values returned by the join by using "WHERE" the query optimizer should automatically propagate the filter to the join operation as well ... so not only the ON part is used for filtering, but also the part of the WHERE that dealy only with records of B)
0
 

Author Comment

by:andieje
Comment Utility
I am sorry for being a real pain here but if it does what you say above, it wouldn't return the rows in the same order as returned by the query would it?
0
 

Author Comment

by:andieje
Comment Utility
Sorry, I'll try again as that wasn't very clear

If it did what you say above wouldn't you get

SELECT * FROM A
1
2

SELECT * FROM A LEFT JOIN B ON A.col1 = B.col1
1 1
1 2
2 1
2  2

This is different than a comment you gave earlier in this thread. If it performs the join like this you wouldn't get the rows output in the same order as returned by mysql. This is why I am getting confused I think

0
 

Author Comment

by:andieje
Comment Utility
Hi
If this question is abandoned I will open another question and try again if you don't mind. I have little change of attracting new attention to an old question with 14 posts.
thanks
0
 
LVL 20

Expert Comment

by:ChristoferDutz
Comment Utility
Sorry for the long delay ... I was in Project finalisation stress and couldn't find the time to look in here. I'd suggest to reopen your question. I promise not to participate in the discussion :-)

I would be very interested though, what you need this information for anyway. You said, that you were thinking of query optimisation. What are you actually trying to optimize? Maybe this yould attract more attention, because I think there are more practical DB professionals here than theoretical DB therory guys.
0
 

Author Closing Comment

by:andieje
Comment Utility
many thanks
0

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Join & Write a Comment

Foreword In the years since this article was written, numerous hacking attacks have targeted password-protected web sites.  The storage of client passwords has become a subject of much discussion, some of it useful and some of it misguided.  Of cou…
A lot of articles have been written on splitting mysqldump and grabbing the required tables. A long while back, when Shlomi (http://code.openark.org/blog/mysql/on-restoring-a-single-table-from-mysqldump) had suggested a “sed” way, I actually shell …
Sending a Secure fax is easy with eFax Corporate (http://www.enterprise.efax.com). First, Just open a new email message.  In the To field, type your recipient's fax number @efaxsend.com. You can even send a secure international fax — just include t…
This video shows how to remove a single email address from the Outlook 2010 Auto Suggestion memory. NOTE: For Outlook 2016 and 2013 perform the exact same steps. Open a new email: Click the New email button in Outlook. Start typing the address: …

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now