Link to home
Start Free TrialLog in
Avatar of andieje
andieje

asked on

understanding single sweep multi join

Hi

The MySQL manual says the following about the singl sweep multi join method.

"MySQL resolves all joins using a single-sweep multi-join method. This means that MySQL reads a row from the first table, and then finds a matching row in the second table, the third table, and so on. When all tables are processed, MySQL outputs the selected columns and backtracks through the table list until a table is found for which there are more matching rows. The next row is read from this table and the process continues with the next table."

I would like to check I am understanding this. Take 3 tables, Table A, B and C all of which have 2 columns ID and col1. Then take the following SQL statement

Select * from table A join tableB on tableA.col1 = TableB.Col1 join tableC on tableb.col1 = tableC.col1

The tables have this data

Table A               Table B                    Table C
id   col1               id     col1                 id     col1
1   10                 1     10                      1      10
2   10                 2     10                      2      10

Does the following happen

a)mysql finds row 1 in table a and joins it to row 1 in table b AND then joins to row 1 in column C. I will describe this row as A(1), B(1), C(1)
B) this row is output
c) mysql looks in table c for anything else that will join onto the already joined rows from table A and B
D) row 2 in table c will join
e) row is output. I will describe this row as A(1), B(1), C(2)
f) no more rows in table C will join so mysql backtracks to table b
g) row 2 in table b will join row 1 in table a
h) mysql moves to table c and joins row1 - A(1), B(2), C(1) and then row 2 - A(1), B(2), C(2)
i) no more rows in table B will join so mysql backtrack to table A and moves to row 2
j) the next rows output are A(2), B(1), C(1) - A(2), B(1), C(2) - A(2), B(2), C(1)  - A(2), B(2), C(2)

thanks in advance
andrea
Avatar of ChristoferDutz
ChristoferDutz
Flag of Germany image

It seems that MySQL seems to evaluate from the outside in ... so the joins are done exaclty the other way around.

I used this query just out of interest:
Select A.id, B.id, C.id from A join B on A.col1 = B.col1 join C on B.col1 = C.col1

And the result was the following:
1, 1, 1
2, 1, 1
1, 2, 1
2, 2, 1
1, 1, 2
2, 1, 2
1, 2, 2
2, 2, 2

Avatar of andieje
andieje

ASKER

I don't understand then. How does the result you obtained match with this statement

MySQL resolves all joins using a single-sweep multi-join method. This means that MySQL reads a row from the first table, and then finds a matching row in the second table, the third table, and so on. When all tables are processed, MySQL outputs the selected columns and backtracks through the table list until a table is found for which there are more matching rows. The next row is read from this table and the process continues with the next table."

I must be interpreting this statement incorrectly.
Avatar of andieje

ASKER

I think I understand what is happening but to me that statement from the mysql manual doesnt describe what is happening intuitively.

This is what i think happens then

a) database finds first row in table a that can join to first row in table b that can then join to first row in table c. This is 1,1,1
b)Now mysql keeps the value of 1 from table C and table b and tries to find any more values from table a that can join. This gives 2,1,1
c) No more rows in table A will join to x,1,1 so mysql keeps the value from table C and moves down to the next one in table b. It tries to find rows in table a that match giving 1,2,1 and 2,2,1.
d) no mores rows in table a match. there are also no more rows in table b so mysql moves down to the next row in table c which gives x,x,2, Mysql tries to find a row in table b that matches, Row 1 matches to gives x,1,2. Mysql mow looks through all of table a to find rows that match gives 1,1,2 and and 2,1,2.
e) now no more rows in table a match so mysql moves down to the next row in table b giving x,2,2. Mysql now looks all through table a to find any matching rows to give 1,2,2 and 2,2,2

Like you said it seems to match outside in

Avatar of andieje

ASKER

Me again

I'm really confused now because the order or results returned in your query does not seem to tally with how joins are carried out in this article

http://hackmysql.com/case4
Well you assume the "first" table is the one comming first in the sql query string. From the result above, I'd assume that MySQL processes from the outside to the inside.
Avatar of andieje

ASKER

i don't really understand and its really important for me to understand as I have to do some query optimisation. Do you mind if I post another questions now I know a little bit more. I have found other people don't tend to answer questions if they already have a couple of comments even if they are unsolved.
many thank
ASKER CERTIFIED SOLUTION
Avatar of ChristoferDutz
ChristoferDutz
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
So just to make it clearer:

SELECT * FROM A
1
2

SELECT * FROM A LEFT JOIN B ON A.col1 = B.col1
1 1
2 1
1 2
2 2
Which is Join every entry in A with every entry in B (Think we can omit the col-thing)

SELECT A.id, B.id, C.id FROM A JOIN B ON A.col1 = B.col1 JOIN C ON B.col1 = C.col1
1 1 1
2 1 1
1 2 1
2 2 1
1 1 2
2 1 2
1 2 2
2 2 2

Which matches the result in my first post.
Avatar of andieje

ASKER

Hi

I'm still confused. Sorry!

I get the first bit, select all of A and put it in the join buffer. Then the next bit

"taking the join buffer and selecting from B using a where statement (our ON fragment) and putting this into the join buffer."

How does this bit actually happen. It looks to me as if mysql is doing this:

a) get the first row from table B
b) go through the rows in the join buffer one by one to see if they match
c) go to the second row in table B
d) go through the rows in the join buffer one by one to see if they match.

This would give the output you listed.

Is this what happens?
Well I'd think of it like this (just mindjogging)

- Take A and put it in a temporary table.
- Use this temporary table and join this with B and put the result in a new temporary table.
- Use this second temporary table and join this with C.

Now just call this temporary table "join buffer" ... ther you are.
This does respect the MySQL documentation and produce the output listed above.
Avatar of andieje

ASKER

Hi

Thanks for your last answer. However I need to understand the 'join' bit in more detail. I understand that you take table A and put it in the join buffer or a temporary table and then join this to table B. But how is the join actually carried out. Specifically i want to know how mysql finds rows in table A that match rows in table B. As i said in my last comment, does it take the first row in table B and then look through all of table A for matches and then move on to the second row. Or it could do this the opposite way round (i.e. take the rows in table A one by one). Or it could do something completely different.

Many thanks
Well if it works the way the MySQL documentation describes it, it takes a row from the join-buffer (in the first step containing only A) and checks it there are any rows in B where the "ON" part of the join matches. It then does a join with these rows and outputs them into a new joinbuffer. Then it takes the next row of the joinbuffer and does the same for this.

(Usually I you reduce the amount of values returned by the join by using "WHERE" the query optimizer should automatically propagate the filter to the join operation as well ... so not only the ON part is used for filtering, but also the part of the WHERE that dealy only with records of B)
Avatar of andieje

ASKER

I am sorry for being a real pain here but if it does what you say above, it wouldn't return the rows in the same order as returned by the query would it?
Avatar of andieje

ASKER

Sorry, I'll try again as that wasn't very clear

If it did what you say above wouldn't you get

SELECT * FROM A
1
2

SELECT * FROM A LEFT JOIN B ON A.col1 = B.col1
1 1
1 2
2 1
2  2

This is different than a comment you gave earlier in this thread. If it performs the join like this you wouldn't get the rows output in the same order as returned by mysql. This is why I am getting confused I think

Avatar of andieje

ASKER

Hi
If this question is abandoned I will open another question and try again if you don't mind. I have little change of attracting new attention to an old question with 14 posts.
thanks
Sorry for the long delay ... I was in Project finalisation stress and couldn't find the time to look in here. I'd suggest to reopen your question. I promise not to participate in the discussion :-)

I would be very interested though, what you need this information for anyway. You said, that you were thinking of query optimisation. What are you actually trying to optimize? Maybe this yould attract more attention, because I think there are more practical DB professionals here than theoretical DB therory guys.
Avatar of andieje

ASKER

many thanks