• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 2006
  • Last Modified:

How to Ensure Index Use in MySQL Join Query

Using MySQL 5.0.45

In the MySQL manual page on left join optimization (http://dev.mysql.com/doc/refman/5.0/en/left-join-optimization.html), it says that in the query:

SELECT *
  FROM a JOIN b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
  WHERE b.key=d.key;

MySQL has to perform a full table scan because of the read order forced by left join.

I have a similar query, except that
1) All joins are explicitly left joins
2) All join conditions are given

It looks something like this:

SELECT a.*, b.*, c.*, d.*
FROM a LEFT JOIN b
ON a.primary_key = b.primary_key
LEFT JOIN c
ON a.primary_key = c.primary_key
LEFT JOIN d
ON a.primary_key = d.primary_key
WHERE b.other_key = other_key_value;

The way I think/hope MySQL does this is to join based on what is in table a using the primary key indexes in a, b, c, and d, and then narrow that result set based on b.other_key, with no full table scans.

Am I doing what I want, or am I full table scanning table b?

Thank you for your help!
0
iSac1984
Asked:
iSac1984
  • 4
  • 2
1 Solution
 
iSac1984Author Commented:
I should have added:

If I am in fact full table scanning, is there any solution besides putting the table in the where clause (table b) first? It would be possible for me to do this, but I'm guessing it would double to triple the size of b.

Thanks again!
0
 
NutrientMSCommented:
Hi,

Because your WHERE clause references: WHERE b.primary_key = other value, you should reverse your joins as per a snippet from the MySQL page.

I'm not an expert at join optimisation exactly, but that is what I gather from the MySQL page. You could always time it one way, and then time it the other.
The join optimizer calculates the order in which tables should be joined. The table read order forced by LEFT JOIN or STRAIGHT_JOIN helps the join optimizer do its work much more quickly, because there are fewer table permutations to check. Note that this means that if you do a query of the following type, MySQL does a full scan on b because the LEFT JOIN forces it to be read before d: 
SELECT *
  FROM a JOIN b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
  WHERE b.key=d.key;
The fix in this case is reverse the order in which a and b are listed in the FROM clause: 
SELECT *
  FROM b JOIN a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
  WHERE b.key=d.key;

Open in new window

0
 
iSac1984Author Commented:
Thanks for the input Nutrient!

I'm thinking/hoping that that particular blurb on the manual page is for optimizing the JOIN CONDITION for joining, as opposed to the WHERE clause for result set narrowing. In other words, the first query on the manual page seems to misguidedly use its WHERE clause as a join condition, instead of for further narrowing after joining, which is what I'm interested in.

Also, my the index on table b used in my WHERE clause is not the primary key.

Does that change anything!

iSac
0
Cloud Class® Course: Amazon Web Services - Basic

Are you thinking about creating an Amazon Web Services account for your business? Not sure where to start? In this course you’ll get an overview of the history of AWS and take a tour of their user interface.

 
NutrientMSCommented:
I think what the JOINS have to do is:

Read A
Join table B on the select keys
Join Table C on the select keys
Join Table D on the select keys
Read B so it can perform the WHERE clause with Table D

So I think if you reverse the table A and Table B, it has to read B anyway so therefore it only reads it once.

It doesn't matter that the index on table b isn't your primary key, as long as it is an index (for performance purposes)

Cheers
0
 
iSac1984Author Commented:
I'm close to understanding. Assuming that I keep the query the way it is, MySQL does this:

1) Read A using the primary key, join ALL rows in b, c, and d that match based on said primary key

2) Separately, read B using the location index as specified in the WHERE constraint

3) Find and return the intersection of the two result sets

I'm sorry if I'm being a little dense on this. I know altering table b to go first would solve all problems, but doing so would increase the size of b by a lot, at least twice, if not more.

I guess I ultimately don't mind too much if two reads have to be performed, provided they're performed using the appropriate indexes, and not done with a full table scan.
0
 
iSac1984Author Commented:
Also, (not sure why I didn't put this before) I had MySQL explain a similar query (see code snippet). And it gave me these results (typing manually because the results come back out of alignment)
 
id   select_type   table   type      possible keys            key        key len   ref                rows   extra
1    simple           t1        all          primary                     null         null         null               4
1    simple           t2        ref         member, location     member  4            t1.memberid  1          Using Where
1    simple           t3        eq_ref   primary                    primary   4           t1.memberid   1

I'm working on deciphering what this means, but I thought I should provide it.

Hope that helps, thanks again.
mysql> explain
    -> select t1.*, t2.*, t3.*
    -> from
    -> a t1 left join b t2
    -> on t1.memberid = t2.memberid
    -> left join c t3
    -> on t1.memberid = t3.memberid
    -> where t2.latitude between 40 and 80
    -> and t2.longitude between -40 and -80;

Open in new window

0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

Join & Write a Comment

Featured Post

Cloud Class® Course: Amazon Web Services - Basic

Are you thinking about creating an Amazon Web Services account for your business? Not sure where to start? In this course you’ll get an overview of the history of AWS and take a tour of their user interface.

  • 4
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now