Solved

How to Ensure Index Use in MySQL Join Query

Posted on 2008-09-29
6
1,991 Views
Last Modified: 2013-12-12
Using MySQL 5.0.45

In the MySQL manual page on left join optimization (http://dev.mysql.com/doc/refman/5.0/en/left-join-optimization.html), it says that in the query:

SELECT *
  FROM a JOIN b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
  WHERE b.key=d.key;

MySQL has to perform a full table scan because of the read order forced by left join.

I have a similar query, except that
1) All joins are explicitly left joins
2) All join conditions are given

It looks something like this:

SELECT a.*, b.*, c.*, d.*
FROM a LEFT JOIN b
ON a.primary_key = b.primary_key
LEFT JOIN c
ON a.primary_key = c.primary_key
LEFT JOIN d
ON a.primary_key = d.primary_key
WHERE b.other_key = other_key_value;

The way I think/hope MySQL does this is to join based on what is in table a using the primary key indexes in a, b, c, and d, and then narrow that result set based on b.other_key, with no full table scans.

Am I doing what I want, or am I full table scanning table b?

Thank you for your help!
0
Comment
Question by:iSac1984
  • 4
  • 2
6 Comments
 

Author Comment

by:iSac1984
ID: 22599994
I should have added:

If I am in fact full table scanning, is there any solution besides putting the table in the where clause (table b) first? It would be possible for me to do this, but I'm guessing it would double to triple the size of b.

Thanks again!
0
 
LVL 5

Expert Comment

by:NutrientMS
ID: 22600220
Hi,

Because your WHERE clause references: WHERE b.primary_key = other value, you should reverse your joins as per a snippet from the MySQL page.

I'm not an expert at join optimisation exactly, but that is what I gather from the MySQL page. You could always time it one way, and then time it the other.
The join optimizer calculates the order in which tables should be joined. The table read order forced by LEFT JOIN or STRAIGHT_JOIN helps the join optimizer do its work much more quickly, because there are fewer table permutations to check. Note that this means that if you do a query of the following type, MySQL does a full scan on b because the LEFT JOIN forces it to be read before d: 
SELECT *
  FROM a JOIN b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
  WHERE b.key=d.key;
The fix in this case is reverse the order in which a and b are listed in the FROM clause: 
SELECT *
  FROM b JOIN a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
  WHERE b.key=d.key;

Open in new window

0
 

Author Comment

by:iSac1984
ID: 22600289
Thanks for the input Nutrient!

I'm thinking/hoping that that particular blurb on the manual page is for optimizing the JOIN CONDITION for joining, as opposed to the WHERE clause for result set narrowing. In other words, the first query on the manual page seems to misguidedly use its WHERE clause as a join condition, instead of for further narrowing after joining, which is what I'm interested in.

Also, my the index on table b used in my WHERE clause is not the primary key.

Does that change anything!

iSac
0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 
LVL 5

Accepted Solution

by:
NutrientMS earned 400 total points
ID: 22600665
I think what the JOINS have to do is:

Read A
Join table B on the select keys
Join Table C on the select keys
Join Table D on the select keys
Read B so it can perform the WHERE clause with Table D

So I think if you reverse the table A and Table B, it has to read B anyway so therefore it only reads it once.

It doesn't matter that the index on table b isn't your primary key, as long as it is an index (for performance purposes)

Cheers
0
 

Author Comment

by:iSac1984
ID: 22601005
I'm close to understanding. Assuming that I keep the query the way it is, MySQL does this:

1) Read A using the primary key, join ALL rows in b, c, and d that match based on said primary key

2) Separately, read B using the location index as specified in the WHERE constraint

3) Find and return the intersection of the two result sets

I'm sorry if I'm being a little dense on this. I know altering table b to go first would solve all problems, but doing so would increase the size of b by a lot, at least twice, if not more.

I guess I ultimately don't mind too much if two reads have to be performed, provided they're performed using the appropriate indexes, and not done with a full table scan.
0
 

Author Comment

by:iSac1984
ID: 22601349
Also, (not sure why I didn't put this before) I had MySQL explain a similar query (see code snippet). And it gave me these results (typing manually because the results come back out of alignment)
 
id   select_type   table   type      possible keys            key        key len   ref                rows   extra
1    simple           t1        all          primary                     null         null         null               4
1    simple           t2        ref         member, location     member  4            t1.memberid  1          Using Where
1    simple           t3        eq_ref   primary                    primary   4           t1.memberid   1

I'm working on deciphering what this means, but I thought I should provide it.

Hope that helps, thanks again.
mysql> explain
    -> select t1.*, t2.*, t3.*
    -> from
    -> a t1 left join b t2
    -> on t1.memberid = t2.memberid
    -> left join c t3
    -> on t1.memberid = t3.memberid
    -> where t2.latitude between 40 and 80
    -> and t2.longitude between -40 and -80;

Open in new window

0

Featured Post

Announcing the Most Valuable Experts of 2016

MVEs are more concerned with the satisfaction of those they help than with the considerable points they can earn. They are the types of people you feel privileged to call colleagues. Join us in honoring this amazing group of Experts.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Introduction Since I wrote the original article about Handling Date and Time in PHP and MySQL (http://www.experts-exchange.com/articles/201/Handling-Date-and-Time-in-PHP-and-MySQL.html) several years ago, it seemed like now was a good time to updat…
When table data gets too large to manage or queries take too long to execute the solution is often to buy bigger hardware or assign more CPUs and memory resources to the machine to solve the problem. However, the best, cheapest and most effective so…
The viewer will learn how to count occurrences of each item in an array.
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

856 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question