Solved

How to Ensure Index Use in MySQL Join Query

Posted on 2008-09-29
6
1,987 Views
Last Modified: 2013-12-12
Using MySQL 5.0.45

In the MySQL manual page on left join optimization (http://dev.mysql.com/doc/refman/5.0/en/left-join-optimization.html), it says that in the query:

SELECT *
  FROM a JOIN b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)
  WHERE b.key=d.key;

MySQL has to perform a full table scan because of the read order forced by left join.

I have a similar query, except that
1) All joins are explicitly left joins
2) All join conditions are given

It looks something like this:

SELECT a.*, b.*, c.*, d.*
FROM a LEFT JOIN b
ON a.primary_key = b.primary_key
LEFT JOIN c
ON a.primary_key = c.primary_key
LEFT JOIN d
ON a.primary_key = d.primary_key
WHERE b.other_key = other_key_value;

The way I think/hope MySQL does this is to join based on what is in table a using the primary key indexes in a, b, c, and d, and then narrow that result set based on b.other_key, with no full table scans.

Am I doing what I want, or am I full table scanning table b?

Thank you for your help!
0
Comment
Question by:iSac1984
  • 4
  • 2
6 Comments
 

Author Comment

by:iSac1984
ID: 22599994
I should have added:

If I am in fact full table scanning, is there any solution besides putting the table in the where clause (table b) first? It would be possible for me to do this, but I'm guessing it would double to triple the size of b.

Thanks again!
0
 
LVL 5

Expert Comment

by:NutrientMS
ID: 22600220
Hi,

Because your WHERE clause references: WHERE b.primary_key = other value, you should reverse your joins as per a snippet from the MySQL page.

I'm not an expert at join optimisation exactly, but that is what I gather from the MySQL page. You could always time it one way, and then time it the other.
The join optimizer calculates the order in which tables should be joined. The table read order forced by LEFT JOIN or STRAIGHT_JOIN helps the join optimizer do its work much more quickly, because there are fewer table permutations to check. Note that this means that if you do a query of the following type, MySQL does a full scan on b because the LEFT JOIN forces it to be read before d: 

SELECT *

  FROM a JOIN b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)

  WHERE b.key=d.key;

The fix in this case is reverse the order in which a and b are listed in the FROM clause: 

SELECT *

  FROM b JOIN a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key)

  WHERE b.key=d.key;

Open in new window

0
 

Author Comment

by:iSac1984
ID: 22600289
Thanks for the input Nutrient!

I'm thinking/hoping that that particular blurb on the manual page is for optimizing the JOIN CONDITION for joining, as opposed to the WHERE clause for result set narrowing. In other words, the first query on the manual page seems to misguidedly use its WHERE clause as a join condition, instead of for further narrowing after joining, which is what I'm interested in.

Also, my the index on table b used in my WHERE clause is not the primary key.

Does that change anything!

iSac
0
Free Gift Card with Acronis Backup Purchase!

Backup any data in any location: local and remote systems, physical and virtual servers, private and public clouds, Macs and PCs, tablets and mobile devices, & more! For limited time only, buy any Acronis backup products and get a FREE Amazon/Best Buy gift card worth up to $200!

 
LVL 5

Accepted Solution

by:
NutrientMS earned 400 total points
ID: 22600665
I think what the JOINS have to do is:

Read A
Join table B on the select keys
Join Table C on the select keys
Join Table D on the select keys
Read B so it can perform the WHERE clause with Table D

So I think if you reverse the table A and Table B, it has to read B anyway so therefore it only reads it once.

It doesn't matter that the index on table b isn't your primary key, as long as it is an index (for performance purposes)

Cheers
0
 

Author Comment

by:iSac1984
ID: 22601005
I'm close to understanding. Assuming that I keep the query the way it is, MySQL does this:

1) Read A using the primary key, join ALL rows in b, c, and d that match based on said primary key

2) Separately, read B using the location index as specified in the WHERE constraint

3) Find and return the intersection of the two result sets

I'm sorry if I'm being a little dense on this. I know altering table b to go first would solve all problems, but doing so would increase the size of b by a lot, at least twice, if not more.

I guess I ultimately don't mind too much if two reads have to be performed, provided they're performed using the appropriate indexes, and not done with a full table scan.
0
 

Author Comment

by:iSac1984
ID: 22601349
Also, (not sure why I didn't put this before) I had MySQL explain a similar query (see code snippet). And it gave me these results (typing manually because the results come back out of alignment)
 
id   select_type   table   type      possible keys            key        key len   ref                rows   extra
1    simple           t1        all          primary                     null         null         null               4
1    simple           t2        ref         member, location     member  4            t1.memberid  1          Using Where
1    simple           t3        eq_ref   primary                    primary   4           t1.memberid   1

I'm working on deciphering what this means, but I thought I should provide it.

Hope that helps, thanks again.
mysql> explain

    -> select t1.*, t2.*, t3.*

    -> from

    -> a t1 left join b t2

    -> on t1.memberid = t2.memberid

    -> left join c t3

    -> on t1.memberid = t3.memberid

    -> where t2.latitude between 40 and 80

    -> and t2.longitude between -40 and -80;

Open in new window

0

Featured Post

Top 6 Sources for Identifying Threat Actor TTPs

Understanding your enemy is essential. These six sources will help you identify the most popular threat actor tactics, techniques, and procedures (TTPs).

Join & Write a Comment

Part of the Global Positioning System A geocode (https://developers.google.com/maps/documentation/geocoding/) is the major subset of a GPS coordinate (http://en.wikipedia.org/wiki/Global_Positioning_System), the other parts being the altitude and t…
Introduction Since I wrote the original article about Handling Date and Time in PHP and MySQL (http://www.experts-exchange.com/articles/201/Handling-Date-and-Time-in-PHP-and-MySQL.html) several years ago, it seemed like now was a good time to updat…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…
This tutorial will teach you the core code needed to finalize the addition of a watermark to your image. The viewer will use a small PHP class to learn and create a watermark.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now