Solved

MySQL sorting optimization - multiple table joins

Posted on 2007-12-03
6
1,230 Views
Last Modified: 2013-12-12
I have n tables that are joined in a select query, all the where conditions are on n-1 tables, and all the columns that can be sorted are on the remaining 1 table.
When doing a join mysql always selects one of the tables with where conditions as the "first" table meaning no index is used for sorting and I always get a tmp table/filesort which, depending on the num of records, can get slow.

From MySQL manual: "You are joining many tables, and the columns in the ORDER BY are not all from the first non-constant table that is used to retrieve rows. (This is the first table in the EXPLAIN output that does not have a const join type.)".

This makes sense to me because by selecting the table with the least amount of rows that match the where criteria as first drasticly reduces the resultset to work with....so forcing the join in reverse order isn't good as an index is used for sorting but the time required to execute is a lot longer because the resultset isn't filtered by the joins first and all the rows are scaned.

My question ...Is there any way to optimize this better... and how?

Open in new window

0
Comment
Question by:Interghost
  • 2
  • 2
  • 2
6 Comments
 
LVL 18

Expert Comment

by:ingwa
ID: 20398006
Use left or right joins and use specific names for the field names, such as select table1.ID, table1.Name, table2.Description instead of select *.  Additionally, by using a left or right join, you are only returning rows from the first table which meet a criteria instead of everything.  So if table one had a few records with blank ID's (which I would certainly hope wouldn't exist...but this is an example), then your results would only contain records with values in table1.ID.  Hope this helps.
0
 

Author Comment

by:Interghost
ID: 20398169
using left/right join means the result set won't get filtered (thats the main idea of using "normal" joins in this case, I need rows present in all tables and normal join with conditions on each table filters the resultset perfectly)

...I think you missed the point of my question, I need help in optimizing sorting
example query:
select t1.* from table1 t1 join table2 t2 on t1.id = t2.id join table3 t3 on t1.id=t3.id ... join tableN on t1.id = tN.id where t2.something = something and t3.something = something ... and tN.something = something  [>>>] order by t1.title [<<<]

...notice no where condition on t1 table, but order by on t1 table
the above query will use indexes for every join / where condition but will do a filesort of the resultset at the end which gets slow as the num of rows increases and thats my problem
0
 
LVL 18

Accepted Solution

by:
ingwa earned 450 total points
ID: 20398188
Ahh, i see what you mean and my apologies.  If your join statements are pretty complex and aren't sorting as quickly as you would expect them to, have you considered the option of creating a temporary table, storing the result in that and then doing your sorting therein?  You may find it improves the result times.
0
Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

 
LVL 50

Expert Comment

by:Steve Bink
ID: 20398334
Have you followed the advice found here?

http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html

I think the item tripping you up is this one:

# The key used to fetch the rows is not the same as the one used in the ORDER BY:

SELECT * FROM t1 WHERE key2=constant ORDER BY key1;

Using inqwa's temporary table strategy may be of some help to you in this regard.
0
 

Author Comment

by:Interghost
ID: 20406962
@ingwa ... hm, yes temp. tables also crossed my mind earlier today, did a quick test with a  create temporary ... select   -  but the gain wasn't as big as I hoped and I didn't have enought time for an entire test... and it was also only create select without another sorted select from the temp table... I'll try making a whole testcase tomorrow and see how it goes.

One more thing I forgot to mention... the query in my case is progressive, ie it starts with N = 2 lets say and then N increases... the bigger the N (number of joined tables / conditions) the smaller the resultset and thus faster query execution cause of fewer items to sort, kinda ironic...more complex query, faster execution :))
I think temp.tables would probably do some help in the first (maybe even second) iteration of N, but after that it would probably become slower and I'm searching for a solution that would work with every iteration...if there is one of course...if not I guess I'll just have to live with the way things are + cache

@routinet: that manual page lists basic single-table examples that don't apply here, correct me if I'm wrong (and yes I have read through it a few times before :))

PS: can someone remind me how do I assign partial points?
0
 
LVL 50

Assisted Solution

by:Steve Bink
Steve Bink earned 50 total points
ID: 20407378
>>> that manual page lists basic single-table examples that don't apply here, correct me if I'm wrong

The examples are basic, true, but the deeper concept holds true for more complex queries as well.  Your query is using keys a, b, and c to select rows, but you are ordering by key d.  In that case, according to the example I posted before, MySQL will use a filesort on the results.

This means the basic problem of the filesort is that you are using selection keys which are not used for ordering.  To work around that limitation, you can insert your query results into a temporary table, then re-select them with the appropriate ORDER BY.  It will certainly be a little slower than if you were able to execute a direct sort on the results, but there will not be much of a hit.  

You can split the points by clicking the "Accept Multiple Solutions" link below any comment.  On the following screens, assign the points as you want, and the site will do the rest.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Entering a date in Microsoft Access can be tricky. A typo can cause month and day to be shuffled, entering the day only causes an error, as does entering, say, day 31 in June. This article shows how an inputmask supported by code can help the user a…
This article discusses four methods for overlaying images in a container on a web page
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

910 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

21 Experts available now in Live!

Get 1:1 Help Now