Link to home
Start Free TrialLog in
Avatar of tantormedia
tantormediaFlag for United States of America

asked on

LEFT OUTER JOIN VS. RIGHT OUTER JOIN

Dear experts,

I have a query:
SELECT ...
MAX(CASE WHEN r.r_type = 'Audio' THEN r.Awards END) AudioReviews,
MAX(CASE WHEN r.r_type = 'Print' THEN r.Awards END) PrintReviews,
...
FROM BookList b
...
LEFT OUTER JOIN
(SELECT BookStem, r_type, GROUP_CONCAT(DISTINCT CASE WHEN Award IS NOT NULL THEN
CONCAT('<a href="/manage/review.asp?review=', CAST(ID AS CHAR), '" target="_blank">', Award, '</a>')
ELSE CONCAT('<a href="/manage/review.asp?review=', CAST(ID AS CHAR), '" target="_blank">Review</a>') END SEPARATOR '<br />') Awards
FROM Reviews
GROUP BY BookStem, r_type) r
ON b.BookStem = r.BookStem

That outer join part makes my query very slow. I replace the LEFT OUTER JOIN with RIGHT OUTER JOIN and swapped tables accordingly:
SELECT ...
MAX(CASE WHEN r.r_type = 'Audio' THEN r.Awards END) AudioReviews,
MAX(CASE WHEN r.r_type = 'Print' THEN r.Awards END) PrintReviews,
...
FROM
(SELECT BookStem, r_type, GROUP_CONCAT(DISTINCT CASE WHEN Award IS NOT NULL THEN
CONCAT('<a href="/manage/review.asp?review=', CAST(ID AS CHAR), '" target="_blank">', Award, '</a>')
ELSE CONCAT('<a href="/manage/review.asp?review=', CAST(ID AS CHAR), '" target="_blank">Review</a>') END SEPARATOR '<br />') Awards
FROM Reviews
GROUP BY BookStem, r_type) r
RIGHT OUTER JOIN
BookList b
ON b.BookStem = r.BookStem
...
To my surprise, now the query runs 3 times faster, though the result seems to be the same. Could you please explain this to me? I always thought it there should be no difference.
Also, maybe you could give me any suggestion how to improve the query further?
Thanks.
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

Hmm.

That is very interesting and, yes, the results and performance of query should be the same.  To see what is going on, try to run EXPLAIN on both queries and see if there is a different plan developed based on the RIGHT versus LEFT OUTER JOIN.

Here is a nice reference for EXPLAIN and MySQL query performance in general:
https://www.experts-exchange.com/Database/MySQL/A_1250-3-Ways-to-Speed-Up-MySQL.html

As for further tuning, the results of EXPLAIN should help as well as the reference above.  Look for index issues and try to simplify when possible.  For example, instead of the CASE in the GROUP_CONCAT, try using COALESCE -- if nothing else, should be easier to maintain code.

...
(SELECT BookStem, r_type, GROUP_CONCAT(DISTINCT CONCAT('<a href="/manage/review.asp?review=', CAST(ID AS CHAR), '" target="_blank">', COALESCE(Award, 'Review'), '</a>') SEPARATOR '<br />') Awards
FROM Reviews
GROUP BY BookStem, r_type) r
...

Open in new window


Hope that helps!
Avatar of tantormedia

ASKER

Thank you for your answer. I did try EXPLAIN.
Here is the result for the LOJ:
1, 'PRIMARY', 'b', 'range', 'Index_PublicationDate', 'Index_PublicationDate', '9', '', 1036, 'Using where; Using temporary; Using filesort'
3, 'DERIVED', 'Reviews', 'ALL', '', '', '', '', 5859, 'Using filesort'
1, 'PRIMARY', '<derived3>', 'ALL', '', '', '', '', 2117, ''

For the ROJ:
1, 'PRIMARY', 'b', 'ALL', 'Index_PublicationDate', '', '', '', 2325, 'Using where; Using temporary; Using filesort'
2, 'DERIVED', 'Reviews', 'ALL', '', '', '', '', 5859, 'Using filesort'
1, 'PRIMARY', '<derived2>', 'ALL', '', '', '', '', 2117, ''

It seems like RIGHT OUTER JOIN should be slower, but it is faster...
ASKER CERTIFIED SOLUTION
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
The first query ran in 0.0242, the second in 0.0021. But they are doing different things. The first query returns 2,117 rows, the second returns 1,483. Still, I tried to change my query using the second subquery, but the result was about the same as before.
Yes, because you were filtering out the non Audio and Print reviews in the other query before by using the conditional MAX(...) statements.  If you truly only need those two r_type values, then the second query limits the resultset to that and thus takes advantage of any indexing on that column versus doing a full table scan on Reviews.  Even if it does, your later query has 600+ less rows it has to scan/deal with.

If the outer query retains its speed, then the issue is something on the outside.  Can you please share the new full query with the WHERE clause especially.
OK, here is the new query:
SELECT b.BookStem, au.Submitted, bc.Subject AS BookGenre, au.AudiesGenre, DATE_FORMAT(au.AudiesDate, '%m/%d/%y') AS AudiesDate, b.Title,
      GROUP_CONCAT(DISTINCT CONCAT(a.LastName, ', ', a.FirstName) SEPARATOR '<br />') as Author,
      GROUP_CONCAT(DISTINCT CONCAT(n.LastName, ', ', n.FirstName) SEPARATOR '<br />') as Narrator,
      DATE_FORMAT(b.ShipDate, '%m/%d/%y') AS ShipDate, DATE_FORMAT(b.HCDate, '%m/%d/%y') AS HCDate,
      (CASE WHEN b.Released = 'x' THEN 'Released'
      WHEN nl.TextSent IS NULL THEN 'Unassigned'
      WHEN CURDATE() >= nl.`From` AND CURDATE() <= nl.`To` THEN 'Recording'
      WHEN CURDATE() < nl.From THEN 'Not Started'
      WHEN nl.AudioReceive IS NOT NULL THEN '1st Audio'
      WHEN nl.PickupSent IS NOT NULL THEN 'Waiting For Pickups'
      WHEN nl.PickupReceived IS NOT NULL THEN 'Final Audio'
      END) AS `Status`,
      ar.ProoferComments,
      r.AudioReviews,
      r.PrintReviews,
      b.DL_Rank, b.HC_Rank
      FROM BookList b
      LEFT OUTER JOIN Audies au ON b.BookStem = au.BookStem
      LEFT OUTER JOIN (BisacLinks bl JOIN BisacCodes bc ON bl.BisacCode = bc.BisacCode)
            ON b.BookStem = bl.BookStem
      LEFT OUTER JOIN (AuthorLinks al JOIN Authors a ON al.AuthorStem = a.AuthorStem)
            ON b.BookStem = al.BookStem
      LEFT OUTER JOIN (NarratorLinks nl JOIN Narrators n ON nl.NarratorStem = n.NarratorStem)
            ON b.BookStem = nl.BookStem
      LEFT OUTER JOIN
        (SELECT BookStem, CASE WHEN COUNT(ID) > 0 THEN '<a href="/manage/AudioList.asp" target="_blank">Comments</a>' ELSE NULL END AS ProoferComments
        FROM AudioReviews GROUP BY BookStem) ar
      ON b.BookStem = ar.BookStem
      LEFT OUTER JOIN
        (SELECT BookStem
     , GROUP_CONCAT(DISTINCT CASE r_type WHEN 'Audio' THEN CONCAT('<a href="/manage/review.asp?review=', CAST(ID AS CHAR), '" target="_blank">', COALESCE(Award, 'Review'), '</a>') END SEPARATOR '<br />') AudioReviews
     , GROUP_CONCAT(DISTINCT CASE r_type WHEN 'Print' THEN CONCAT('<a href="/manage/review.asp?review=', CAST(ID AS CHAR), '" target="_blank">', COALESCE(Award, 'Review'), '</a>') END SEPARATOR '<br />') PrintReviews
FROM Reviews
WHERE r_type IN ('Print', 'Audio')
GROUP BY BookStem) r
      ON b.BookStem = r.BookStem  WHERE b.PublicationDate >= CURDATE() - INTERVAL 12 MONTH GROUP BY b.BookStem  ORDER BY b.BookStem

Open in new window

Okay, real quick the only columns it appear to be aggregated any more are coming from tables 'a' and 'n'.  Try aggregating those in a derived table or view ahead of time also and see if that helps.

Ensure that BookStem has a good index on it in all tables as it appears to be what is joining everything.

Check on index on PublicationDate.
Thanks