ducky801
asked on
MySQL Query with Inner Join to small table suddenly takes FOREVER
Below are 2 queries. They're exactly the same except for the inner join to the t_lob table (which has 14 rows in it and an index on account and skill). The query with the join used to run in a matter of seconds before I upgraded to MySQL v5.6 (from 5.5). Now it crawls.
Why? What can I do to fix it?
Here are the explain results:
Query with Join:
id,select_type,table,type, possible_k eys,key,ke y_len,ref, rows,Extra
1,PRIMARY,<derived2>,ALL,N ULL,NULL,N ULL,NULL,6 066,"Using temporary; Using filesort"
2,DERIVED,l,ref,"Account,S kill",Acco unt,11,con st,6,"Usin g index condition; Using where"
2,DERIVED,t,ref,"StartDate ,Skill,Acc ount",Skil l,103,att. l.Skill,10 11,"Using where"
Query with Join Commented out:
id,select_type,table,type, possible_k eys,key,ke y_len,ref, rows,Extra
1,PRIMARY,<derived2>,ALL,N ULL,NULL,N ULL,NULL,4 45155,"Usi ng temporary; Using filesort"
2,DERIVED,t,range,"StartDa te,Account ",StartDat e,4,NULL,5 93540,"Usi ng index condition; Using where"
.....So, it seems to me that the inclusion of the inner join causes fewer rows to be evaluated. However, as I've explained it actually takes longer. Not even sure if it will complete. I've let it go for about 30 min before killing it. The version without the join takes 10.8 seconds
To clarify. t_transcripts has about 10M rows in it. t_lob has 14.
select distinct
#lob,
URL_Cleaned, count(URL_Cleaned) as URL_Count, sum(convcounter) as Conversions
from (select
t.account,
#l.lob,
#l.sortseq,
t.sessionid,
t.skill,
t.startdate,
t.chat_button_name,
t.chatreferer as URL,
substring_index(substring_ index(subs tring_inde x(t.chatre ferer, '://',-1),'#', 1), '?',1) as URL_Cleaned,
if(isnull(conversion), 0, 1) as convcounter
from
t_transcripts t
#inner join t_lob l
#on t.account = l.account
#and t.skill = l.skill
where
t.account = '12345678'
and t.startdate between '2014-02-12' and '2014-02-28'
)q
group by
#lob,
url_cleaned
-------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -
SLOW QUERY (inner join is not commented out):
select distinct
#lob,
URL_Cleaned, count(URL_Cleaned) as URL_Count, sum(convcounter) as Conversions
from (select
t.account,
#l.lob,
#l.sortseq,
t.sessionid,
t.skill,
t.startdate,
t.chat_button_name,
t.chatreferer as URL,
substring_index(substring_ index(subs tring_inde x(t.chatre ferer, '://',-1),'#', 1), '?',1) as URL_Cleaned,
if(isnull(conversion), 0, 1) as convcounter
from
t_transcripts t
inner join t_lob l
on t.account = l.account
and t.skill = l.skill
where
t.account = '12345678'
and t.startdate between '2014-02-12' and '2014-02-28'
)q
group by
#lob,
url_cleaned
Why? What can I do to fix it?
Here are the explain results:
Query with Join:
id,select_type,table,type,
1,PRIMARY,<derived2>,ALL,N
2,DERIVED,l,ref,"Account,S
2,DERIVED,t,ref,"StartDate
Query with Join Commented out:
id,select_type,table,type,
1,PRIMARY,<derived2>,ALL,N
2,DERIVED,t,range,"StartDa
.....So, it seems to me that the inclusion of the inner join causes fewer rows to be evaluated. However, as I've explained it actually takes longer. Not even sure if it will complete. I've let it go for about 30 min before killing it. The version without the join takes 10.8 seconds
To clarify. t_transcripts has about 10M rows in it. t_lob has 14.
select distinct
#lob,
URL_Cleaned, count(URL_Cleaned) as URL_Count, sum(convcounter) as Conversions
from (select
t.account,
#l.lob,
#l.sortseq,
t.sessionid,
t.skill,
t.startdate,
t.chat_button_name,
t.chatreferer as URL,
substring_index(substring_
if(isnull(conversion), 0, 1) as convcounter
from
t_transcripts t
#inner join t_lob l
#on t.account = l.account
#and t.skill = l.skill
where
t.account = '12345678'
and t.startdate between '2014-02-12' and '2014-02-28'
)q
group by
#lob,
url_cleaned
--------------------------
SLOW QUERY (inner join is not commented out):
select distinct
#lob,
URL_Cleaned, count(URL_Cleaned) as URL_Count, sum(convcounter) as Conversions
from (select
t.account,
#l.lob,
#l.sortseq,
t.sessionid,
t.skill,
t.startdate,
t.chat_button_name,
t.chatreferer as URL,
substring_index(substring_
if(isnull(conversion), 0, 1) as convcounter
from
t_transcripts t
inner join t_lob l
on t.account = l.account
and t.skill = l.skill
where
t.account = '12345678'
and t.startdate between '2014-02-12' and '2014-02-28'
)q
group by
#lob,
url_cleaned
do both tables have indexes on account & skill?
ASKER
Yea. Muscle memory on the distinct. It flows from the fingertips.....
Yes. Both tables have and index on Account and Skill.
Yes. Both tables have and index on Account and Skill.
ASKER
One other thing to note-
I'm also having this problem:
http://www.experts-exchang e.com/OS/M icrosoft_O perating_S ystems/Win dows/Q_283 82342.html
I read somewhere today that long queries like the one i'm wresting with could be related to running out of space in your tmpdir. Not sure if these could be related or how to go about testing that though.
I'm also having this problem:
http://www.experts-exchang
I read somewhere today that long queries like the one i'm wresting with could be related to running out of space in your tmpdir. Not sure if these could be related or how to go about testing that though.
>> "distinct. It flows from the fingertips"
it most definitely should not flow, it should only ever be used when it is absolutely needed
I haven't more specific ideas on the actual question I'm afraid. Are your statistics current? Do you need to analyse those tables perhaps?
see:
https://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html#innodb-persistent-stats
(assuming innob)
it most definitely should not flow, it should only ever be used when it is absolutely needed
I haven't more specific ideas on the actual question I'm afraid. Are your statistics current? Do you need to analyse those tables perhaps?
see:
https://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html#innodb-persistent-stats
(assuming innob)
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I found this answer on a separate message board. Other expert comments on this post didn't help the query to run faster
there is absolutely no point in asking for distinct & group by...
drop use of distinct in both of your queries
...
group by (i.e. make a unique list of) X
i.e. you making a unique list twice