[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

mysql query hangs system - processlist says copy to tmp table

Posted on 2011-05-03
13
Medium Priority
?
773 Views
Last Modified: 2012-05-11
Hi All,

I have a big problem with a database application that needs to be fixed quite quickly. The problem only occurred this morning, it was fine last week but has over time had some performance issues which were put down to network issues. The system also runs under a VM and the O/S is CentOS 5.5 updated as of December 2010. Updates are restricted  on the server to make sure it does not download dodgy package and compromise the system. The ESX host is running normally, no issues or alarms, moving the system to another host has no effect, moving the datastore also has no effect either.

Basically, the admin function runs the following query, the information is then formatted and put into a web page.

It may be that the queries are not as efficient as they could be or it may also be that the database or kernel needs tuning. There have been no changes to the hardware or the application, the only change is that the data size has increased from 1050 to 1062 records.

Strangely, the test server has no issues.

Any assistance will be greatly appreciated.


select B.booking_ref as 'ref', B.booking_type as btype, B.customer as 'customer',
          D1.location as 'flocation',     F.address1 as 'fad1', F.address2 as 'fad2', F.address3 as 'fad3',
          F.postcode as 'fpc', F.contact_name as 'fcontact', F.contact_phone as 'fphone',
          D2.location as 'tlocation', T.address1 as 'tad1', T.address2 as 'tad2', T.address3 as 'tad3',
          T.postcode as 'tpc', T.contact_name as 'tcontact', T.contact_phone as 'tphone',
          date_format(B.arrival_date, '%d/%m/%Y') as 'due', A.processed as P, A.approved as 'A'
from BOOKING B, FROMADDRESS F, TOADDRESS T, ADMIN A, ADDRESS D1, ADDRESS D2
where B.booking_id = F.booking_id
and B.booking_id = T.booking_id
and B.booking_id = A.booking_id
and F.loc_id = D1.loc_id
and T.loc_id = D2.loc_id
and B.arrival_date >= date_sub(current_date(), interval 1 month)
order by P, B.arrival_date, ref

Open in new window


There are 5 tables details below;
mysql> desc BOOKING;
+-----------------+------------------+------+-----+---------+----------------+
| Field           | Type             | Null | Key | Default | Extra          |
+-----------------+------------------+------+-----+---------+----------------+
| booking_id      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| booking_ref     | varchar(10)      | NO   |     | NULL    |                |
| booking_type    | varchar(10)      | NO   |     | NULL    |                |
| customer        | varchar(40)      | NO   |     | NULL    |                |
| contract        | varchar(40)      | NO   |     | NULL    |                |
| requester_name  | varchar(40)      | NO   |     | NULL    |                |
| requester_email | varchar(40)      | NO   |     | NULL    |                |
| requester_phone | varchar(40)      | NO   |     | NULL    |                |
| ready_date      | date             | NO   |     | NULL    |                |
| ready_time      | time             | NO   |     | NULL    |                |
| arrival_date    | date             | NO   |     | NULL    |                |
| arrival_time    | time             | NO   |     | NULL    |                |
+-----------------+------------------+------+-----+---------+----------------+
12 rows in set (0.00 sec)

mysql> desc ADMIN;
+---------------+------------------+------+-----+---------+-------+
| Field         | Type             | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+-------+
| booking_id    | int(10) unsigned | NO   |     | NULL    |       |
| other_ref     | varchar(30)      | NO   |     | NULL    |       |
| vehicle_type  | varchar(30)      | NO   |     | NULL    |       |
| driver_name   | varchar(30)      | NO   |     | NULL    |       |
| collect_date  | date             | NO   |     | NULL    |       |
| collect_time  | time             | YES  |     | NULL    |       |
| delivery_date | date             | NO   |     | NULL    |       |
| delivery_time | time             | YES  |     | NULL    |       |
| chargeable    | varchar(6)       | NO   |     | NULL    |       |
| distance      | int(11)          | YES  |     | NULL    |       |
| cost          | float(6,2)       | YES  |     | NULL    |       |
| processed     | varchar(3)       | NO   |     | NULL    |       |
| approved      | varchar(10)      | NO   |     | NULL    |       |
| notes         | mediumtext       | YES  |     | NULL    |       |
| admin_user    | varchar(45)      | NO   |     | NULL    |       |
| cat1          | varchar(45)      | YES  |     | NULL    |       |
| cat2          | varchar(45)      | YES  |     | NULL    |       |
+---------------+------------------+------+-----+---------+-------+
17 rows in set (0.00 sec)

mysql> desc FROMADDRESS;
+---------------+------------------+------+-----+---------+-------+
| Field         | Type             | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+-------+
| booking_id    | int(10) unsigned | NO   |     | NULL    |       |
| loc_id        | int(11)          | YES  |     | NULL    |       |
| address1      | varchar(40)      | NO   |     | NULL    |       |
| address2      | varchar(40)      | NO   |     | NULL    |       |
| address3      | varchar(40)      | NO   |     | NULL    |       |
| postcode      | varchar(10)      | NO   |     | NULL    |       |
| contact_name  | varchar(40)      | NO   |     | NULL    |       |
| contact_phone | varchar(40)      | NO   |     | NULL    |       |
+---------------+------------------+------+-----+---------+-------+
8 rows in set (0.00 sec)

mysql> desc TOADDRESS;
+---------------+------------------+------+-----+---------+-------+
| Field         | Type             | Null | Key | Default | Extra |
+---------------+------------------+------+-----+---------+-------+
| booking_id    | int(10) unsigned | NO   |     | NULL    |       |
| loc_id        | int(11)          | YES  |     | NULL    |       |
| address1      | varchar(40)      | NO   |     | NULL    |       |
| address2      | varchar(40)      | YES  |     | NULL    |       |
| address3      | varchar(40)      | YES  |     | NULL    |       |
| postcode      | varchar(10)      | NO   |     | NULL    |       |
| contact_name  | varchar(40)      | NO   |     | NULL    |       |
| contact_phone | varchar(40)      | NO   |     | NULL    |       |
+---------------+------------------+------+-----+---------+-------+
8 rows in set (0.01 sec)

mysql> desc PACKAGE;
+-------------------+------------------+------+-----+---------+-------+
| Field             | Type             | Null | Key | Default | Extra |
+-------------------+------------------+------+-----+---------+-------+
| booking_id        | int(10) unsigned | NO   |     | NULL    |       |
| package_reference | varchar(1)       | YES  |     | NULL    |       |
| package_type      | mediumtext       | YES  |     | NULL    |       |
| package_quantity  | int(10) unsigned | YES  |     | NULL    |       |
| package_weight    | varchar(10)      | YES  |     | NULL    |       |
+-------------------+------------------+------+-----+---------+-------+
5 rows in set (0.00 sec)

Open in new window

0
Comment
Question by:jools
  • 7
  • 6
13 Comments
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 35517648
Hi.  

Please post the results of:
EXPLAIN
select B.booking_ref as 'ref', B.booking_type as btype, B.customer as 'customer',
          D1.location as 'flocation',     F.address1 as 'fad1', F.address2 as 'fad2', F.address3 as 'fad3',
          F.postcode as 'fpc', F.contact_name as 'fcontact', F.contact_phone as 'fphone',
          D2.location as 'tlocation', T.address1 as 'tad1', T.address2 as 'tad2', T.address3 as 'tad3',
          T.postcode as 'tpc', T.contact_name as 'tcontact', T.contact_phone as 'tphone',
          date_format(B.arrival_date, '%d/%m/%Y') as 'due', A.processed as P, A.approved as 'A'
from BOOKING B, FROMADDRESS F, TOADDRESS T, ADMIN A, ADDRESS D1, ADDRESS D2
where B.booking_id = F.booking_id
and B.booking_id = T.booking_id
and B.booking_id = A.booking_id
and F.loc_id = D1.loc_id
and T.loc_id = D2.loc_id
and B.arrival_date >= date_sub(current_date(), interval 1 month)
order by P, B.arrival_date, ref
;

Open in new window


The likely cause is that you don't have proper indices on tables.  Looks like booking_id is the foreign key link in the subsequent tables to primary key of Bookings table.  Ensure that there is also an index on arrival_date since that is really the only WHERE condition that isn't part of JOIN criteria which I would personally go for ANSI style.

i.e.,
from BOOKING B
-- use a left outer join if potential that FROMADDRESS doesn't exist for specific booking record
join FROMADDRESS F on B.booking_id = F.booking_id
...
where B.arrival_date >= date_sub(current_date(), interval 1 month)...

Kevin
0
 
LVL 19

Author Comment

by:jools
ID: 35517745
Thanks Kevin, I'll get onto this first thing in the morning, hopefully you have a BST time zone so you'll be about too, though I guess if you're posting this "late" then you're likely somewhere over the pond, in which case I'll catch up tomorrow evening... :-)

I've been reading about the indexes and ran an explain when I was in the office though it didnt really mean anything to me at the time.

Cheers

Jools
0
 
LVL 60

Assisted Solution

by:Kevin Cross
Kevin Cross earned 2000 total points
ID: 35517775
Yes, I am in EST, but may be up and about.  In case, here is a nice reference by gr8gonzo -- http://www.experts-exchange.com/Database/MySQL/A_1250-3-Ways-to-Speed-Up-MySQL.html.  In his section on EXPLAIN, he goes through a quick tutorial on what is important to gather from it as well as points to some references and manual page(s) on the topic.  Basically, you are checking if your server is doing a full table scan to find the rows that meet the date criteria or to associate the tables via join.
0
NEW Veeam Backup for Microsoft Office 365 1.5

With Office 365, it’s your data and your responsibility to protect it. NEW Veeam Backup for Microsoft Office 365 eliminates the risk of losing access to your Office 365 data.

 
LVL 19

Author Comment

by:jools
ID: 35688772
Explain results below (took a while to get back onto the same network). I'll check out the link, thx

Jools

+----+-------------+-------+--------+---------------+---------+---------+------------------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                    | rows | Extra                           |
+----+-------------+-------+--------+---------------+---------+---------+------------------------+------+---------------------------------+
|  1 | SIMPLE      | D1    | ALL    | NULL          | NULL    | NULL    | NULL                   |   31 | Using temporary; Using filesort |
|  1 | SIMPLE      | D2    | ALL    | NULL          | NULL    | NULL    | NULL                   |   31 |                                 |
|  1 | SIMPLE      | A     | ALL    | NULL          | NULL    | NULL    | NULL                   | 1027 |                                 |
|  1 | SIMPLE      | F     | ALL    | NULL          | NULL    | NULL    | NULL                   | 1067 | Using where                     |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY       | PRIMARY | 4       | transport.F.booking_id |    1 | Using where                     |
|  1 | SIMPLE      | T     | ALL    | NULL          | NULL    | NULL    | NULL                   | 1095 | Using where                     |
+----+-------------+-------+--------+---------------+---------+---------+------------------------+------+---------------------------------+
6 rows in set (0.00 sec)

Open in new window

0
 
LVL 19

Author Comment

by:jools
ID: 35689559
I've been messing about on a test server with `join` but I cant seem to get the syntax right, I don't suppose you could elaborate a little on your example?

I will still see if I can get the syntax right while I digest the explain link you posted above.

I'm also trying to understand using indexes in tables but there is an awful lot to take in. I thought I was OK with the simple stuff in MySQL but it seems I have to pick up the more advanced stuff (at least for me) sooner than I'd bargained for.

Thanks for the tips so far.
0
 
LVL 19

Author Comment

by:jools
ID: 35691760
I have added indexes using the example in the link you posted.

I now have links on booking_id on each of the tables and on loc_id on FROM/TOADDRESS and ADDRESS tables, the explain now shows this;

+----+-------------+-------+--------+---------------------------+----------------+---------+------------------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys             | key            | key_len | ref                    | rows | Extra                           |
+----+-------------+-------+--------+---------------------------+----------------+---------+------------------------+------+---------------------------------+
|  1 | SIMPLE      | D1    | ALL    | idx_loc_id                | NULL           | NULL    | NULL                   |   31 | Using temporary; Using filesort |
|  1 | SIMPLE      | F     | ref    | idx_booking_id,idx_loc_id | idx_loc_id     | 5       | transport.D1.loc_id    |   25 | Using where                     |
|  1 | SIMPLE      | A     | ref    | idx_booking_id            | idx_booking_id | 4       | transport.F.booking_id |    1 |                                 |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY                   | PRIMARY        | 4       | transport.F.booking_id |    1 | Using where                     |
|  1 | SIMPLE      | T     | ref    | idx_booking_id,idx_loc_id | idx_booking_id | 4       | transport.A.booking_id |    1 | Using where                     |
|  1 | SIMPLE      | D2    | ref    | idx_loc_id                | idx_loc_id     | 5       | transport.T.loc_id     |    1 | Using where                     |
+----+-------------+-------+--------+---------------------------+----------------+---------+------------------------+------+---------------------------------+
6 rows in set (0.00 sec)

Open in new window


The only thing I'm still not sure of is if what I am actually doing is the right thing to do or if I will be implementing changes that will cause more problems in the future.

So any advice either way will be very much appreciated.

Bear in mind that the issue did not come to light on the test system so whilst I still need to make the changes to the live database and I'm sure the performance will be better I still dont really understand why there was the difference between the VM live system and the test server or if this is just one of lifes random events.
0
 
LVL 60

Accepted Solution

by:
Kevin Cross earned 2000 total points
ID: 35692271
Howdy.  Looks like you made lots of progress judging from the difference in the EXPLAIN results.  If you will be joining this tables often or otherwise filtering the non-bookings tables by a specific booking_id, it should not hurt you to have an index on those fields.  Yes, there is a little added overhead to maintain statistics of the key, but the performance gains on queries should outweigh.  To the extent you are mostly querying/filtering on different criteria, then it may not be as helpful.

The difference in test and live is often the volume of data and/or fragmentation of that data.  In development, we often bulk load data from production or fake information which might be very sequential whereas in production your data may be widely spread.  Therefore, if you have to scan for data each time, its cost may show itself more severely when you are talking about 1m versus 1k rows.

Remember indexes are there to help the query quickly find rows that match a given criteria.  If a columns variability is very low, then an index may not do much to help as essentially you may have 90% of your table with same value and as such end up scanning entire table anyway.  On the other hand, with a date field I would expect there to be a wide range of values and so have an index on arrival_date (if that is a common conditional of your queries) to be very useful as you can quickly seek the rows you are interested in.

With all that said, I would expect that you will need an index on arrival_date also and your query could look like this if you want to use ANSI style joins.

select B.booking_ref as 'ref', B.booking_type as btype, B.customer as 'customer',
          D1.location as 'flocation',     F.address1 as 'fad1', F.address2 as 'fad2', F.address3 as 'fad3',
          F.postcode as 'fpc', F.contact_name as 'fcontact', F.contact_phone as 'fphone',
          D2.location as 'tlocation', T.address1 as 'tad1', T.address2 as 'tad2', T.address3 as 'tad3',
          T.postcode as 'tpc', T.contact_name as 'tcontact', T.contact_phone as 'tphone',
          date_format(B.arrival_date, '%d/%m/%Y') as 'due', A.processed as P, A.approved as 'A'
from BOOKING B
-- if a FROM address may not exist, make this a left outer join
join (
   FROMADDRESS F
   join ADDRESS D1 on F.loc_id = D1.loc_id
) on  B.booking_id = F.booking_id
-- if a TO address may not exist, make this a left outer join
join (
   TOADDRESS T
   join ADDRESS D2 on T.loc_id = D2.loc_id
) on B.booking_id = T.booking_id
-- if an ADMIN record may not exist, make this a left outer join
join ADMIN A on B.booking_id = A.booking_id
where B.arrival_date >= date_sub(current_date(), interval 1 month)
order by P, B.arrival_date, ref
;

Open in new window


Hope that helps.
0
 
LVL 19

Author Comment

by:jools
ID: 35692792
Thanks again for the tips, more things to think about :-)

Is it better to use the `join` or is the `where` clause I used just as efficient? I only use the where clause I do because thats the way I learnt it, if there is a better, more efficient/optimised way using the join then I be happier learning the join syntax properly.

The explain for the join you posted is below for reference.
+----+-------------+-------+--------+---------------------------+----------------+---------+------------------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys             | key            | key_len | ref                    | rows | Extra                           |
+----+-------------+-------+--------+---------------------------+----------------+---------+------------------------+------+---------------------------------+
|  1 | SIMPLE      | D1    | ALL    | idx_loc_id                | NULL           | NULL    | NULL                   |   31 | Using temporary; Using filesort |
|  1 | SIMPLE      | F     | ref    | idx_booking_id,idx_loc_id | idx_loc_id     | 5       | transport.D1.loc_id    |   25 | Using where                     |
|  1 | SIMPLE      | A     | ref    | idx_booking_id            | idx_booking_id | 4       | transport.F.booking_id |    1 |                                 |
|  1 | SIMPLE      | T     | ref    | idx_booking_id,idx_loc_id | idx_booking_id | 4       | transport.A.booking_id |    1 | Using where                     |
|  1 | SIMPLE      | B     | eq_ref | PRIMARY                   | PRIMARY        | 4       | transport.A.booking_id |    1 | Using where                     |
|  1 | SIMPLE      | D2    | ref    | idx_loc_id                | idx_loc_id     | 5       | transport.T.loc_id     |    1 | Using where                     |
+----+-------------+-------+--------+---------------------------+----------------+---------+------------------------+------+---------------------------------+

Open in new window


I will look at implementing the index on the arrival_date and see what I find on Friday (one of those all day meetings tomorrow).

Thanks again for all the info, it's been very helpful.

Jools
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 35693479
Using the ANSI joins will be a little easier to maintain when you need OUTER JOINs.  There is WHERE condition syntax for OUTER JOIN criteria, but again this is a little more clear.  Most optimizers will end up making same query plan as you have seen, but technically what you are doing is creating a cartesian product of the tables involved and then using the WHERE clause to limit the rows to those that have matches only.  So, not a requirement, just more clear as to what is going on in the query.
0
 
LVL 19

Author Comment

by:jools
ID: 35694286
Thanks Kevin,

Easier to maintain is a good reason to use it and a quick google shows a lot of entries for join and ansi but I dont suppose you have a link for a good tutorial that can make this clearer.

I'd rather not do it parrot fashion or all my future queries will now be based on your wonderful example which is great but if I'm going to learn it I'd like to at least try and understand it.

I will of course carry on googling just in case I stumbe upon something useful.
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 35695548
See if Tim's article is what you are looking for.  He does a good job of explaining the join types: http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/A_2685-SQL-Server-Joins-Explained-Logical-Joins.html

He goes through INNER, OUTER (RIGHT|LEFT) and CROSS joins.  CROSS join is the same as using "FROM tablea, tableb" and leaving off a WHERE condition joining the two, i.e., cartesian product.  LEFT/RIGHT joins are good when you have a 1 to 0+ relationship, i.e., you want all rows of one table regardless of if there is a match in another.  INNER joins are for when you know there is always a match OR you only want to see the rows where there is a match.

For more details, the article by Tim should cover it.

Regards,

Kevin

0
 
LVL 19

Author Closing Comment

by:jools
ID: 35707647
Hi Kevin,

Whilst I cant update the live server at the moment (I've implemented a work around by taking out some of the summary page information). I have tested on the test server and it works a treat. As soon as I get some down time on the live server I will implement the changes.

Thanks for your patience and excellent information, you saved me hours of hunting around.
0
 
LVL 60

Expert Comment

by:Kevin Cross
ID: 35707833
You are very welcome and glad that I could help, Jools!
Best regards and happy coding,

Kevin
0

Featured Post

Configuration Guide and Best Practices

Read the guide to learn how to orchestrate Data ONTAP, create application-consistent backups and enable fast recovery from NetApp storage snapshots. Version 9.5 also contains performance and scalability enhancements to meet the needs of the largest enterprise environments.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This post looks at MongoDB and MySQL, and covers high-level MongoDB strengths, weaknesses, features, and uses from the perspective of an SQL user.
Backups and Disaster RecoveryIn this post, we’ll look at strategies for backups and disaster recovery.
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…
In this video, Percona Solutions Engineer Barrett Chambers discusses some of the basic syntax differences between MySQL and MongoDB. To learn more check out our webinar on MongoDB administration for MySQL DBA: https://www.percona.com/resources/we…
Suggested Courses
Course of the Month18 days, 8 hours left to enroll

825 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question