<

3 Ways to Speed Up MySQL

Published on
94,871 Points
56,071 Views
73 Endorsements
Last Modified:
Awarded
There are ONLY 3 ways to speed up MySQL, and everything else is simply a finer point of one of these 3 ways. Here they are, in order of importance:

1. Optimize queries
2. Tune the MySQL configuration
3. Add more hardware


============================================================
#1. Query Optimization
============================================================
The most common problem with MySQL performance is unoptimized queries. Here are some examples of unoptimized queries:

- Queries that don't use indexes.
- Queries that use SELECT *.
- Queries that search full-text fields.
- Queries that are not properly limited.
- Queries that use ORDER BY unnecessarily.



Indexes
By far, the biggest problem queries are ones that don't use indexes or don't use the BEST indexes. Indexes are the key to getting the best performance out of your queries. Indexes are basically shortcuts for MySQL - they work the same way as an index in a classroom textbook. Let's say you wanted to look up all pages containing "gr8gonzo." You COULD go through every word in the book and find all the pages, but it's far faster to just flip to the index and see that "gr8gonzo" happens to be on pages 2, 6, and 32.

Most people know how to use basic indexes, but most people don't know how to use the BEST indexes. A lot of queries have more than one thing in the WHERE clause, like this:

SELECT fields FROM mytable
 WHERE field1 > 123 AND field2 = 'gr8gonzo';

Open in new window


Most people will have an index for field1 and an index for field2. This is good, and the query will try to make use of one of those indexes (and will be faster). But if this is a query that is run frequently, it would be even better to have ANOTHER index that has BOTH field1 and field2 in it. That (usually) gives you the best query performance.

That said, you don't want to just create tons of these indexes, since each index does take a little bit of extra work for MySQL to update whenever the table changes, and those little bits can add up over time. You should really only create these multi-field indexes when there are frequent, slow queries that COULD take advantage of them. In section 2 of this article, we'll cover some ways of having MySQL tell you what queries need a tune-up, but there is one way to tell immediately if your query isn't using indexes...



EXPLAIN-ing Queries
If I wanted to see if the above query was working well, I could use EXPLAIN to do it. When you EXPLAIN a query, you're simply asking MySQL to tell you what it WOULD do if it ran the query for you. It responds with a computerish version of "Well, in order to run your query, I would use this index. That would leave me with X rows, which I would then look at in order to figure out which ones you wanted."

To EXPLAIN a query, all you have to do is run the same query but put "EXPLAIN" in front of it:

EXPLAIN SELECT fields FROM mytable 
 WHERE field1 > 123 AND field2 = 'gr8gonzo';

Open in new window


The result looks something like this:

+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+ 
| id | select_type | table   | type | possible_keys | key  | key_len | ref  | rows   | Extra       | 
+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+ 
|  1 | SIMPLE      | mytable | ALL  | PRIMARY       | NULL | NULL    | NULL | 898256 | Using where | 
+----+-------------+---------+------+---------------+------+---------+------+--------+-------------+
.

Open in new window


WHOA! At first glance, this is probably really confusing, but you can often just ignore a lot of the information, like id, select_type, table, type, and ref. And MySQL sometimes calls indexes "keys", so now let's look at the same result but without the extra columns:

+---------------+------+---------+------+--------+-------------+ 
| possible_keys | key  | key_len | ref  | rows   | Extra       | 
+---------------+------+---------+------+--------+-------------+ 
| PRIMARY       | NULL | NULL    | NULL | 898256 | Using where | 
+---------------+------+---------+------+--------+-------------+ 

Open in new window


Basically what this says is that MySQL had to go one-by-one through 898,256 rows and check each one to see if field1 > 123 and if field2 = 'gr8gonzo'. That's a lot of processing to do, especially if the final result is just a couple of rows (meaning that there are nearly 900,000 rows that get searched uselessly). Let's try adding in an index for one of those fields:

ALTER TABLE `mytable` 
 ADD INDEX `IDX_FIELD1` (`field1`) ;

Open in new window


If we re-run the EXPLAIN, we'll see:

+---------------+---------------+---------+-------+------+-------------+ 
| possible_keys | key           | key_len | ref   | rows | Extra       | 
+---------------+---------------+---------+-------+------+-------------+ 
| IDX_FIELD1    | IDX_FIELD1    | 5       | const | 1246 | Using where | 
+---------------+---------------+---------+-------+------+-------------+

Open in new window



Well, now we're down to only looking at 1,246 rows. That's much better than 898 thousand, but we can do even better. Our query uses two fields in the WHERE clause, so we can probably gain better performance by adding in an index containing BOTH those fields:

ALTER TABLE `mytable` 
 ADD INDEX `IDX_FIELDS1_2` (`field1`, `field2`) ;

Open in new window


...and now re-run the EXPLAIN and we get.

+---------------+---------------+---------+-------------+------+-------------+ 
| possible_keys | key           | key_len | ref         | rows | Extra       | 
+---------------+---------------+---------+-------------+------+-------------+ 
| IDX_FIELDS1_2 | IDX_FIELDS1_2 | 5       | const,const | 16   | Using where | 
+---------------+---------------+---------+-------------+------+-------------+

Open in new window



Voila! Now when we run the exact same query for real, we know that MySQL only has to search through 16 rows instead of nearly 1 million. A guaranteed speed increase, and it was free!

NOTE: In the above output, "possible_keys" will sometimes show more than one index, indicating that there's more than one choice that could help the query run faster. However the "chosen" index will be in the "key" field. The "ref" can give you an idea of how many fields are involved in the index. For example, if you have an index on one field, your "ref" column will probably just say "const" but if you have an index on two fields and both of those fields are in the WHERE clause, then you'll probably see "const,const" in the "ref" column.

ANOTHER NOTE: Whenever MySQL has to look at every row in the table, it's called a "table scan." Table scans are the slowest way MySQL can look for data. When you EXPLAIN a query, look at the "type" column - if it says "ALL" then MySQL is doing a table scan to find your data. If it says something else, like "range", then it is making use of an index. Occasionally, on small tables, MySQL will do a table scan even if you have an index. This is just MySQL knowing what is best in that situation, but you normally want to avoid these. Here's a link to the MySQL documentation on avoiding table scans:
http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html

There's a lot of in-depth optimization stuff on how to use EXPLAIN. If you feel like reading documentation, check here:
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html

I also recently found and highly recommend a free MySQL manager application called HeidiSQL that (among other things) makes it easy to create and update your indexes. Plus, when you add indexes, it will show you the SQL code it ran to create those indexes, making it a useful learning tool.
http://www.heidisql.com/

There's also phpMyAdmin, which is installed on a lot of web hosts:
http://www.phpmyadmin.net


Using SELECT *
I'm guilty of it. It's far easier to write queries that use SELECT * and not have to worry about typing out 10 fields names, but it COULD be the culprit that slows down your web application. Here's a common mistake:

Let's say you run a web site that collects stories written by your members. All the stories are put into one big table called stories. So far so good. But now let's say you have a query out there that is used to create a menu to link to all the stories:

SELECT * FROM stories;

Open in new window


Well, if the CONTENTS of each story is in the stories table, then whenever you run the above query, MySQL is also sending every letter of every story in the system back to your script. If you have 1,000 stories that are about 10k each, then everytime someone views the menu, your script is downloading 10 megabytes of extra data that it just throws away without using. What a waste!

Instead, try changing your query to something like:
SELECT storyID,storyName,storyDate FROM stories;

Open in new window


Now we're only selecting a few fields that we need for the menu. Get into the habit of specifying ONLY the fields your script needs, and you'll find that it's easier than you think, and your scripts WILL run faster.

TIP: There's a quick way to see a summary of all the fields in the table and what type of field they are:
DESCRIBE mytable;

Open in new window



The Full Text
Let's stick with the "stories" example above. People will probably want to search through stories for specific words. If your story content is in a full-text field (e.g. TEXT datatype), then chances are that you're searching like this:

SELECT storyID FROM stories
 WHERE storyContent LIKE '%fondled the hubcaps%';

Open in new window

 
This probably runs quickly when you don't have many stories, but it'll get slower and slower and slower over time. In this case, consider an open-source product called Sphinx Search:
http://sphinxsearch.com/

It specializes in taking your full-text content and making it searchable. A query that takes 10 seconds to run in MySQL could take 0.1 seconds in Sphinx, and that is not an exaggeration. The downside is that it is a separate program / daemon, and requires a bit of know-how to set up and get running, but it's worth the time. They have community forums to help, and some people here at Experts Exchange (like me), can also help.



Add LIMITs
This one's simple - if you only need a couple of rows out of thousands that are being returned (e.g. getting the top 10 of something), then add a LIMIT clause to the end of your query:

SELECT storyID FROM stories
 ORDER BY storyRating DESC
 LIMIT 10;

Open in new window


It can sometimes be useful to run a query that counts the number of rows in your result before pulling all of them. This can give you an idea of how to limit your rows or how to run your next query (although this largely depends on your particular situation). Here's a way to quickly get the number of stories from our example:

SELECT COUNT(storyID) AS storyCount FROM stories;

Open in new window


The results will be a row containing a field called "storyCount." This type of technique becomes more useful as your database grows larger and larger.



The ORDER BY Dilemma
Using ORDER BY is great for sorting, but sometimes it can create real slowdowns on MySQL. When you ORDER BY a field, MySQL first finds all the rows that will be in your results, and THEN goes back and re-orders them according to that ORDER BY field. If you have a lot of rows, then MySQL has to do a lot of re-ordering, which can be very slow.

In the above example on LIMITs, the query would have to sort every single story by its rating before returning the top 10. However, if I know that all of the top 10 stories have a rating of 4 or higher, then I could reduce the number of stories to be sorted like this:

SELECT storyID FROM stories
 WHERE storyRating >= 4
 ORDER BY storyRating DESC
 LIMIT 10;

Open in new window


Now MySQL may only have to sort through 100 stories instead of 10,000.

Sometimes it's worth asking yourself whether you REALLY need to use ORDER BY at all. Sometimes it's faster to skip the ordering info altogether on the database and use PHP or something else to handle the sorting (although MySQL is usually faster at it).

One other trick is to create an index of the fields you're SELECTing and ORDERing BY. So if you had a query:

SELECT storyID,storyRating FROM stories
 ORDER BY storyRating DESC;

Open in new window


Then, the query could benefit a lot from an multi-field index of storyID and storyRating.



============================================================
#2. The MySQL Configuration
============================================================
There are a LOT of ways to configure MySQL, but all of them start with the my.cnf configuration file (that's usually what it's called). Generally speaking, you can tune up MySQL by telling it to cache things in memory. When it stores any data in memory, MySQL can access it almost instantly instead of having to go back to the full database on the hard drive and look up the requested data (which is slow).

Here's an example section of a my.cnf file (I've taken out some extra parameters that weren't performance-related and some others that I won't discuss in this article):

[mysqld]
skip-name-resolve

query_cache_size = 16M

log-slow-queries=/var/log/slowqueries.log
long_query_time = 4
log-queries-not-using-indexes

table_cache = 512
tmp_table_size = 128M
max_heap_table_size = 128M
myisam_sort_buffer_size = 8M
sort_buffer_size = 8M
join_buffer_size = 256K
key_buffer = 128M

Open in new window


The first thing I always do is disable name resolution (skip-name-resolve). Basically, name resolution just tries to look up a "caller ID" on whoever is connecting to the database. I still don't know why it's enabled by default. It's not only a potential security problem, but it's usually unnecessary for most web server setups (since the web server is the one that does the connecting, not the visitors), and it has the potential to crash the system (if your DNS goes down for a while and MySQL gets filled up with connections that are waiting to be "resolved").

Next, enable the query cache (query_cache_size). In the above example, I've got a 16-megabyte query cache. Basically, if I run a query that takes 5 seconds to run, and then I refresh a page or something (causing the query to run again), then the query will run instantly because MySQL will remember the results of the query from the first time. If the tables involved in the query get changed, though, then it will clear any cached results that use those tables (so you're always getting accurate data). Start with a 16-megabyte cache and work your way up as necessary (I'll explain in a bit how to tell when to increase the cache).

Third, enable the slow query log (log-slow-queries and long_query_time and log-queries-not-using-indexes). This tells MySQL to keep track of all the queries that take longer than a certain number of seconds (long_query_time) to complete. The log-queries-not-using-indexes option also includes queries that don't use indexes (simple enough). Just let the log sit for a day or two while you use your application, and then look at it to find all the queries that need to be optimized.

The last section of lines have several different purposes (caching joins, ORDER BY results, temporary tables, etc), which all affect speed, but it's difficult to know exactly what values to use sometimes. That's why I recommend using MySQLTuner:
http://wiki.mysqltuner.com/MySQLTuner

It's a Perl script that you just download and run on your database server after letting your server run for a few days (without restarting). The script will look at all the statistics that MySQL collects and will make recommendations on what to change in your my.cnf file to make things run better (like increasing query cache size or table_cache and that type of thing). It's pretty straightforward and doesn't take long to run.



============================================================
#3. Instant Speed! Just Add Hardware!
============================================================
This is usually the most obvious answer. Upgrade to a faster CPU, add more RAM, etc... and you'll run faster. This is true, but there are a few things to know first.

First, the NUMBER of hard drives is more important than the SPACE. Some people make the mistake of getting two 1-terabyte drives and just using those to run their database server. By adding multiple hard disks in a RAID array (which is what most servers use anyway), you're effectively distributing the load.

If two queries are running at the same time and you only have two hard drives, then there's a good chance that the data for both queries is located on the same hard drive. Since a hard drive can only do one thing at a time, one of the queries will have to wait a little bit longer for the other one to finish before it can run. But if you have, say, 6 hard drives or more (the more the merrier), then one query might need data from Hard Drive #2 while the second query needs data from Hard Drive #5. Both hard drives can work at the same time and send data back nearly simultaneously. At least that's the gist of it, so spend money on multiple, fast hard disks for that extra bump in speed. Hard disks are usually the biggest speed bottleneck anyway (hardware-wise).

Last point on hard disks - if someone else is setting up your server and wants to know what RAID level you want to use, try to use RAID 10 (safe + good performance). Otherwise, use RAID 1 (safe). Other RAID levels have their advantages and disadvantages, but those are my standard recommendations.

Second, there's usually not a direct correlation between RAM and speed. Just because you add more RAM doesn't mean the system automatically uses it (or uses it right). If you've got several gigabytes of RAM already, then any additional RAM should probably go towards bigger caches. There are other uses, as well (like increasing the number of maximum connections), but if you're reading this article, then you may not be at that point yet anyway.

Third, CPU is sort of like luck - it affects everything a little bit, and affects some things a LOT. If you're writing a math-hungry application, crunching statistics, etc... then investing in the latest and greatest CPU and motherboard may be a good choice. If it's just your standard web/database server, then there are probably better ways of spending your money (-cough- more hard drives -cough-).

Finally, if one server just can't handle all the traffic, consider setting up another server with dual-master replication for a quick-n-dirty way of load-balancing. (Note - replication doesn't actually do load-balancing, it just keeps two servers in complete, realtime sync so you can send 50% of visitors to one database server and the other 50% to the other server. It also makes for a handy backup / failover system.)
http://www.neocodesoftware.com/replication/


FINAL TIP: A lot of places will have a test database server and a real database server, but the test database server only has very few rows in it (just enough for a basic test). This makes it easy to misjudge your application's REAL performance. Try to make sure that your test database has similar data to your real database so you get a better picture of how your queries will perform in the real world. Many MySQL manager programs like phpMyAdmin and HeidiSQL make it easy to download all the data from your real database so you can upload it into your test database. (There's also a command line tool called mysqldump.)

That's it! If you've read this far, then you know all the major (and some minor) steps in improving your MySQL performance, and you're now further ahead than most others who are still trying to read through those boring, thick manuals. If you run into any trouble with any of the points made here, please feel free to comment and/or post questions in the Experts Exchange MySQL area.
73
Comment
Author:gr8gonzo
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
31 Comments
 
LVL 60

Expert Comment

by:Kevin Cross
Another excellent article, gr8gonzo!
0
 
LVL 20

Expert Comment

by:alainbryden
Absolutely brilliant, professional quality advice for everyone, whether they're a lowly programmer, or a data-warehouse manager. Well done!
0
 
LVL 58

Expert Comment

by:tigermatt
Great article, thanks. I do feel some info on the types of RAID array would enhance the section on hard disks - in particular, it is worth mentioning plenty of mirrors (RAID 1 or RAID 10) arrays are great -- and RAID 5 can be a performance hit in most configurations.

I don't know how MySQL handles transaction logs, so please correct me if I am wrong or it doesn't use them: for most database applications, you should store the 2 separate components - transaction logs and the actual database files - on separate arrays (not partitions, arrays). This improves performance and also helps with disaster recovery.

-Matt
0
What is SQL Server and how does it work?

The purpose of this paper is to provide you background on SQL Server. It’s your self-study guide for learning fundamentals. It includes both the history of SQL and its technical basics. Concepts and definitions will form the solid foundation of your future DBA expertise.

 
LVL 35

Author Comment

by:gr8gonzo
Hi Matt,

Regarding RAID: RAID 1 actually should perform worse than RAID 5. RAID 1 is mirroring, so every byte is written twice (which means it's safer). RAID 5 in its most minimal configuration is slightly faster than RAID 1, but gets faster as you add more drives. RAID configurations are a whole 'nother article, but if given the choice, I always use RAID 10 - it has the performance of RAID 0, while having the safety of RAID 1.

MySQL has different storage engines. The MyISAM engine, for example, does not require/use transaction logs (unless you count the optional binary log). The InnoDB engine, on the other hand, uses a set of files for transaction logs, and is much more like MS SQL Server (judging by your wording, I'm guessing that's what you use). Ultimately, separating transaction logs and database files is the same concept of spreading the load across disks. You can separate them on different RAID arrays for even better performance, but that was a bit beyond the scope of the article. Good points, though. :)
0
 
LVL 58

Expert Comment

by:tigermatt

Agree. I always use RAID 10. My experience is indeed in SQL Server and Exchange, so that's where my transaction log thinking came from.

-Matt
0
 
LVL 54

Expert Comment

by:b0lsc0tt
Great article and I loved the details.  The approach and flow was something that made this fun to read and easy to review.  The examples are well thought and the sections, in my opinion, cover the common issues.  Both are done in a way so even the non-DBA can learn from this and apply what you teach.  Thanks for the time and work on this contribution!

bol
0
 
LVL 15

Expert Comment

by:angus_young_acdc
Great article.  Very informative (and now bookmarked by myself) and it has given me a few handy hints for my own DB work.
0
 

Expert Comment

by:intlgd
Thanks a lot! Very nice.
0
 
LVL 70

Expert Comment

by:Jason C. Levine
Excellent.  I've already fixed a bunch of slow queries based on this.
0
 
LVL 49

Expert Comment

by:DanRollins
Excellent article!  Thanks.
0
 
LVL 2

Expert Comment

by:b10w01f
Thanks for input,
Very nice article. ^_^
0
 
LVL 4

Expert Comment

by:Odengr
The MySQL Tuner link is broken (he change his site to new one).

New URLfor now: http://blog.mysqltuner.com/MySQLTuner


Great Article! :)
0
 

Expert Comment

by:driver_x
awesome article very helpful. I have posted a question at http://www.experts-exchange.com/Database/MySQL/Q_24673292.html i think its one that you would be able to answer very easily and would enjoy your input
0
 
LVL 7

Expert Comment

by:marklogan
You may also want to consider covering

USE INDEX, IGNORE INDEX, and FORCE INDEX

http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
0
 
LVL 35

Author Comment

by:gr8gonzo
Hi markogan,
Thanks for reading and for the comment. One reason that I did not cover those is because 99% of the time, MySQL is smart enough to pick the right indexes. In my opinion, they're really only useful in very specific scenarios that are not common to most users and their databases. If you happen to have a huge table with many indexes that partially overlap each other, they could come in handy, but again, it's a pretty narrow scenario for most people.

Information about advanced techniques like that should probably come from things like the glorious MySQL performance blog (www.mysqlperformanceblog.com) - they would probably do a far better job at covering it than an extra paragraph in this article, too. :)
0
 

Expert Comment

by:taskmgr
gr8gonzo  do you offer paid mysql service , we need urgent help.

Thanks
0
 

Administrative Comment

by:ModernMatt
http:#c8576 edited to remove contact information.

ModernMatt
Experts Exchange Moderator
0
 
LVL 6

Expert Comment

by:ioannisa
Congratz on this article.
Its very good and useful!
0
 

Expert Comment

by:cnuguri
Thanks gr8gonzo !!
Very useful article.
0
 
LVL 3

Expert Comment

by:dnadavis
Very useful article for me...Thanks alot
0
 
LVL 32

Expert Comment

by:Daniel Wilson
Awesome!  I'm going to use EXPLAIN a lot, I think!
0
 
LVL 7

Expert Comment

by:haijerome
Hi gr8gonzo,

        This article

                  explains well about Explain,
                   
                   gives short and sweet text on FULL TEXT

                   And nothing to wonder, it stands as the top row if it is ordered by top 10 articles on MySQL server ...

                    And  I indexed it on my Knowledge base

           
       Hope you come out with much more articles regarding MySQL server on various dimensions with much more insights which will be very helpful to novice like myself .....


Thanks and Regards,
 Jerome Dennis D

             

 
0
 

Expert Comment

by:xNejX
fantastic!
0
 
LVL 10

Expert Comment

by:Derokorian
Try to make sure that your test database has similar data to your real database so you get a better picture of how your queries will perform in the real world.

I couldn't agree with this statement more. Not just for load judging, and performance measurements, but also for final lay-out and design judgement as well. Personally my development database almost always mirrors the exact database of the live server (sometimes its a few days behind, or ahead if I'm adding new structure). It also helps when developing a new table / section for a website because then instead of adding all those rows to the live after only having a few in the development, you know have exactly what the live should look like and can just export the table from the dev db and import it to the live db.
0
 
LVL 21

Expert Comment

by:Jason Yousef, MS
Great article, and thanks for voting for my article :)
0
 
LVL 1

Expert Comment

by:DigitalNam
Wow thank you. This was very informative.
0
 
LVL 36

Expert Comment

by:Loganathan Natarajan
Thanks a lot.It helped me.
0
 
LVL 35

Author Comment

by:gr8gonzo
Just a couple of additional thoughts - Sphinx isn't the only full-text search engine that I should have mentioned. There are several others, such as Solr.
0
 
LVL 2

Expert Comment

by:ldvhai
So good for me to read this article.

Thanks a lot.
0
 
LVL 2

Expert Comment

by:Jason Parms
Thank you for the sharing. It helps me a lot to get more information about MySQL for speed up performance.
0
 

Expert Comment

by:Member_2_7965240
Thank you for the great article.

If I have only 1 row I want to find there is no point using limit, am I right?


SELECT story FROM stories
 Where storyID =510
 ;
is just the same as:
SELECT story FROM stories
 Where storyID =510
 LIMIT 1;

What do you think?

Thank you
0

Featured Post

Announcing the Most Valuable Experts of 2016

MVEs are more concerned with the satisfaction of those they help than with the considerable points they can earn. They are the types of people you feel privileged to call colleagues. Join us in honoring this amazing group of Experts.

Join & Write a Comment

If you're a developer or IT admin, you’re probably tasked with managing multiple websites, servers, applications, and levels of security on a daily basis. While this can be extremely time consuming, it can also be frustrating when systems aren't wor…
Monitoring a network: how to monitor network services and why? Michael Kulchisky, MCSE, MCSA, MCP, VTSP, VSP, CCSP outlines the philosophy behind service monitoring and why a handshake validation is critical in network monitoring. Software utilized …

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month