Fast search of 13 char indexed text field for 1 Million+ record table in Mysql InnoDB table

I have a table with 1 Million+ records that I need to search with a 13 char string.

Right now, the search is prohibitively long (3.5 seconds to retrieve a record) with about 850k records.

It's an InnoDB Database and the search field is always 13 chars long and a standard (not Primary) index with no special settings.

This is on a Linux Server.
Who is Participating?
gr8gonzoConnect With a Mentor ConsultantCommented:
1. Follow my article, especially the part on tuning the configuration. RAM is good for caching data and query caching, but if you're getting hit with different IDs to search each time, then your query cache isn't going to help you as much.

2. A numerical index definitely has the POTENTIAL to be faster, but it's not necessarily true. Depending on your data, the split-column approach might definitely assist MySQL, though. If you can, just dupe the database, create two columns and split the data into the two columns so you have records like:

OLD: "1234567890123" (1 varchar column)
NEW: 1234 and 567890123 (2 int columns)

A simple query with SUBSTRING() and CAST() should populate the columns quickly so you can run your tests, but make sure you do it on a copy of the database, not on your original.

Then create a multi-column index on the two columns and run your SELECT (and make sure you EXPLAIN it so you can make sure your new index is used).

You don't need PHP to understand the full 13 characters as a big number - just substr() it to get the different parts:

$old = "1234567890123";
$part1 = substr($old,0,4);
$part2 = substr($old,4);
$query = "SELECT fields FROM table WHERE bookID1 = {$part1} AND bookID2 = {$part2}";
So you have one specific field that is always 13 characters long, and you're always searching this field with a 13 character search string?

With that kind of setup, a million records shouldn't take more than a couple seconds to search, so I wonder:

1. Are you pulling back more data than you need? If your SELECT results in lots of records containing hundreds of kilobytes of data, part of the "slowness" might be data transfer overhead.

2. Is the field definition longer than 13 characters and if so, did you specify the index length? For that matter, have you tried different index lengths?

3. Is the database sitting on a slow or busy hard drive? I/O tends to be the bottleneck most of the time, and if the hard drive controller is busy serving data to something else, it's going to take time to even get to the search request.

4. If the data in the search column changes infrequently (e.g. daily), you might consider a separate, full-text search engine like Sphinx or Solr, which can get you very specific results back in a 1/10th of a second (the caveat is that they have big indexes and the indexing is a slow operation that tends to be something you do daily). This seems like overkill for your situation.

5. If nothing else helps (also see my article below), a potentially radical move would be to split your 13-character string up into two columns (e.g. 3 character and 9 character columns), with a multi-column index. Depending on what your data looks like, this might help MySQL look up and organize data faster.

Check out my article on this topic:
Ray PaseurCommented:
Please post the CREATE TABLE statement and the SQL statement, thanks.
Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

genepilot2Author Commented:
To answer gr8gonzo's questions:
1. The data is about 1.4 GB in size in the database, but there is 4GB of RAM available for the Database.
2. The field is a varchar(13) and the data is always 13 digits long, all numeric.
3. This index is used heavily so the index will always be in memory.
4. This field doesn't change for existing records, but new records are added daily (about 10-20k per day as visitors ask for new books that aren't already in the system) during the busy period.
5. Splitting the index is one possibility, is a numerical index much faster than a string index?  I know that a 13 digit number is too big for a 32-bit server to use with PHP even if mysql can handle that size with it's largest int.

Ray PaseurCommented:
We can create test cases if you post the CREATE TABLE statement and the SQL statement that is slow.  With a little test data we might be able to create some simulations, right? It will be easier to help if we have that small advantage.
Also, don't forget to run maintenance on the table. And if you can afford to do so, consider a dedicated SSD drive for the database. A couple hundred bucks nowadays can buy you a couple of moderately-sized SSDs that you can put in RAID 1 (mirroring for safety) and get you a huge boost in DB performance. If you're curious about how much difference I/O can make, you can always create a temporary RAMdisk and copy your database to it, then run queries against it (but that's just for proof of concept - never put production data onto a RAMdisk since any reboot or power loss means that RAMdisk is gone - it's just good for testing before you spend any money). A RAMdisk will be faster than an SSD, but you should still get an idea of the possibilities.

Finally, when you test, don't test from within your application (just in case code is slowing anything down). Open up phpMyAdmin or HeidiSQL or the shell or whatever you use and run the query directly to make sure you're getting accurate performance readings.
And Ray is right - test data will help us help you, too, if you can share that!
I just created a test database with a varchar(13) field called "serial":  
CREATE TABLE `books` (
	`serial` VARCHAR(13) NOT NULL DEFAULT '0',
	INDEX `serial` (`serial`)

Open in new window

Then I inserted about 2 million rows into it (about 200 megs of data). I pulled a random serial number from around the middle of the database (about 500,000 rows in), and performed a search on it:
SELECT * FROM books WHERE serial = '4397388423728'

Open in new window

As expected, the record came back lightning-fast - 0.062 seconds.

The only way I could get MySQL to run a really long query was to use LIKE with wildcards, like this:

SELECT * FROM books WHERE serial LIKE '%4397388423728%'

Open in new window

That resulted in the same record but in about 9 seconds. That said, I'm hoping that is not what you're doing, since you indicated that your search field is 13 chars long and the search string is 13 chars long, so you should only ever be doing a complete equals operator.
Ray PaseurCommented:
No CREATE TABLE statement and no test data...?  To make your question and answer dialogs more effective you might want to learn about the SSCCE.  It's the process and tool professionals use when they want to get help.  You'll be amazed how much faster you get good results when you do that!

Best of luck with your project, ~Ray
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.