Link to home
Start Free TrialLog in
Avatar of James Froggatt
James FroggattFlag for France

asked on

Speeding up an ajax autosuggest box with a 3.5million record database

I have written an autosuggest function in Javascript (plain javascript).

This autosuggest is a single box that searches a database table of about 3.5 million cities (including towns and villages) - the database actually also details region, and country but this isn't important.

Asking AJAX requests to search such a database takes many seconds per request and this is obviously too slow.

My immediate thoughts on how to speed this up as as follows:

Create separate database tables corresponding to each letter of the alphabet, i.e. 26 tables.

If the first letter the use types is 'a', consult the table with places beginning with 'a' from that point on.

In theory this would make the request 26 times faster?

My first question is:

Is the a sensible approach or am I missing something?

Furthermore, if that is still not fast enough,

take the table, for example all places beginning with 'a', and split that into a further 26 tables; i.e.

All places beginning with aa, ab, ac, ad........ az

My second question is:

Is this a sensible and logical approach? Is there a limit to the number of tables on can have in mysql (there would be 26x26 = 676 tables in this case)?


I am aware of indexing tables to speed them up, so the approach above would be 'on top of' indexing the tables.

I look forward to hearing from you.

thank you
James
ASKER CERTIFIED SOLUTION
Avatar of David Favor
David Favor
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of James Froggatt

ASKER

Hi David,

Here is the schema for the eurocity table:

User generated image

I look forward to hearing from you ... as I digest your reply above.

Thank you
Chris,

Great replies (love the time out so it's not constantly firing ajax!), when you say

Consider adding an index to your table of just the first letter for example, then include that in your query. It will immediately reduce the available records it's looking at

How do I add an index of just the first letter? I though indexes could only be added to entire columns, i.e. city_name_ascii above

Thanks
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
By the way, please let me share some queries with you;

To collate all the infomation for the tables I have I use this;

DROP TABLE IF EXISTS eurofullsearch;
 
create table eurofullsearch(city_name_ascii varchar(255), state_name varchar(20), country_name varchar(20), minnow varchar (255));
ALTER TABLE eurofullsearch CONVERT TO CHARACTER SET utf8 COLLATE utf8_persian_ci;

Open in new window



Note: minnow is a field which is a concatenation for the city, state, and country, as will be seen below

INSERT INTO eurofullsearch SELECT eurocity.city_name_ascii, eurostates.state_name, eurocountry.country_name, CONCAT(eurocity.city_name_ascii,' ,',eurostates.state_name,' ,',eurocountry.country_name) AS minnow FROM eurocity INNER JOIN eurostates ON eurocity.state_code=eurostates.state_code AND eurocity.country_code=eurostates.country_code INNER JOIN eurocountry ON eurocity.country_code=eurocountry.country_code 

Open in new window



An actual SQL search from the autosuggest function

SELECT minnow FROM eurofullsearch WHERE minnow LIKE 'Fa%' ORDER BY CHAR_LENGTH(minnow) LIMIT 10

Open in new window



... and the above SELECT statement is further speeded up by

ALTER TABLE eurofullsearch ADD INDEX(minnow);

Open in new window




Don't get me wrong, this is actually quite fast, but am just interested in getting it as fast as possible.

Thank you
Chris,

my suggestion was to add a new new column (firstLetter) containing just the first letter and index on that

That's a really interesting idea. I will certainly try that out. I'm guessing trying out 'where first two letters is AND' .... i.e. another column that is read when two letters have been typed.... Hmm, very interesting suggestion. That's rather like splitting the whole things into separate tables as I suggested but I'm guessing your idea would be more robust, and probably better.

Thank you
james
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
James,

You'd need to do some benchmarking test, but it should speed up because as soon as you search on an index like that (firstLetter), you're immediately reducing the number of records it has to search on from 3.5million to somewhere like a 26th of that (give or take!)
Chris,

If I create a new column, first_letter, and index that, should I ALSO index the column with the full name in it.


SELECT city_name FROM yourTable WHERE firstLetter = 'a' AND city_name LIKE 'Aber%' LIMIT 15;

i.e.

ALTER TABLE eurofullsearch ADD INDEX(firstLetter );
ALTER TABLE eurofullsearch ADD INDEX(city_name );

or is there only a point in indexing the 'firstLetter' column.

Another question... can indexing ever slow down a table?

I assume it makes sense to INDEX all the columns that are after the WHERE clause. Is that correct?

Thank you
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I am fairly new to EE, I have received truly excellent information from all of you. Is there any way of accepting many answers as solutions. Normal coding questions generally have 'a' solution, but here there's many things to test and try out. How do I accept several solutions?

Thank you
James
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hey James,

Regarding accepting solutions - at the bottom of each comment, there are a few buttons (This is the solutions / This is helpful etc). You can click on those for as many comments as you see fit. If more than one is a solution - then by all means click the This is the solution button on more than one.

Been a while since I've needed to accept a solution, but I think that's how you do it currently :)

Let me know if not.
ste5an,

And for your question about "splitting" your table: this what the database does, when you have an index on the city name column. Thus no need to do it yourself. You cannot do it better then the normal database implementation.

.. with that line, you've saved me a lot of bother!

For a full search I would start the auto-complete when three letters are entered.

.. the village where my 6 year old daugher goes here in France is called "Fa" :)

Thank you
Thank you all,

this has been a most useful discussion for me and I'm not going to implement the coding ideas and see how I go. I will report back here when I have a satisfactory setup.

Right now Davids first post is largely out of reach as I have limited capability to alter server specific settings so will focus largely on indexing and seeing if using the ideas that you've shared with me I can get the search down to accecptable speeds.

Thank you
^ bear in mind that you're never actually searching on the city_name column, so an index on that column won't be particularly useful (based on the queries you've shown)
Good luck with it James