We help IT Professionals succeed at work.

Speeding up an ajax autosuggest box with a 3.5million record database

High Priority
92 Views
Last Modified: 2020-02-03
I have written an autosuggest function in Javascript (plain javascript).

This autosuggest is a single box that searches a database table of about 3.5 million cities (including towns and villages) - the database actually also details region, and country but this isn't important.

Asking AJAX requests to search such a database takes many seconds per request and this is obviously too slow.

My immediate thoughts on how to speed this up as as follows:

Create separate database tables corresponding to each letter of the alphabet, i.e. 26 tables.

If the first letter the use types is 'a', consult the table with places beginning with 'a' from that point on.

In theory this would make the request 26 times faster?

My first question is:

Is the a sensible approach or am I missing something?

Furthermore, if that is still not fast enough,

take the table, for example all places beginning with 'a', and split that into a further 26 tables; i.e.

All places beginning with aa, ab, ac, ad........ az

My second question is:

Is this a sensible and logical approach? Is there a limit to the number of tables on can have in mysql (there would be 26x26 = 676 tables in this case)?


I am aware of indexing tables to speed them up, so the approach above would be 'on top of' indexing the tables.

I look forward to hearing from you.

thank you
James
Comment
Watch Question

Fractional CTO
CERTIFIED EXPERT
Distinguished Expert 2019
Commented:
1) Speed up...

With this amount of data, ways to help will be to...

a) Ensure your server is running bare metal - no cloud tech which uses slow VMs.

b) Research FTS (MariaDB + Full Text Search) indexes, which provide indexing of complex + freeform text.

c) If this process produces huge cashflow, then hire someone to tune your entire LAMP Stack, to run at optimal speed.

2) Is the a sensible approach or am I missing something?

Auto Suggestion certainly works + this is where testing comes in.

So this can be done. Might just take a large budget + long time line, for your project development.

3) Is this a sensible and logical approach? Is there a limit to the number of tables on can have in mysql (there would be 26x26 = 676 tables in this case)?

Guessing what you mean by this, if you try splitting data across many tables, depending how you do this, you'll likely have many performance problems.

Better to collect your data into one table, with FTS indexes, as this type of application is exactly what FTS is built to address.

Note: Best answers for #3 require you publish your entire database schema.
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019
Commented:
A few pointers that may help you get better performance.

Make sure your DB server is optimized and running on a decent platform

Don't use a leading wildcard - %abc% - that won't allow you to use indexes

Make sure your lookup (server-side) code is optimised

Add LIMIT to your queries to max out and end the SQL search

Consider adding an index to your table of just the first letter for example, then include that in your query. It will immediately reduce the available records it's looking at

Debounce your AJAX lookup - this will make sure your lookup is only done after users finish typing rather than for every character typed. You get to set the timeout, so for example, set to 1 sec and as long as users are typing in that period, your loopkup won't fire. That will reduce the number of calls to your server.

Author

Commented:
Hi David,

Here is the schema for the eurocity table:

eurocity tbale structure

I look forward to hearing from you ... as I digest your reply above.

Thank you

Author

Commented:
Chris,

Great replies (love the time out so it's not constantly firing ajax!), when you say

Consider adding an index to your table of just the first letter for example, then include that in your query. It will immediately reduce the available records it's looking at

How do I add an index of just the first letter? I though indexes could only be added to entire columns, i.e. city_name_ascii above

Thanks
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019
Commented:
Hey James,

You're right that indexes can be added to columns - my suggestion was to add a new new column (firstLetter) containing just the first letter and index on that, so your query becomes something like:

SELECT city_name FROM yourTable WHERE firstLetter = 'a' AND city_name LIKE 'Aber%' LIMIT 15;

Author

Commented:
By the way, please let me share some queries with you;

To collate all the infomation for the tables I have I use this;

DROP TABLE IF EXISTS eurofullsearch;
 
create table eurofullsearch(city_name_ascii varchar(255), state_name varchar(20), country_name varchar(20), minnow varchar (255));
ALTER TABLE eurofullsearch CONVERT TO CHARACTER SET utf8 COLLATE utf8_persian_ci;

Open in new window



Note: minnow is a field which is a concatenation for the city, state, and country, as will be seen below

INSERT INTO eurofullsearch SELECT eurocity.city_name_ascii, eurostates.state_name, eurocountry.country_name, CONCAT(eurocity.city_name_ascii,' ,',eurostates.state_name,' ,',eurocountry.country_name) AS minnow FROM eurocity INNER JOIN eurostates ON eurocity.state_code=eurostates.state_code AND eurocity.country_code=eurostates.country_code INNER JOIN eurocountry ON eurocity.country_code=eurocountry.country_code 

Open in new window



An actual SQL search from the autosuggest function

SELECT minnow FROM eurofullsearch WHERE minnow LIKE 'Fa%' ORDER BY CHAR_LENGTH(minnow) LIMIT 10

Open in new window



... and the above SELECT statement is further speeded up by

ALTER TABLE eurofullsearch ADD INDEX(minnow);

Open in new window




Don't get me wrong, this is actually quite fast, but am just interested in getting it as fast as possible.

Thank you

Author

Commented:
Chris,

my suggestion was to add a new new column (firstLetter) containing just the first letter and index on that

That's a really interesting idea. I will certainly try that out. I'm guessing trying out 'where first two letters is AND' .... i.e. another column that is read when two letters have been typed.... Hmm, very interesting suggestion. That's rather like splitting the whole things into separate tables as I suggested but I'm guessing your idea would be more robust, and probably better.

Thank you
james
John TsioumprisSoftware & Systems Engineer
CERTIFIED EXPERT
Distinguished Expert 2019
Commented:
Probably most cities/villages...etc are more than 2-3 letters long..so initiate the search after 2-3 letters are typed
Maybe have a small aux table that holds the most common search results and present them for the initial 2-3 letters typing
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
James,

You'd need to do some benchmarking test, but it should speed up because as soon as you search on an index like that (firstLetter), you're immediately reducing the number of records it has to search on from 3.5million to somewhere like a 26th of that (give or take!)

Author

Commented:
Chris,

If I create a new column, first_letter, and index that, should I ALSO index the column with the full name in it.


SELECT city_name FROM yourTable WHERE firstLetter = 'a' AND city_name LIKE 'Aber%' LIMIT 15;

i.e.

ALTER TABLE eurofullsearch ADD INDEX(firstLetter );
ALTER TABLE eurofullsearch ADD INDEX(city_name );

or is there only a point in indexing the 'firstLetter' column.

Another question... can indexing ever slow down a table?

I assume it makes sense to INDEX all the columns that are after the WHERE clause. Is that correct?

Thank you
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019
Commented:
Hey James,

You only really need to add an index to columns that are used in a WHERE, ORDER_BY or JOIN. Looking at your example queries, there seems to be no benfit to adding it to the city_name column as your not searching on it.

An INDEX can sometimes slow down a table, but as far as I know, only on inserting or updating data - for example if you add a UNIQUE index to a column, it'll need to check that index when you add new records. Never known an index slow down tables for lookups.

Author

Commented:
I am fairly new to EE, I have received truly excellent information from all of you. Is there any way of accepting many answers as solutions. Normal coding questions generally have 'a' solution, but here there's many things to test and try out. How do I accept several solutions?

Thank you
James
ste5anSenior Developer
CERTIFIED EXPERT
Commented:
3.5 million rows in the database should not be the problem. It's more about how auto-complete should work:

For a single letter I would only search the table of the most recent / most used cities. This list can be, when small enough, loaded statically via JS or by a parallel AJAX request during the page load.
For a full search I would start the auto-complete when three letters are entered.
The search itself must be with wildcards only to the right side for optimal index usage.

So in short: you need to "speed up" the search by narrowing it down in the first place.

And for your question about "splitting" your table: this what the database does, when you have an index on the city name column. Thus no need to do it yourself. You cannot do it better then the normal database implementation.
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
Hey James,

Regarding accepting solutions - at the bottom of each comment, there are a few buttons (This is the solutions / This is helpful etc). You can click on those for as many comments as you see fit. If more than one is a solution - then by all means click the This is the solution button on more than one.

Been a while since I've needed to accept a solution, but I think that's how you do it currently :)

Let me know if not.

Author

Commented:
ste5an,

And for your question about "splitting" your table: this what the database does, when you have an index on the city name column. Thus no need to do it yourself. You cannot do it better then the normal database implementation.

.. with that line, you've saved me a lot of bother!

For a full search I would start the auto-complete when three letters are entered.

.. the village where my 6 year old daugher goes here in France is called "Fa" :)

Thank you

Author

Commented:
Thank you all,

this has been a most useful discussion for me and I'm not going to implement the coding ideas and see how I go. I will report back here when I have a satisfactory setup.

Right now Davids first post is largely out of reach as I have limited capability to alter server specific settings so will focus largely on indexing and seeing if using the ideas that you've shared with me I can get the search down to accecptable speeds.

Thank you
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
^ bear in mind that you're never actually searching on the city_name column, so an index on that column won't be particularly useful (based on the queries you've shown)
CERTIFIED EXPERT
Most Valuable Expert 2018
Distinguished Expert 2019

Commented:
Good luck with it James