lucavilla

Similarity search through 10,000 Italian city names

I have a list of 10,000 Italian city names.
I would like to "fuzzy search" through them.
What's the most popular or the fastest algorithm for doing it?
Markus Fischer

Try the Levenshtein distance. See http://en.wikipedia.org/wiki/Levenshtein_distance
It's especially good for languages like Italian.
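
For illustration, here is a minimal Python sketch of the Levenshtein distance (the standard two-row dynamic programming version) and a brute-force lookup built on it. The names `levenshtein`, `closest` and the small `CITIES` list are made up for this example and stand in for the real 10,000-name list; case folding before comparison is also an assumption, not something the thread specifies.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: insertions, deletions and substitutions each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closest(query: str, names: list[str], limit: int = 3) -> list[tuple[int, str]]:
    """Return the `limit` names with the smallest distance to `query`."""
    q = query.casefold()
    scored = sorted((levenshtein(q, name.casefold()), name) for name in names)
    return scored[:limit]


# Hypothetical sample data; the real list would be loaded from a file.
CITIES = ["Milano", "Merano", "Milazzo", "Torino", "Firenze", "Napoli"]

if __name__ == "__main__":
    print(closest("Milan", CITIES))
```

Sorting every score is the simplest approach; it compares the query against all names on each search.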

(°v°)
lucavilla

ASKER

I need to keep every search within 2 seconds of CPU time. Wouldn't the Levenshtein algorithm be too slow to run against 10,000 words on every search?
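
One way to reason about the cost: assuming names of roughly ten characters, each comparison fills a table of about 10 x 10 cells, so 10,000 names come to something on the order of a million cell updates per query. The sketch below shows two common ways to cut that down when only near matches matter: skip names whose length differs by more than the allowed distance, and stop filling the table once a whole row exceeds it. The function names, the `max_dist` threshold of 2 and the `CITIES` sample are assumptions for illustration; no timing is claimed here, so it would need to be measured on the real list.

```python
def within_distance(a: str, b: str, max_dist: int) -> bool:
    """True if the Levenshtein distance between a and b is <= max_dist."""
    # The distance is at least the difference in length.
    if abs(len(a) - len(b)) > max_dist:
        return False
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,            # delete ca
                current[j - 1] + 1,         # insert cb
                previous[j - 1] + (ca != cb),  # substitute or match
            ))
        # Row minima never decrease, so the final value cannot recover.
        if min(current) > max_dist:
            return False
        previous = current
    return previous[-1] <= max_dist


def fuzzy_search(query: str, names: list[str], max_dist: int = 2) -> list[str]:
    """Return every name within max_dist edits of the query (case-insensitive)."""
    q = query.casefold()
    return [name for name in names if within_distance(q, name.casefold(), max_dist)]


# Hypothetical sample data standing in for the 10,000-name list.
CITIES = ["Milano", "Merano", "Milazzo", "Torino", "Firenze", "Napoli"]

if __name__ == "__main__":
    print(fuzzy_search("Firenz", CITIES))
```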
ASKER CERTIFIED SOLUTION
Markus Fischer

This solution is only available to members.
SOLUTION
This solution is only available to members.
SOLUTION
ozo

This solution is only available to members.
agrep satisfies the requirements, but this may also be of interest: http://www-db.deis.unibo.it/Mtree/
Forced accept.

Computer101
EE Admin