Solved

How to design and implement a search algorithm

Posted on 2006-11-30
Last Modified: 2013-12-04
I have a postgres database with one table storing:
street number
pre_dir (such as north, south, etc)
a field for a street name
post_dir
suffix (pulled from a list of options, such as beach, blvd, street, etc.)
unit #
city
state
zip
borrower first name
and last name

I would like to write something that would catch possible dupes, and I was thinking of using Levenshtein distance. Does anybody have any suggestions on what else can be done?
Question by:tansofun

Accepted Solution

by: ozo (earned 500 total points)
ID: 18050540
You might want to start by normalizing the address, making sure any abbreviations are consistent.
Then you might check the address against the post office database:
http://www.usps.com/ncsc/addressservices/addressqualityservices/addresscorrection.htm
(I'm guessing, based on the state and zip fields, that these are USA addresses.)
Do you want to catch typographical errors and misspellings?
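
As a rough illustration of the normalization step, here is a minimal Python sketch. The column names (`street_number`, `pre_dir`, and so on) and the abbreviation tables are assumptions made up for the example, not taken from the actual schema; the real suffix list would replace `SUFFIXES`.

```python
import re

# Hypothetical abbreviation tables; replace with your own suffix/direction lists.
SUFFIXES = {"street": "st", "str": "st", "boulevard": "blvd",
            "avenue": "ave", "av": "ave", "beach": "bch"}
DIRECTIONS = {"north": "n", "south": "s", "east": "e", "west": "w"}


def normalize_token(value, table=None):
    """Lowercase, drop punctuation and extra spaces, then map known variants."""
    text = "" if value is None else str(value)
    text = re.sub(r"[^a-z0-9 ]", "", text.lower())
    text = " ".join(text.split())
    return (table or {}).get(text, text)


def normalize_address(row):
    """Build one comparison key from a dict keyed by (assumed) column names."""
    parts = [
        normalize_token(row.get("street_number")),
        normalize_token(row.get("pre_dir"), DIRECTIONS),
        normalize_token(row.get("street_name")),
        normalize_token(row.get("post_dir"), DIRECTIONS),
        normalize_token(row.get("suffix"), SUFFIXES),
        normalize_token(row.get("unit")),
        normalize_token(row.get("city")),
        normalize_token(row.get("state")),
        normalize_token(row.get("zip")),
    ]
    return "|".join(parts)


a = {"street_number": 12, "pre_dir": "North", "street_name": "Main",
     "suffix": "Street", "city": "Austin", "state": "TX", "zip": "78701"}
b = {"street_number": "12", "pre_dir": "N.", "street_name": " MAIN ",
     "suffix": "St.", "city": "austin", "state": "tx", "zip": "78701"}
print(normalize_address(a) == normalize_address(b))
```

Rows whose keys match exactly after normalization are the cheap duplicates; only the remaining rows need a fuzzy comparison.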

Author Comment

by:tansofun
ID: 18050606
USPS is a great suggestion. It'll work most of the time, but we also deal with addresses that are new, and USPS may not have them yet, oddly enough.

Do you have any suggestions on typos? I don't really want to build any big n-gram tables and things of that nature.

Expert Comment

by:ozo
ID: 18050764
Levenshtein distance basically counts typos, so it could be useful.
A more sophisticated model may give higher probabilities to substitutions of adjacent keys,
or letters that sound alike.
If your last names are European, soundex may be useful for matching them.
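
To make the two ideas concrete, here is a small Python sketch of plain Levenshtein distance (every edit costs 1, with no keyboard-adjacency weighting) and of American Soundex. The helper `likely_duplicate` and its threshold of 2 are assumptions for illustration only; a sensible cutoff depends on the data.

```python
def levenshtein(a, b):
    """Number of single-character inserts, deletes, and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


# Soundex digit for each consonant; vowels, h, w, and y have no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """American Soundex: first letter plus three digits, zero padded."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def likely_duplicate(a, b, max_distance=2):
    """Hypothetical helper: flag two normalized strings that are close."""
    return levenshtein(a, b) <= max_distance


print(levenshtein("kitten", "sitting"))
print(soundex("Robert"), soundex("Rupert"))
```

If you would rather keep the comparison inside the database, PostgreSQL's contrib module fuzzystrmatch provides `levenshtein()` and `soundex()` functions, so a Python pass like this may not be needed at all.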