combine misspelled user input fields in a DISTINCT query

Posted on 2009-04-27
Last Modified: 2013-12-12

the query: "SELECT DISTINCT `City` FROM `residential` ORDER BY `City`  ASC";

the above query returns distinct values as is should but the city field in the database is user defined and sometimes misspelled witch causes a second, third, and froth value for that city.
then when trying to use that misspelled city to get the ares of town form the database only the area of town from that misspelled entry are displayed.

query for area of town: "SELECT DISTINCT `Area` FROM `residential` WHERE `City` LIKE \"%$city%\" ORDER BY `Area`  ASC";

Is there any way to combine the distinct results that are similar? Or am i going about this the wrong way?
Question by:Cybervanes
    LVL 65

    Assisted Solution

    fuzzy matching may help you here, possibly use of soundex, bear in mind, it can be very slow.
    Further info on that here
    LVL 2

    Author Comment

    example of results:
    citys= Any City |Alamosa|Allison|Antonito|Arboles|Austin|Aztec|Battlement Mesa|Bayfield|Bedrock|Blanca|Cahone|Capulin|Cedaredge|Center|Chama|Chromo|Cimarron|Clifton|Collbran|Colona|Conejos|Cortez|Cortez,|Cortz|CORY|Crawford|Creede|Crested Butte South|Crestone|DE BEQUE|DeBeque|DEL|Del Norte|Delta|Dol|Dolores|Douglas Pass Area|Dove Creek|Duango|Duramgo|Durango|Durango,|Eagle|ECKERT|Farmington|Forbes Park|Fort Garland|Fruita|FRUTIA|FT GARLAND|Ft. Garland|Gand Junction|Garcia|GARND JUNCTION|Gateway|Glade Park|Grand Jct|Grand Jucntion|Grand Junciton|Grand Junctio|Grand Junction|Grand Junction,|Grand Junction, CO|Grand Junctionb|Grand Juntion|Gunnison|Hersperus|Hesperus|Hooper|Hotchkiss|Ignacio|Ignaico|Jasper|La Garita|La Jara|LaGarita|Lake City|Lake Powell|Lewis|Loma|MACK|Manassa|Mancos|Marble|Maybell|Meeker|MESA|Middle Mesa|Moffat|MOLINA|Monte Vista|Monticello|Montose|Montroe|Montrose|Montrose (Colona)|Montrose,|Montrpse|Mosca|Mountain Village|Mt Crested Butte|Naturita|New Castle|Norwood|NUCLA|Olathe|Our|OURAY|Ouray County|Oxford|Pagosa|PAGOSA SPRING|Pagosa Springs|Palisade|Palisade, CO|Palisdae|Paonia|PARACHUTE|PARADOX|Placerville|Pleasant View|PleasantView|Powderhorn|Rangely|redvale|Rico|Ridgway|Ridway|RIFLE|Saguache|San Acacio|San Luis|Sanford|Silt|Silverthorne|Silverton|Somerset|South Fork|Taos|TBD|Telluride|Tierra Amarilla|Villa Grove|Weson|Whitewater|Yellow Jacket|
    LVL 2

    Accepted Solution

    although rockiroads: solution's probably would have worked i decided to fix the spelling when the database is populated.

     i have attached my code.
    // data9 is the city field
    $data[9] = strtolower($data[9]);
    $data[9] = ucwords($data[9]);
    // input misspelled word
    $input = $data[9];
    // array of words to check against
    $words = array("Grand Junction","Fruita","Alamosa","Allison","Antonito","Antonito","Austin","Aztec","Arboles","Battlement Mesa","Bayfield","Bedrock","Blanca","Cahone","Capulin","Cedaredge","Center","Chama","Chromo","Cimarron","Clifton","Collbran","Colona","Conejos","Cortez","Crawford","Crested Butte","Crestone","Creede","DeBeque","Del Norte","Delta","Dolores","Douglas Pass Area","Dove Creek","Durango","Eagle","Eckert","Farmington","Forbes Park","Fort Garland","Garcia","Gateway","Glade Park","Gunnison","Hesperus","Hesperus","Hotchkiss","Hooper","Ignacio","Jasper","LaGarita","La Jara","Lake City","Lake Powell","Lewis","Loma","Mack","Manassa","Mancos","Marble","Maybell","Meeker","Mesa","Middle Mesa","Moffat","Molina","Monte Vista","Monticello","Montrose","Montrose (Colona)","Mosca","Mountain Village","Mt Crested Butte","Naturita","New Castle","Norwood","Nucla","Olathe","Ouray","Oxford","Pagosa Springs","Palisade","Paonia","Parachute","Paradox","Placerville","Pleasant View","Powderhorn","Rangely","Redvale","Rico","Ridgway","Rifle","Saguache","San Acacio","San Luis","Sanford","Silt","Silverthorne","Silverton","Somerset","South Fork","Taos","Telluride","Tierra Amarilla","Villa Grove","Weson","Whitewater","Yellow Jacket");
    // no shortest distance found, yet
    $shortest = -1;
    // loop through words to find the closest
    foreach ($words as $word) {
        // calculate the distance between the input word,
        // and the current word
        $lev = levenshtein($input, $word);
        // check for an exact match
        if ($lev == 0) {
            // closest word is this one (exact match)
            $closest = $word;
            $shortest = 0;
            // break out of the loop; we've found an exact match
        // if this distance is less than the next found shortest
        // distance, OR if a next shortest word has not yet been found
        if ($lev <= $shortest || $shortest < 0) {
            // set the closest match, and shortest distance
            $closest  = $word;
            $shortest = $lev;
    //echo "Input word: $input  --  ";
    if ($shortest == 0) {
        //echo "Exact match found: $closest <br/>";
    } else {
        //echo " <font color=\"blue\" size=\"+1\">Did you mean: $closest? </font><br/>";
    	$data[9] = $closest;
    	$errors .= "changed MLS#" . $data[0] . " city from " . $input . " to " . $closest . "<br/>";

    Open in new window


    Featured Post

    What Security Threats Are You Missing?

    Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

    Join & Write a Comment

    Suggested Solutions

    Title # Comments Views Activity
    Write an app 9 38
    Strip HTML Entities and convert <br> Entity to \n 4 21
    How to represnt 52 selections 1 23
    query question 4 20
    Author Note: Since this E-E article was originally written, years ago, formal testing has come into common use in the world of PHP.  PHPUnit ( and similar technologies have enjoyed wide adoption, making it possib…
    Foreword (July, 2015) Since I first wrote this article, years ago, a great many more people have begun using the internet.  They are coming online from every part of the globe, learning, reading, shopping and spending money at an ever-increasing ra…
    Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
    The viewer will learn how to create and use a small PHP class to apply a watermark to an image. This video shows the viewer the setup for the PHP watermark as well as important coding language. Continue to Part 2 to learn the core code used in creat…

    754 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    20 Experts available now in Live!

    Get 1:1 Help Now