VB/VBA code for fuzzy search (to find similar text strings regardless of alternate spellings, etc.)

Posted on 2005-04-20
Last Modified: 2011-10-03
I'm looking for a snippet of VBA code that can be used in Access to perform a search on a set of strings such that:

If "applesauce road" is sought,
"aplesauce road", "applesc rd", etc.... will be found.  (And vice versa?)

In other words, a fuzzy search that will rank similar text strings by content and sequence of characters, to enable misspellings and alternate spellings to be considered as potential matches.

As far as I'm concerned, the code can be independent of Access so long as it is in some form of VB/BASIC and doesn't rely on functions to which I don't have to speak...  I'll migrate it so that it works for my application.

Any pointers?
Question by:psk1
    LVL 10

    Expert Comment

    Unless you can come up with some rules, you will need a set of dictionary words and their abbreviations/misspellings. If you entered Road it would find rd and rowed and rode. If you entered rode, it would find all the other words. This database will be hard to generate yourself so you may want to look at an existing dictionary that you can purchase. If the text is in a limited field (eg. only street addresses) then the list may not be too long.

    If you could come up with some rules (eg. try with and without double L, replace OA with OW etc) then you would just perform a replace on the entered text using the rules and then test it in your search. You may even come up with rules for cutting the words down and then matching them. But you must be careful that you don't abbreviate Redrod to rd at the same time you're abbreviating Road to rd.

    Does any of this help?
    LVL 16

    Assisted Solution

    Soundex may be what you're looking for.   Try for a complete treatise on it.
    LVL 2

    Author Comment

    Slamhound, I understand your suggestion but it's not what I want to do...

    Kmslogic, that Soundex page has code on it for VB as well as other languages.  That's great.  I'll certainly use that.

    But more than searching for "sounds like", I want to identify "looks like" ... 'automatic' might closely match 'automated' and 'automotive' to varying degrees.  Soundex helps with a lot but not variations on words like this.  So I'm looking for code that will run through each letter in the searcher and the searchee and gauge its likelihood to be a similar word (or set of words) based on the count of correct letters and their closeness in order and position to the first word.
    LVL 16

    Expert Comment

    by:Chuck Wood
    Are you wanting to search a data field or a text string?

    LVL 2

    Author Comment

    Multiple text strings.  I'll be pulling them from a table though, so I could search the data fields instead.  What might be the difference?  I can pull the data into the code either way, as a recordset.field or as an independent string or string array.
    LVL 16

    Accepted Solution

    It really doesn't matter. I just wanted to know so I could write some suggested example code. The following is a start that finds the longest word part of the search string in the target string. I am not sure exactly what you would want done from there.

        ' assume the text to search for is in strSearch
        ' and the text to search in is in strTarget
        Dim strSearch As String, strTarget As String
        strSearch = "applesauce road"
        strTarget = "applsc rd applesauce road"
        ' first break the search string into an array on spaces
        Dim astrSearch() As String, astrTemp() As String
        astrTemp = Split(strSearch, " ")
        ' create an array with three elements
        ReDim astrSearch(2, UBound(astrTemp, 1))
        ' copy the temp array values into the first element of the search array
        Dim intRow As Integer
        For intRow = 0 To UBound(astrTemp, 1)
            astrSearch(0, intRow) = astrTemp(intRow)
            ' set the other two elements to zero
            astrSearch(1, intRow) = 0
            astrSearch(2, intRow) = 0
        Next intRow
        ' minimize the temp array
        ReDim astrTemp(0)
        ' loop through the array
        Dim intWord As Integer, intLong As Integer
        Dim intCount As Integer, intStart As Integer
        For intWord = 0 To UBound(astrSearch, 2)
            ' search for successively shorter parts of the word
            For intLong = 1 To Len(astrSearch(0, intWord))
                ' count the occurances of the part of the work
                intCount = 0
                intStart = 1
                Do While InStr(intStart, strTarget, Left$(astrSearch(0, intWord), intLong)) > 0
                    ' count this occurance
                    intCount = intCount + 1
                    ' start again after this occurance
                    intStart = InStr(intStart, strTarget, Left$(astrSearch(0, intWord), intLong)) + 1
                ' if the length is larger than a previous length
                If intLong > astrSearch(2, intWord) Then
                    astrSearch(1, intWord) = intCount ' count
                    astrSearch(2, intWord) = intLong ' length of word part
                End If
            Next intLong
        Next intWord
        ' at the end, you have an array with:
        '    element 0    element 1    element 2
        '    word         count of     length of
        '                 occurances   word part
        Dim intCol As Integer, strPrint As String
        For intRow = 0 To UBound(astrSearch, 2)
            strPrint = ""
            For intCol = 0 To UBound(astrSearch, 1)
                strPrint = strPrint & astrSearch(intCol, intRow) & " - "
            Next intCol
            Debug.Print strPrint
        Next intRow



    Featured Post

    How to run any project with ease

    Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
    - Combine task lists, docs, spreadsheets, and chat in one
    - View and edit from mobile/offline
    - Cut down on emails

    Join & Write a Comment

    This article is a continuation or rather an extension from Cascading Combos ( and builds on examples developed in detail there. It should be understandable alone, but I recommend reading the previous artic…
    Introduction The Visual Basic for Applications (VBA) language is at the heart of every application that you write. It is your key to taking Access beyond the world of wizards into a world where anything is possible. This article introduces you to…
    Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…
    In Microsoft Access, learn how to “cascade” or have the displayed data of one combo control depend upon what’s entered in another. Base the dependent combo on a query for its row source: Add a reference to the first combo on the form as criteria i…

    746 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    16 Experts available now in Live!

    Get 1:1 Help Now