Regex Match Name with List Names-Excel VB

Posted on 2009-04-16
Last Modified: 2012-05-06

I'm just trying to solve how to use a regex pattern in vb to match a list of names in sheet 1 (Range A1:A5) with another list of names in sheet 2 (Range A1:A8) and if there is a matching name found from sheet 2 and enter this name to the corresponding matching name in sheet 1.

In sheet 1 I have:
Ford Hill
Jeremy Camp
Alyssa Cornell
Joe Smith
Andy Carmell

Then in sheet 2 I have:
Ford/Miranda Hill
Jeremy Kamp
Alyssa Cornel
Joe Smith
Andy Carmel
Anita Chow
John Goldstein
Rob Kim

The spelling is a bit different in sheet 2 by some characters or one.

I'm attaching an excel sheet. Any help is very appreciated.

Thank you,
Question by:Victor Kimura
    LVL 16

    Accepted Solution


    I don't think regex will help you much here.  What you want is either some kind of Soundex algorithm (which you could program into VBA) or even better is the following "fuzzy Vlookup" function which was written by a member of the forum.  Check out his "fuzzy vlookup" function here:

    LVL 92

    Assisted Solution

    by:Patrick Matthews

    Thanks so much for posting that link!  I actually tried a Soundex approach, but I was dissatisfied with the
    result so I didn't post it.  I'll have to play with the "fuzzy" functions :)


    LVL 16

    Expert Comment

    Hey Patrick

    You're welcome!  it is resource-hungry though :)


    Author Comment

    by:Victor Kimura

    How about if I had something more simpler such as a variance of one letter at the end of the first name and/or last name. For instance,
    Jack Lemmon - Jacks Lemmons
    Jack Lemmon - Jacks Lemmon
    Jack Lemmons - Jack Lemmons

    where left - right become a true.

    I think the fuzzy logic is too much and for the ones that don't match then I can simply view them and edit them myself since the lists will be fairly short. Most names are not significantly different

    I have a dual purpose to view the code: one so I can learn vb and the other is to use it for the office. =)

    Thank you for the reference to the fuzzy logic.

    Could you provide me with some simple code as in the above?

    LVL 92

    Expert Comment

    by:Patrick Matthews

    The attached workbook uses the fuzzy logic search that Richard linked to, plus my own Soundex-based
    approaches (which on reflection did better than I thought :)

    For each list, I tried to find matches using the following:

    1) Plain vanilla VLOOKUP
    2) Soundex based on whole string
    3) "Soundex 2", based on concatenated Soundex values of first word and last word (uses Regular Expressions
    to find the words)
    4) Three different "fuzzy VLOOKUP" searches, using algorithms 1, 2, and 3 for the FuzzyVLookup function in
    that post

    Each worksheet shows the results of the search; I shaded red the cells where the searches either failed to find
    a match that was there, or returned a match that was false.

    The code snippet contains my code; the workbook includes that code plus the code from Richard's link.


    Option Explicit 
    Function Soundex(InputStr As String) As String 
    ' Found at
    ' Based on function developed by Richard J. Yanco
    ' This function follows the Soundex rules given at
        Dim Result As String, c As String * 1
        Dim Location As Long 
        InputStr = UCase(InputStr) 
    '   First character must be a letter
        If Asc(Left(InputStr, 1)) < 65 Or Asc(Left(InputStr, 1)) > 90 Then
            Soundex = ""
            Exit Function
    '       Convert to Soundex: letters to their appropriate digit,
    '                     A,E,I,O,U,Y ("slash letters") to slashes
    '                     H,W, and everything else to zero-length string 
            Result = Left(InputStr, 1)
            For Location = 2 To Len(InputStr)
                Result = Result & Category(Mid(InputStr, Location, 1))
            Next Location
    '       Remove double letters
            Location = 2
            Do While Location < Len(Result)
                If Mid(Result, Location, 1) = Mid(Result, Location + 1, 1) Then
                    Result = Left(Result, Location) & Mid(Result, Location + 2)
                    Location = Location + 1
                End If
    '       If category of 1st letter equals 2nd character, remove 2nd character
            If Category(Left(Result, 1)) = Mid(Result, 2, 1) Then
                Result = Left(Result, 1) & Mid(Result, 3)
            End If
    '       Remove slashes
            For Location = 2 To Len(Result)
                If Mid(Result, Location, 1) = "/" Then
                    Result = Left(Result, Location - 1) & Mid(Result, Location + 1)
                End If
    '       Trim or pad with zeroes as necessary
            Soundex = Left(Result & "0000", 4)
        End If
    End Function 
    Private Function Category(c) As String
    '   Returns a Soundex code for a letter
        Select Case True
            Case c Like "[AEIOUY]"
                Category = "/"
            Case c Like "[BPFV]"
                Category = "1"
            Case c Like "[CSKGJQXZ]"
                Category = "2"
            Case c Like "[DT]"
                Category = "3"
            Case c = "L"
                Category = "4"
            Case c Like "[MN]"
                Category = "5"
            Case c = "R"
                Category = "6"
            Case Else 'This includes vowels, Y, H and W, spaces, punctuation, etc.
                Category = ""
        End Select
    End Function 
    Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
        Optional MatchCase As Boolean = True)
        ' Function by Patrick Matthews
        ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
        ' pattern (PatternStr).  Use Pos to indicate which match you want:
        ' Pos omitted               : function returns a zero-based array of all matches
        ' Pos = 0                   : the last match
        ' Pos = 1                   : the first match
        ' Pos = 2                   : the second match
        ' Pos = <positive integer>  : the Nth match
        ' If Pos is greater than the number of matches, is negative, or is non-numeric, the function
        ' returns an empty string.  If no match is found, the function returns an empty string
        ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
        ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
        ' If you use this function in Excel, you can use range references for any of the arguments.
        ' If you use this in Excel and return the full array, make sure to set up the formula as an
        ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
        Dim RegX As Object
        Dim TheMatches As Object
        Dim Answer() As String
        Dim Counter As Long
        ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
        If Not IsMissing(Pos) Then
            If Not IsNumeric(Pos) Then
                RegExpFind = ""
                Exit Function
                Pos = CLng(Pos)
            End If
        End If
        ' Create instance of RegExp object
        Set RegX = CreateObject("VBScript.RegExp")
        With RegX
            .Pattern = PatternStr
            .Global = True
            .IgnoreCase = Not MatchCase
        End With
        ' Test to see if there are any matches
        If RegX.test(LookIn) Then
            ' Run RegExp to get the matches, which are returned as a zero-based collection
            Set TheMatches = RegX.Execute(LookIn)
            ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
            ' function's return value
            If IsMissing(Pos) Then
                ReDim Answer(0 To TheMatches.Count - 1) As String
                For Counter = 0 To UBound(Answer)
                    Answer(Counter) = TheMatches(Counter)
                RegExpFind = Answer
            ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
                Select Case Pos
                    Case 0                          ' Last match
                        RegExpFind = TheMatches(TheMatches.Count - 1)
                    Case 1 To TheMatches.Count      ' Nth match
                        RegExpFind = TheMatches(Pos - 1)
                    Case Else                       ' Invalid item number
                        RegExpFind = ""
                End Select
            End If
        ' If there are no matches, return empty string
            RegExpFind = ""
        End If
        ' Release object variables
        Set RegX = Nothing
        Set TheMatches = Nothing
    End Function

    Open in new window

    LVL 92

    Expert Comment

    by:Patrick Matthews

    No idea if I am using that FuzzyVLookup function optimally :)



    Author Closing Comment

    by:Victor Kimura
    Hi RichardSchollar/matthewspatrick:

    Thanks for you help. I haven't fully tested it but I will shortly. I've been just swamped at work. Thanks for the code, matthewspatrick.

    Featured Post

    6 Surprising Benefits of Threat Intelligence

    All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

    Join & Write a Comment

    Drop Down List with Unique/Distinct Values (Part II - ComboBox or ListBox and Data Validation List Bonus!) David Miller (dlmille) Intro This article focuses on delivering unique, sorted lists to list objects (e.g., ComboBox, ListBox) and Dat…
    Approximate matching with VLOOKUP and MATCH seems to me to be a greatly under-used technique, and one which is vital for getting good performance out of large lookups. Until recently I would always have advised using an exact match for simplicity an…
    The viewer will learn how to simulate a series of sales calls dependent on a single skill level and learn how to simulate a series of sales calls dependent on two skill levels. Simulating Independent Sales Calls: Enter .75 into cell C2 – “skill leve…
    This Micro Tutorial will demonstrate on a Mac how to change the sort order for chart legend values and decrpyt the intimidating chart menu.

    728 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    18 Experts available now in Live!

    Get 1:1 Help Now