Parse text field into multiple values using multiple delimiters

Dumb_Blonde
Dumb_Blonde used Ask the Experts™
on
I'm trying to split a long text field into multiple values using multiple delimiters. I found what I thought was the perfect solution here on EE posted by Alain Bryden (see below). I copied the code into a module in my DB. Using the function as written below was causing an Undefined function error. At the suggestion of an Expert, I made a few changes (in bold below) and was able to execute the function. The result returned is the value before the first delimiter. I was expecting it to return each delimited value in the string.

I'm just a business anlayst who's gotten stuck with trying to figure out how to do something using Access and VBA. I know I'm working with an array here and am wondering if I should be using something like GetRows. Unfortunately, I haven't the foggiest idea how to use it.

I'm at a loss here and rapidly running up against a deadline. Any help or suggestions you can offer would be much appreciated.

Function SplitMultiDelims(ByRef Text As String, ByRef DelimChars As String, _
        Optional ByVal IgnoreConsecutiveDelimiters As Boolean = False, _
        Optional ByVal Limit As Long = -1) As String()

Public Function SplitMultiDelims(ByRef Text As String, ByRef DelimChars As String, _
       Optional ByVal IgnoreConsecutiveDelimiters As Boolean = False, _
       Optional ByVal Limit As Long = -1) As String

    Dim ElemStart As Long, N As Long, M As Long, Elements As Long
    Dim lDelims As Long, lText As Long
    Dim Arr() As String
   
    lText = Len(Text)
    lDelims = Len(DelimChars)
    If lDelims = 0 Or lText = 0 Or Limit = 1 Then
        ReDim Arr(0 To 0)
        Arr(0) = Text
        SplitMultiDelims = Arr SplitMultiDelims = Arr(0)
        Exit Function
    End If
    ReDim Arr(0 To IIf(Limit = -1, lText - 1, Limit))
   
    Elements = 0: ElemStart = 1
    For N = 1 To lText
        If InStr(DelimChars, Mid(Text, N, 1)) Then
            Arr(Elements) = Mid(Text, ElemStart, N - ElemStart)
            If IgnoreConsecutiveDelimiters Then
                If Len(Arr(Elements)) > 0 Then Elements = Elements + 1
            Else
                Elements = Elements + 1
            End If
            ElemStart = N + 1
            If Elements + 1 = Limit Then Exit For
        End If
    Next N
    'Get the last token terminated by the end of the string into the array
    If ElemStart <= lText Then Arr(Elements) = Mid(Text, ElemStart)
    'Since the end of string counts as the terminating delimiter, if the last character
    'was also a delimiter, we treat the two as consecutive, and so ignore the last elemnent
    If IgnoreConsecutiveDelimiters Then If Len(Arr(Elements)) = 0 Then Elements = Elements - 1
   
    ReDim Preserve Arr(0 To Elements) 'Chop off unused array elements
    SplitMultiDelims = Arr SplitMultiDelims = Arr(0)
End Function
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®

Commented:
Perhaps you can post the code that you are using to CALL the above function and perhaps some test data and what you expect the function to return.
Top Expert 2010

Commented:
Dumb_Blonde,

Could you post a few rows of example data, along with the result you would expect based on that sample?

Patrick

Author

Commented:
I don't have any code that I'm calling this from. I used the function in a query expression. It's now obvious to me that it won't work using it in an expression in the query designer but I'm at a loss as to how to write the VBA code to do this. I'm guessing it requires some kind of a loop? In any event I have a column in the table called Narrative. The values in that table look like this:

"HO2, 2209/2210/2689/2690/2728/2729/2852/2947/3159/3160/3417/3431/3521/3522, Embedded, Cal Yr, CO/TO Yes"
I need the data parsed (using " ", "," and "/" as delimiters) into "HO2", "2209", "2210", "2689", "2690" etc. Ideally, each value in a seperate row but if I could get it parsed into columns in the same row I can work with it.

Thanks much.
Success in ‘20 With a Profitable Pricing Strategy

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Commented:
The Split VBA code (without the Bolded modifications) works fine with your test data. The problem is really how you want to use the resulting data.

You mentioned that you would want the data in a separate row - you would need a new table for this, say:

ID - Autnumber
SourceRowID - some unique value of your original data row
Element - int, that represents the position of the parsed element in your array
Data - string, the actual data

You can then call your Split function as follows

Dim str As String, strArr() As String
Dim i As int
str = "HO2, 2209/2210/2689/2690/2728/2729/2852/2947/3159/3160/3417/3431/3521/3522, Embedded, Cal Yr, CO/TO Yes"
strArr = SplitMultiDelims(str, " ,/", True)
For i = 0 to UBound(strArr)
   Debug.Print i, strArr(i)
   'DoCmd.ExecuteSQL("INSERT INTO tblNew (fileds) VALUES (key, i, strArr(i))
Next i

Author

Commented:
I'm getting a compile error (Cannot assign to array) on this line:

strArr = SplitMultiDelims(str, " ,/", True)
Top Expert 2010
Commented:
This approach seems to be working for me; it returns the data in a new table, Results.

It assumes the source table is named SomeTable and the destination is named Results.  These can be modified by changing the constant values for SourceTableName and DestTableName.

It uses Regular Expressions to parse the data; please see my article here for more info:

http://www.experts-exchange.com/Programming/Languages/Visual_Basic/A_1336-Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html


Sub MakeTheTable()
    
    Dim rsIn As DAO.Recordset, rsOut As DAO.Recordset
    Dim arr As Variant
    Dim td As DAO.TableDef
    Dim f As DAO.Field
    Dim ItemNum As Long
    Dim Narrative As String
    Dim Counter As Long
    
    Const SourceTableName As String = "SomeTable"
    Const DestTableName As String = "Results"
    
    With CurrentDb
        On Error Resume Next
        .Execute "DROP TABLE [" & DestTableName & "]", dbFailOnError
        On Error GoTo 0
        Set td = .CreateTableDef(DestTableName)
        Set f = td.CreateField("NewID", dbLong)
        f.Attributes = dbAutoIncrField + dbFixedField
        td.Fields.Append f
        Set f = td.CreateField("OldID", dbLong)
        td.Fields.Append f
        Set f = td.CreateField("ItemNum", dbLong)
        td.Fields.Append f
        Set f = td.CreateField("ItemValue", dbText, 50)
        f.Required = False
        f.AllowZeroLength = True
        td.Fields.Append f
        .TableDefs.Append td
        .TableDefs.Refresh
    End With
    
    Set f = Nothing
    Set td = Nothing
    
    Set rsIn = CurrentDb.OpenRecordset("SELECT ID, Narrative FROM [" & SourceTableName & "]")
    Set rsOut = CurrentDb.OpenRecordset("SELECT NewID, OldID, ItemNum, ItemValue FROM [" & DestTableName & "]")
    
    Do Until rsIn.EOF
        ItemNum = 0
        Narrative = Nz(rsIn!Narrative, "")
        arr = RegExpFind(Narrative, "[^ ,/]+")
        If IsArray(arr) Then
            For Counter = LBound(arr) To UBound(arr)
                ItemNum = ItemNum + 1
                With rsOut
                    .AddNew
                    !OldID = rsIn!ID
                    !ItemNum = ItemNum
                    !ItemValue = arr(Counter)
                    .Update
                End With
            Next
        Else
            With rsOut
                .AddNew
                !OldID = rsIn!ID
                !ItemNum = 1
                !ItemValue = Narrative
                .Update
            End With
        End If
        rsIn.MoveNext
    Loop
    
    rsIn.Close
    Set rsIn = Nothing
    rsOut.Close
    Set rsOut = Nothing
    
End Sub

Function RegExpFind(LookIn As String, PatternStr As String, Optional Pos, _
    Optional MatchCase As Boolean = True, Optional ReturnType As Long = 0, _
    Optional MultiLine As Boolean = False)
    
    ' Function written by Patrick G. Matthews.  You may use and distribute this code freely,
    ' as long as you properly credit and attribute authorship and the URL of where you
    ' found the code
    
    ' For more info, please see:
    ' http://www.experts-exchange.com/articles/Programming/Languages/Visual_Basic/Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html
    
    ' This function relies on the VBScript version of Regular Expressions, and thus some of
    ' the functionality available in Perl and/or .Net may not be available.  The full extent
    ' of what functionality will be available on any given computer is based on which version
    ' of the VBScript runtime is installed on that computer
    
    ' This function uses Regular Expressions to parse a string (LookIn), and return matches to a
    ' pattern (PatternStr).  Use Pos to indicate which match you want:
    ' Pos omitted               : function returns a zero-based array of all matches
    ' Pos = 1                   : the first match
    ' Pos = 2                   : the second match
    ' Pos = <positive integer>  : the Nth match
    ' Pos = 0                   : the last match
    ' Pos = -1                  : the last match
    ' Pos = -2                  : the 2nd to last match
    ' Pos = <negative integer>  : the Nth to last match
    ' If Pos is non-numeric, or if the absolute value of Pos is greater than the number of
    ' matches, the function returns an empty string.  If no match is found, the function returns
    ' an empty string.  (Earlier versions of this code used zero for the last match; this is
    ' retained for backward compatibility)
    
    ' If MatchCase is omitted or True (default for RegExp) then the Pattern must match case (and
    ' thus you may have to use [a-zA-Z] instead of just [a-z] or [A-Z]).
    
    ' ReturnType indicates what information you want to return:
    ' ReturnType = 0            : the matched values
    ' ReturnType = 1            : the starting character positions for the matched values
    ' ReturnType = 2            : the lengths of the matched values
    
    ' If you use this function in Excel, you can use range references for any of the arguments.
    ' If you use this in Excel and return the full array, make sure to set up the formula as an
    ' array formula.  If you need the array formula to go down a column, use TRANSPOSE()
    
    ' Note: RegExp counts the character positions for the Match.FirstIndex property as starting
    ' at zero.  Since VB6 and VBA has strings starting at position 1, I have added one to make
    ' the character positions conform to VBA/VB6 expectations
    
    ' Normally as an object variable I would set the RegX variable to Nothing; however, in cases
    ' where a large number of calls to this function are made, making RegX a static variable that
    ' preserves its state in between calls significantly improves performance
    
    Static RegX As Object
    Dim TheMatches As Object
    Dim Answer()
    Dim Counter As Long
    
    ' Evaluate Pos.  If it is there, it must be numeric and converted to Long
    
    If Not IsMissing(Pos) Then
        If Not IsNumeric(Pos) Then
            RegExpFind = ""
            Exit Function
        Else
            Pos = CLng(Pos)
        End If
    End If
    
    ' Evaluate ReturnType
    
    If ReturnType < 0 Or ReturnType > 2 Then
        RegExpFind = ""
        Exit Function
    End If
    
    ' Create instance of RegExp object if needed, and set properties
    
    If RegX Is Nothing Then Set RegX = CreateObject("VBScript.RegExp")
    With RegX
        .Pattern = PatternStr
        .Global = True
        .IgnoreCase = Not MatchCase
        .MultiLine = MultiLine
    End With
        
    ' Test to see if there are any matches
    
    If RegX.Test(LookIn) Then
        
        ' Run RegExp to get the matches, which are returned as a zero-based collection
        
        Set TheMatches = RegX.Execute(LookIn)
        
        ' Test to see if Pos is negative, which indicates the user wants the Nth to last
        ' match.  If it is, then based on the number of matches convert Pos to a positive
        ' number, or zero for the last match
        
        If Not IsMissing(Pos) Then
            If Pos < 0 Then
                If Pos = -1 Then
                    Pos = 0
                Else
                    
                    ' If Abs(Pos) > number of matches, then the Nth to last match does not
                    ' exist.  Return a zero-length string
                    
                    If Abs(Pos) <= TheMatches.Count Then
                        Pos = TheMatches.Count + Pos + 1
                    Else
                        RegExpFind = ""
                        GoTo Cleanup
                    End If
                End If
            End If
        End If
        
        ' If Pos is missing, user wants array of all matches.  Build it and assign it as the
        ' function's return value
        
        If IsMissing(Pos) Then
            ReDim Answer(0 To TheMatches.Count - 1)
            For Counter = 0 To UBound(Answer)
                Select Case ReturnType
                    Case 0: Answer(Counter) = TheMatches(Counter)
                    Case 1: Answer(Counter) = TheMatches(Counter).FirstIndex + 1
                    Case 2: Answer(Counter) = TheMatches(Counter).Length
                End Select
            Next
            RegExpFind = Answer
        
        ' User wanted the Nth match (or last match, if Pos = 0).  Get the Nth value, if possible
        
        Else
            Select Case Pos
                Case 0                          ' Last match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(TheMatches.Count - 1)
                        Case 1: RegExpFind = TheMatches(TheMatches.Count - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(TheMatches.Count - 1).Length
                    End Select
                Case 1 To TheMatches.Count      ' Nth match
                    Select Case ReturnType
                        Case 0: RegExpFind = TheMatches(Pos - 1)
                        Case 1: RegExpFind = TheMatches(Pos - 1).FirstIndex + 1
                        Case 2: RegExpFind = TheMatches(Pos - 1).Length
                    End Select
                Case Else                       ' Invalid item number
                    RegExpFind = ""
            End Select
        End If
    
    ' If there are no matches, return empty string
    
    Else
        RegExpFind = ""
    End If
    
Cleanup:
    ' Release object variables
    
    Set TheMatches = Nothing
    
End Function

Open in new window

Author

Commented:
Patrick -- Thank you, thank you, thank youi! Not I was able to parse the data the way I needed but after spending some time with your article found that I could select just the values I really needed (4 digit numerics) by changing the pattern to "\d{4}".

Regards,
DB
Top Expert 2010

Commented:
Dumb_Blonde,

Glad to help!  If you have not already done so, I would really appreciate it if you could please return to my article
http://www.experts-exchange.com/Programming/Languages/Visual_Basic/A_1336-Using-Regular-Expressions-in-Visual-Basic-for-Applications-and-Visual-Basic-6.html
and click 'Yes' for the 'Was this helpful?' voting.

Patrick

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial