<

How to Split a String with Multiple Delimiters in VBA

Published on
103,227 Points
92,227 Views
15 Endorsements
Last Modified:
Awarded
Editor's Choice
Community Pick
This Article
This Article discusses tips for extending the functionality of the VB 6.0 and VBA Split() function, described here. We will introduce a quick and easy enhancement function, as well as a more extensive function which provides some much needed missing features. Both of these routines only provide single-character-delimiter parsing. At the end of this article I include links to articles which show extended Split functions that surmount some SPLIT() limitations.

Sections:
Problem
Multiple Single-Character Delimiter Parsing Using the Native Replace() and Split() Functions
An Extended Custom Split Function
Relative Performances
References

1. Problem


The built in VBA Split function parses a string using only a single delimiter, such as a space (" "), or a word ("and"). This allows you to split a string like in this:
Split("Welcome to VBA", " ")

Open in new window

which gives you the array: ("Welcome","to","VBA").

But what if you want to split a string like "Year/Month/Day Hours:Minutes:Seconds.Milliseconds"? Wouldn't it be nice if you could call:
Split("Year/Month/Day Hours:Minutes:Seconds.Milliseconds", "/ :.")

Open in new window

and get the result:  ("Year", "Month", "Day", "Hours", "Minutes", "Seconds", "Milliseconds")

If you want to perform this type of parsing, this article is for you!

If this is something you want to do frequently, or efficiently, then you should look at including these custom split functions in your VBA projects.

NOTE: There are other limitations to the native Split() function. One might want to treat consecutive delimiters as one, similar to some data importing actions. Also, the Split() doesn't have the ability to parse based on character patterns. If you face a complex parsing task, you will probably need to use regular expression parsing.  Read this article (http:/A_1336.html) for a RegEx introduction and many VBA examples of RegEx parsing.

2. Multiple Single-Character Delimiter Parsing Using the Native Replace() and Split() Functions


The following code is quite simple. The concept here is that instead of writing out own split function, we sequentially replace all the single-character delimiters and then run the Split() function. Although this can be done several different ways, the most efficient is to invoke the Replace() method M-1 times, where M is the number of single-character delimiters. We invoke the Split() function on the Replace-modified string with the remaining delimiter. This function lacks the additional functionality of the custom Split routine in part 3, but makes for a quick solution to a need for simple multiple single-character delimiter parsing.
'=======================================================
'ReplaceAndSplit by alainbryden, optimized by aikimark
'Uses the native REPLACE() function to replace all delimiters with a common
'delimiter, and then splits them based on that.
'=======================================================
Function ReplaceAndSplit(ByRef Text As String, ByRef DelimChars As String) As String()
    Dim DelimLen As Long, Delim As Long
    Dim strTemp As String, Delim1 As String, Arr() As String, ThisDelim As String
    strTemp = Text
    Delim1 = Left$(DelimChars, 1)
    DelimLen = Len(DelimChars)
    For Delim = 2 To DelimLen
        ThisDelim = Mid$(DelimChars, Delim, 1)
        If InStr(strTemp, ThisDelim) <> 0 Then _
            strTemp = Replace(strTemp, ThisDelim, Delim1)
    Next
    ReplaceAndSplit = Split(strTemp, Delim1)
End Function

Open in new window



3. An Extended Custom Split Function


When I wrote this article, I wrote a full-featured, efficient Split function alternative.  This function accepts a parameter for a string of delimiters, implements the Limit parameter of the native Split() function, and adds a parameter that allows the suppression of empty strings caused by consecutive delimiters.

The function specification is:
SplitMultiDelims(ByRef Text As String, ByRef DelimChars As String, 
                 Optional ByVal IgnoreConsecutiveDelimiters As Boolean = False, 
                 Optional ByVal Limit As Long = -1) As String()

Open in new window

Differences between Split() and SplitMultiDelims()
*  Unlike the Split() function, it explicitly returns a String(), not a Variant/String().
*  In its most basic form - SplitMultiDelims(Text, " ") - the functionality is identical to Split(Text), except that when Text is empty, it returns an Array with a single empty string element, instead of an unallocated array.
*  The Delimiter parameter is mandatory, instead of optional with a default value of " ".
*  The native VBA Split() function can delimit a multi-character string, such as "and" or "END". This function only uses single character delimiters. For discussion of using multiple string delimiters, see the references section.

Text specifies the string to be split. Text is not modified by this routine.

DelimChars is a list of single character delimiters. If any of these are encountered, the string will be split at that character. A Delimiter denotes the end of one array element and the start of the next one. Delimiters will never appear in any of the elements of the String array as they are excluded. DelimChars is never modified.

IgnoreConsecutiveDelimiters [Default = False] Will, if enabled, treat consecutive delimiters as one. Even if the delimiters are different, they will result in a single split in Text until a non-delimiting character is encountered. This means that none of the array elements will ever be empty unless the string itself is empty. (Special case addressed: If the array is terminated by a delimiter, then the EOS (end of string) is interpreted as a consecutive delimiter, so there will still be no empty string element generated)

Limit[Default = -1] Same as in Split() - Limit specifies the maximum number of array elements that will be generated by Split(). If Limit is reached, the (Limit)th array element will contain what remains of the un-split Text - delimiters and all. Note - When Limit is any value less than 1, it will be treated as though it were -1 (i.e. ignored). For the native Split() function, however, a Limit of 0 yields an uninitialized array, and a Limit of less than -1 yields a runtime error.

Note that the Split() parameter Compare is not included. Normally, setting Compare to Text Compare instead of Binary will ignore the case when the delimiter is a letter (i.e. "a" will cause 'a' or 'A' to be treated as a delimiter) This functionality, if needed, can be worked-around by simply including both 'a' and 'A' in the list of DelimChars (i.e. "aA").

Sample Use Cases:
Dim str As String, strArr() as String
str = "Hello, my friends. Hello,,,,Hello,Hello"

strArr = SplitMultiDelims(str, " ")
'strArr = ("Hello,", "my", "friends.", "Hello,,,,Hello,Hello")

strArr = SplitMultiDelims(str, " ,")
'strArr = ("Hello", "", "my", "friends.", "Hello", "", "", "", "Hello", "Hello")

strArr = SplitMultiDelims(str, " ,", true)
'strArr = ("Hello", "my", "friends.", "Hello", "Hello", "Hello")

strArr = SplitMultiDelims(str, " ,", true, 3)
'strArr = ("Hello", "my", "friends. Hello,,,,Hello,Hello")

Open in new window

Here is the code for the function:
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
' SplitMultiDelims by alainbryden
' This function splits Text into an array of substrings, each substring
' delimited by any character in DelimChars. Only a single character
' may be a delimiter between two substrings, but DelimChars may
' contain any number of delimiter characters. It returns a single element
' array containing all of text if DelimChars is empty, or a 1 or greater
' element array if the Text is successfully split into substrings.
' If IgnoreConsecutiveDelimiters is true, empty array elements will not occur.
' If Limit greater than 0, the function will only split Text into 'Limit'
' array elements or less. The last element will contain the rest of Text.
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Function SplitMultiDelims(ByRef Text As String, ByRef DelimChars As String, _
        Optional ByVal IgnoreConsecutiveDelimiters As Boolean = False, _
        Optional ByVal Limit As Long = -1) As String()
    Dim ElemStart As Long, N As Long, M As Long, Elements As Long
    Dim lDelims As Long, lText As Long
    Dim Arr() As String
    
    lText = Len(Text)
    lDelims = Len(DelimChars)
    If lDelims = 0 Or lText = 0 Or Limit = 1 Then
        ReDim Arr(0 To 0)
        Arr(0) = Text
        SplitMultiDelims = Arr
        Exit Function
    End If
    ReDim Arr(0 To IIf(Limit = -1, lText - 1, Limit))
    
    Elements = 0: ElemStart = 1
    For N = 1 To lText
        If InStr(DelimChars, Mid(Text, N, 1)) Then
            Arr(Elements) = Mid(Text, ElemStart, N - ElemStart)
            If IgnoreConsecutiveDelimiters Then
                If Len(Arr(Elements)) > 0 Then Elements = Elements + 1
            Else
                Elements = Elements + 1
            End If
            ElemStart = N + 1
            If Elements + 1 = Limit Then Exit For
        End If
    Next N
    'Get the last token terminated by the end of the string into the array
    If ElemStart <= lText Then Arr(Elements) = Mid(Text, ElemStart)
    'Since the end of string counts as the terminating delimiter, if the last character
    'was also a delimiter, we treat the two as consecutive, and so ignore the last elemnent
    If IgnoreConsecutiveDelimiters Then If Len(Arr(Elements)) = 0 Then Elements = Elements - 1
    
    ReDim Preserve Arr(0 To Elements) 'Chop off unused array elements
    SplitMultiDelims = Arr
End Function

Open in new window

4. Relative Performances


In case you're curious about relative performances, here's a look at function run times (and how you probably don't have to worry about them). I ran 50,000 loops of the regular Split function, as well as the two custom Split functions. Here are the results under various conditions:
Running 50,000 loops, splitting the string "Hello     my name is Alain a!b@c#d$e%f^g&h*u(j"
Test1a: Split (space delim): 234 ms
Test1b: ReplaceAndSplit (space delim): 328 ms
Test1c: SplitMultiDelims (space delim): 1563 ms
The result from all three runs was:
   ("Hello", "", "", "", "", "", "my", "name", "is", "Alain", "a!b@c#d$e%f^g&h*u(j").

For this test, I ran the two custom functions again, but with 10 single-character delimiters specified.
In one case, the delimiters were " 123456789" (of which only the space is used).
In the second case, the delimiters are " !@#$%^&*(" (all of which are used in the string)
Test2a: ReplaceAndSplit (10 delims, 1 used): 532 ms
Test2b: SplitMultiDelims (10 delims, 1 used): 1562 ms
The result from all three runs was:
   ("Hello", "", "", "", "", "", "my", "name", "is", "Alain", "a!b@c#d$e%f^g&h*u(j").

Test3a: ReplaceAndSplit (10 delims, 10 used): 1765 ms
Test3b: SplitMultiDelims (10 delims, 10 used): 1953 ms
The result from all three runs was:
 ("Hello", "", "", "", "", "", "my", "name", "is", "Alain", "a", "b", "c", "d", "e", "f", "g", "h", "u", "j").

So the runtime is quite good for the custom procedures.

5. References


A big thanks goes out to aikimark for his code contributions, optimizations, and suggestions. This article was a joint effort.

http://msdn.microsoft.com/en-us/library/6x627e5f%28VS.80%29.aspx - Split Function
http://msdn.microsoft.com/en-us/library/aa155763%28office.10%29.aspx - Enhanced uses of the Split function.
http://www.aivosto.com/vbtips/stringopt.html - An introduction to Text Parsing and Tips for improving VBA efficiency (especially during String operations)
http://www.cpearson.com/excel/splitondelimiters.aspx - Shows two functions, the first is similar to SplitMultiDelims.  The second function, SplitMultiDelimsEX, demonstrates how to parse with multiple multi-character delimiters.

Hope this is as useful to you as it is to me. Cheers!

--
Alain Bryden
15
Comment
Author:alainbryden
5 Comments
 
LVL 47

Expert Comment

by:aikimark
I have added a follow-on article to this one:

http://www.experts-exchange.com/Programming/Languages/Visual_Basic/A_1679-Passing-lists-and-complex-data-through-a-parameter.html

The new article is more about passing complex parameters, but it uses this parsing problem as its initial example.
0
 

Expert Comment

by:DanielPDX
Just wanted to say this function really saved the day. Nice job and thank you!
0
 

Expert Comment

by:Rayne
Allright, I admit...

 Alain - You SAVED my life today.
All Hats of to you. This is miracle, cant explain :)
Thanks a MILLION Times
0
 

Expert Comment

by:Robin Hood
Great Job!

If anyone needs to modify the function to INCLUDE the delimiter character in the split one only needs to change

ElemStart = N + 1  

to

ElemStart = N
0
 

Expert Comment

by:James Dent
'Uses the native REPLACE() function to replace all delimiters with a common delimiter, and then splits them based on that.'

Good idea that!
0

Featured Post

Cloud Class® Course: SQL Server Core 2016

This course will introduce you to SQL Server Core 2016, as well as teach you about SSMS, data tools, installation, server configuration, using Management Studio, and writing and executing queries.

Join & Write a Comment

In this video you will find out how to export Office 365 mailboxes using the built in eDiscovery tool. Bear in mind that although this method might be useful in some cases, using PST files as Office 365 backup is troublesome in a long run (more on t…
The Relationships Diagram is a good way to get an overall view of what a database is keeping track of. It is also where relationships are defined. A relationship specifies how two tables connect to each other. As you build tables in Microsoft Ac…

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month