This article discusses tips for extending the functionality of the VB 6.0 and VBA Split() function. We will introduce a quick and easy enhancement function, as well as a more extensive function which provides some much-needed missing features. Both of these routines only provide single-character-delimiter parsing.
Sections:
- Problem
- Multiple Single-Character Delimiter Parsing Using the Native Replace() and Split() Functions
- An Extended Custom Split Function
- Relative Performances
- References
- 1
Problem
The built-in VBA Split() function parses a string using only a single delimiter, such as a space (" ") or a word ("and"). This lets you split a string such as "Welcome to VBA" on spaces,
which gives you the array: ("Welcome","to","VBA").
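As a minimal sketch, a native call that produces that array could look like the following. The input literal "Welcome to VBA" and the procedure name NativeSplitDemo are assumptions made for illustration.

```vba
Sub NativeSplitDemo()
    Dim parts() As String
    Dim i As Long

    ' Split on a single space. Split() returns a zero-based array.
    parts = Split("Welcome to VBA", " ")

    ' parts now holds "Welcome", "to", "VBA" at indexes 0, 1 and 2.
    For i = LBound(parts) To UBound(parts)
        Debug.Print parts(i)
    Next i
End Sub
```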
But what if you want to split a string like "Year/Month/Day Hours:Minutes:Seconds.Milliseconds"
and get the result: ("Year", "Month", "Day", "Hours", "Minutes", "Seconds", "Milliseconds")?
If you want to perform this type of parsing, this article is for you!
If this is something you want to do frequently, or efficiently, then you should look at including these custom split functions in your VBA projects.
NOTE: There are other limitations to the native Split() function. One might want to treat consecutive delimiters as one, similar to some data importing actions. Also, Split() cannot parse based on character patterns. If you face a complex parsing task, you will probably need to use regular expression parsing. Read this article (http:/A_1336.html) for a RegEx introduction and many VBA examples of RegEx parsing.
- 2
Multiple Single-Character Delimiter Parsing Using the Native Replace() and Split() Functions
The concept here is quite simple: instead of writing our own split function, we replace each of the single-character delimiters with one common delimiter and then run the Split() function. Although this can be done several different ways, the most efficient is to invoke Replace() M-1 times, where M is the number of single-character delimiters, and then invoke Split() on the Replace-modified string using the one remaining delimiter. This function lacks the additional functionality of the custom Split routine in part 3, but it is a quick solution when all you need is simple parsing on multiple single-character delimiters.
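The Replace-then-Split idea can be sketched as follows. This is an illustrative reconstruction, not the article's original listing: the function name ReplaceAndSplit is taken from the performance section, while the parameter names and the choice of the last character in DelimChars as the surviving delimiter are assumptions.

```vba
' Sketch: split Text on any of the single characters in DelimChars
' by funnelling every delimiter into one and calling Split() once.
Public Function ReplaceAndSplit(ByVal Text As String, _
                                ByVal DelimChars As String) As String()
    Dim lastDelim As String
    Dim i As Long

    ' The last delimiter is the one Split() will finally use.
    lastDelim = Right$(DelimChars, 1)

    ' Replace each of the other M-1 delimiters with the last one.
    For i = 1 To Len(DelimChars) - 1
        Text = Replace(Text, Mid$(DelimChars, i, 1), lastDelim)
    Next i

    ' One native Split() on the single remaining delimiter.
    ReplaceAndSplit = Split(Text, lastDelim)
End Function
```

With this sketch, ReplaceAndSplit("Year/Month/Day Hours:Minutes:Seconds.Milliseconds", "/ :.") returns the seven elements "Year" through "Milliseconds". Text is passed ByVal, so the caller's string is left untouched.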
- 3
An Extended Custom Split Function
For this article, I wrote a full-featured, efficient alternative to the Split function. This function accepts a parameter for a string of delimiters, implements the Limit parameter of the native Split() function, and adds a parameter that allows the suppression of empty strings caused by consecutive delimiters.
The function specification is:
Differences between Split() and SplitMultiDelims()
* Unlike the Split() function, it explicitly returns a String(), not a Variant/String().
* In its most basic form - SplitMultiDelims(Text, " ") - the functionality is identical to Split(Text), except that when Text is empty, it returns an Array with a single empty string element, instead of an unallocated array.
* The Delimiter parameter is mandatory, instead of optional with a default value of " ".
* The native VBA Split() function can delimit a multi-character string, such as "and" or "END". This function only uses single character delimiters. For discussion of using multiple string delimiters, see the references section.
Text specifies the string to be split. Text is not modified by this routine.
DelimChars is a list of single character delimiters. If any of these are encountered, the string will be split at that character. A Delimiter denotes the end of one array element and the start of the next one. Delimiters will never appear in any of the elements of the String array as they are excluded. DelimChars is never modified.
IgnoreConsecutiveDelimite
Limit[Default = -1] Same as in Split() - Limit specifies the maximum number of array elements that will be generated by Split(). If Limit is reached, the (Limit)th array element will contain what remains of the un-split Text - delimiters and all. Note - When Limit is any value less than 1, it will be treated as though it were -1 (i.e. ignored). For the native Split() function, however, a Limit of 0 yields an uninitialized array, and a Limit of less than -1 yields a runtime error.
Note that the Split() parameter Compare is not included. Normally, setting Compare to Text Compare instead of Binary will ignore the case when the delimiter is a letter (i.e. "a" will cause 'a' or 'A' to be treated as a delimiter). This functionality, if needed, can be worked around by simply including both 'a' and 'A' in the list of DelimChars (i.e. "aA").
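For comparison, here is a short sketch of how the native Split() handles the Limit and Compare parameters discussed above. It uses only built-in functions; the procedure name and sample strings are made up for illustration.

```vba
Sub NativeLimitAndCompareDemo()
    Dim parts() As String

    ' Limit = 2: the second element keeps the rest of the text,
    ' delimiters and all.
    parts = Split("a b c d", " ", 2)
    Debug.Print parts(0)            ' a
    Debug.Print parts(1)            ' b c d

    ' vbTextCompare: both "a" and "A" act as the delimiter.
    parts = Split("1a2A3", "a", -1, vbTextCompare)
    Debug.Print Join(parts, "|")    ' 1|2|3

    ' The same effect with a binary compare: fold "A" into "a" first,
    ' which mirrors listing both characters in DelimChars.
    parts = Split(Replace("1a2A3", "A", "a"), "a")
    Debug.Print Join(parts, "|")    ' 1|2|3
End Sub
```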
Sample Use Cases:
Here is the code for the function:
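The following is a minimal sketch written from the parameter descriptions above, not the author's original listing. The parameter name IgnoreConsecutiveDelimiters is assumed from the truncated description, both string arguments are passed ByVal as a simplification, and the demo procedure is hypothetical.

```vba
Public Function SplitMultiDelims(ByVal Text As String, _
        ByVal DelimChars As String, _
        Optional ByVal IgnoreConsecutiveDelimiters As Boolean = False, _
        Optional ByVal Limit As Long = -1) As String()
    Dim result() As String
    Dim count As Long       ' elements stored so far
    Dim startPos As Long    ' first character of the current element
    Dim pos As Long
    Dim isDelim As Boolean
    Dim piece As String

    ' Worst case: every character is a delimiter (Len(Text) + 1 elements).
    ReDim result(0 To Len(Text))
    startPos = 1

    For pos = 1 To Len(Text)
        isDelim = InStr(1, DelimChars, Mid$(Text, pos, 1), vbBinaryCompare) > 0
        If isDelim And IgnoreConsecutiveDelimiters And pos = startPos Then
            startPos = pos + 1      ' skip the empty element
        ElseIf Limit >= 1 And count = Limit - 1 Then
            Exit For                ' the rest goes into the last element
        ElseIf isDelim Then
            result(count) = Mid$(Text, startPos, pos - startPos)
            count = count + 1
            startPos = pos + 1
        End If
    Next pos

    ' Store what remains after the last split point. When nothing has
    ' been stored yet, always store it so the array is never unallocated.
    piece = Mid$(Text, startPos)
    If Len(piece) > 0 Or Not IgnoreConsecutiveDelimiters Or count = 0 Then
        result(count) = piece
        count = count + 1
    End If

    ReDim Preserve result(0 To count - 1)
    SplitMultiDelims = result
End Function

Sub SplitMultiDelimsDemo()
    Debug.Print Join(SplitMultiDelims("a,,b;c", ",;"), "|")            ' a||b|c
    Debug.Print Join(SplitMultiDelims("a,,b;c", ",;", True), "|")      ' a|b|c
    Debug.Print Join(SplitMultiDelims("a,b;c", ",;", False, 2), "|")   ' a|b;c
End Sub
```

The sketch sizes the result array once for the worst case and trims it at the end, which avoids repeated ReDim Preserve calls inside the loop. A Limit below 1 is ignored, as described above.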
- 4
Relative Performances
In case you're curious about relative performances, here's a look at function run times (and how you probably don't have to worry about them). I ran 50,000 loops of the regular Split function, as well as the two custom Split functions. Here are the results under various conditions:
Running 50,000 loops, splitting the string "Hello      my name is Alain a!b@c#d$e%f^g&h*u(j" (six spaces between "Hello" and "my")
Test1a: Split (space delim): 234 ms
Test1b: ReplaceAndSplit (space delim): 328 ms
Test1c: SplitMultiDelims (space delim): 1563 ms
The result from all three runs was:
("Hello", "", "", "", "", "", "my", "name", "is", "Alain", "a!b@c#d$e%f^g&h*u(j").
For the next two tests, I ran the two custom functions again, each with 10 single-character delimiters specified.
In the first case (Test2), the delimiters were " 123456789" (of which only the space occurs in the string).
In the second case (Test3), the delimiters were " !@#$%^&*(" (all of which occur in the string).
Test2a: ReplaceAndSplit (10 delims, 1 used): 532 ms
Test2b: SplitMultiDelims (10 delims, 1 used): 1562 ms
The result from both runs was:
("Hello", "", "", "", "", "", "my", "name", "is", "Alain", "a!b@c#d$e%f^g&h*u(j").
Test3a: ReplaceAndSplit (10 delims, 10 used): 1765 ms
Test3b: SplitMultiDelims (10 delims, 10 used): 1953 ms
The result from both runs was:
("Hello", "", "", "", "", "", "my", "name", "is", "Alain", "a", "b", "c", "d", "e", "f", "g", "h", "u", "j").
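So the custom procedures cost more than the native Split(), but even the slowest case (1953 ms for 50,000 calls, roughly 0.04 ms per call) is fast enough for most uses.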
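The timing harness itself is not shown in the article. One simple way to take a comparable measurement is sketched below, assuming the built-in Timer function is precise enough for the purpose. The procedure name is hypothetical, and the six spaces after "Hello" are inferred from the five empty elements in the published results.

```vba
Sub TimeNativeSplit()
    Const LOOPS As Long = 50000
    Dim sample As String
    Dim parts() As String
    Dim i As Long
    Dim t0 As Single

    sample = "Hello" & Space$(6) & "my name is Alain a!b@c#d$e%f^g&h*u(j"

    t0 = Timer                      ' seconds elapsed since midnight
    For i = 1 To LOOPS
        parts = Split(sample, " ")
    Next i

    ' Convert elapsed seconds to milliseconds for display.
    Debug.Print "Split (space delim): " & Format$((Timer - t0) * 1000, "0") & " ms"
End Sub
```

Timer resets at midnight and has limited resolution, so short runs are best repeated a few times. To time a custom function, swap the Split() call inside the loop for a call to that function.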
- 5
References
A big thanks goes out to aikimark for his code contributions, optimizations, and suggestions. This article was a joint effort.
http://msdn.microsoft.com/en-us/library/6x627e5f%28VS.80%29.aspx - Split Function
http://msdn.microsoft.com/en-us/library/aa155763%28office.10%29.aspx - Enhanced uses of the Split function.
http://www.aivosto.com/vbtips/stringopt.html - An introduction to Text Parsing and Tips for improving VBA efficiency (especially during String operations)
http://www.cpearson.com/excel/splitondelimiters.aspx - Shows two functions, the first is similar to SplitMultiDelims. The second function, SplitMultiDelimsEX, demonstrates how to parse with multiple multi-character delimiters.
Hope this is as useful to you as it is to me. Cheers!
--
Alain Bryden
by: aikimark on 2009-10-05 at 14:32:05 ID: 3971
I have added a follow-on article to this one:
http://www.experts-exchange.com/Programming/Languages/Visual_Basic/A_1679-Passing-lists-and-complex-data-through-a-parameter.html
The new article is more about passing complex parameters, but it uses this parsing problem as its initial example.