rutledgj

vb.net split function

I need to split some input string based on a word, not a character. Is there any way to do this in vb?

Dim MyString = "This is a test <NEW RECORD> for which I need an answer"
Dim result = MyString.Split("<NEW RECORD>", StringSplitOptions.RemoveEmptyEntries)

This doesn't work.
Also, I may have multiple records, each separated by this <NEW RECORD> tag.

plusone3055

There are several ways to do this :)

From http://www.dotnetperls.com/split-vbnet:

Split based on words
Often you need to extract the words from a String or sentence in VB.NET. The code here needs to handle punctuation and non-word characters differently than the String Split method. Here we use Regex.Split to parse the words.

Program that splits words [VB.NET]

Imports System.Text.RegularExpressions

Module Module1

    Sub Main()
      ' Declare iteration variable
      Dim s As String

      ' Loop through words in string
      Dim arr As String() = SplitWords("That is a cute cat, man!")

      ' Display each word. Note that punctuation is handled correctly.
      For Each s In arr
          Console.WriteLine(s)
      Next
      Console.ReadLine()
    End Sub

    ''' <summary>
    ''' Split the words in string on non-word characters.
    ''' This means commas and periods are handled correctly.
    ''' </summary>
    Private Function SplitWords(ByVal s As String) As String()
      '
      ' Call Regex.Split function from the imported namespace.
      ' Return the result array.
      '
      Return Regex.Split(s, "\W+")
    End Function

End Module

Output

That
is
a
cute
cat
man
kaufmed
While I agree that a regex-based split can be used here, based on the phrasing of your question the pattern you actually want is your delimiter string, not "\W+". In plusone3055's example, change the line:

Return Regex.Split(s, "\W+")

to:

Return Regex.Split(s, "<NEW RECORD>")

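For a fixed delimiter like this one, a regex is not strictly required. The following is a minimal sketch, assuming the String.Split overload that takes a String array and a StringSplitOptions value; the sample text and the names input, delimiters and records are made up for illustration. Passing the delimiter as an array is what makes Split treat it as one whole string instead of as a set of individual characters.

```vb
Imports System

Module SplitOnLiteral

    Sub Main()
        ' Hypothetical sample: two records, each introduced by the tag.
        Dim input As String = "<NEW RECORD>first record<NEW RECORD>second record"

        ' A String array selects the overload that splits on whole strings.
        Dim delimiters As String() = New String() {"<NEW RECORD>"}

        ' RemoveEmptyEntries drops the empty piece in front of the first tag.
        Dim records As String() = input.Split(delimiters, StringSplitOptions.RemoveEmptyEntries)

        For Each record As String In records
            Console.WriteLine(record)
        Next
    End Sub

End Module
```

This prints "first record" and "second record" on separate lines. Note that this overload compares the delimiter case-sensitively, and that if a delimiter ever contains regex metacharacters and you stay with Regex.Split, wrapping it in Regex.Escape keeps it literal.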

rutledgj

ASKER

These examples still seem to just be splitting based on a space or punctuation.  I have a text file that contains a bunch of records, each record starts off with the <NEW RECORD> tag. So I need to read in all the text after each new record tag into an array.  
...even with the change I proposed above?
I tried this:

Dim SplitData() As String
SplitData = Regex.Split(TextIn, "<NEW RECORD>")

It didn't split it. It put the entire contents into SplitData(0)
This is what I get:
untitled.PNG
Here is a sample input if it helps
sampledoc.txt
P.S.

Remember that regex is case-sensitive by default. "<NEW RECORD>" is not the same as "<new record>". You can change this behavior if necessary.
ASKER CERTIFIED SOLUTION
kaufmed
What does (?i) do? It worked.
Makes the pattern following it case-insensitive. You can alternatively do:

Dim result() As String = Regex.Split(MyString, "<NEW RECORD>", RegexOptions.IgnoreCase)

but my preference is the more succinct version. Use the version that makes the most sense to you  = )
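To see the two forms side by side, here is a small sketch. The sample text and variable names are invented, and the pattern "(?i)<NEW RECORD>" is an assumption about how the inline flag would be written for this case, not a quote from the accepted answer. Because the sample starts with the tag, as the records in the question do, Regex.Split returns an empty string as the first element, so the loop skips empty entries.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SplitIgnoreCase

    Sub Main()
        ' Hypothetical sample: the tag is cased differently in each record.
        Dim textIn As String = "<New Record>alpha<NEW RECORD>beta"

        ' Inline flag: (?i) turns on case-insensitive matching for the pattern after it.
        Dim inlineResult As String() = Regex.Split(textIn, "(?i)<NEW RECORD>")

        ' Same behaviour, requested through the options argument instead.
        Dim optionResult As String() = Regex.Split(textIn, "<NEW RECORD>", RegexOptions.IgnoreCase)

        ' Both arrays hold "", "alpha", "beta"; element 0 is the empty text before the first tag.
        Console.WriteLine(inlineResult.Length = optionResult.Length)

        For Each record As String In optionResult
            If record.Length > 0 Then
                Console.WriteLine(record)
            End If
        Next
    End Sub

End Module
```

This prints True, then alpha and beta. Without either option, the split would only happen at the upper-case tag, leaving "<New Record>alpha" as a single element.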