rutledgj

vb.net split function

I need to split some input string based on a word, not a character. Is there any way to do this in vb?

Dim MyString = "This is a test <NEW RECORD> for which I need an answer"
dim result = MyString.Split("<NEW RECORD>",StringSplitOptions.RemoveEmptyEntries)

This doesn't work.
Also, I may have multiple records with this word separating them <NEW RECORD>

Visual Basic.NET

plusone3055

There are several ways to do this :)

http://www.dotnetperls.com/split-vbnet
plusone3055

from
http://www.dotnetperls.com/split-vbnet

Split based on words
Often you need to extract the words from a String or sentence in VB.NET. The code here needs to handle punctuation and non-word characters differently than the String Split method. Here we use Regex.Split to parse the words.

Program that splits words [VB.NET]

Imports System.Text.RegularExpressions

Module Module1

    Sub Main()
      ' Declare iteration variable
      Dim s As String

      ' Loop through words in string
      Dim arr As String() = SplitWords("That is a cute cat, man!")

      ' Display each word. Note that punctuation is handled correctly.
      For Each s In arr
          Console.WriteLine(s)
      Next
      Console.ReadLine()
    End Sub

    ''' <summary>
    ''' Split the words in string on non-word characters.
    ''' This means commas and periods are handled correctly.
    ''' </summary>
    Private Function SplitWords(ByVal s As String) As String()
      '
      ' Call Regex.Split function from the imported namespace.
      ' Return the result array.
      '
      Return Regex.Split(s, "\W+")
    End Function

End Module

Output

That
is
a
cute
cat
man
kaufmed

While I agree that a regex-based split can be used here, the pattern you would actually want to use, based on the phrasing of your question, is your delimiter string, not "\W+". In plusone3055's example, change the line:

Return Regex.Split(s, "\W+")

to:

Return Regex.Split(s, "<NEW RECORD>")
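
For reference, here is a minimal sketch of two ways to split on a whole phrase, using the sample string from the question. The module and variable names are only illustrative. Regex.Escape is not strictly required for this particular delimiter, but it guards against delimiters that contain regex metacharacters. The second option assumes the String.Split overload that takes a String array, which treats each element as a complete delimiter instead of a set of single characters.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitOnPhraseSketch

    Sub Main()
        Dim MyString As String = "This is a test <NEW RECORD> for which I need an answer"

        ' Option 1: Regex.Split, with the delimiter escaped so it is matched literally.
        Dim viaRegex As String() = Regex.Split(MyString, Regex.Escape("<NEW RECORD>"))

        ' Option 2: String.Split with a String array, so the whole phrase is the delimiter.
        Dim viaSplit As String() = MyString.Split(New String() {"<NEW RECORD>"}, StringSplitOptions.RemoveEmptyEntries)

        ' Brackets make the leading and trailing spaces of each piece visible.
        For Each part As String In viaRegex
            Console.WriteLine("[" & part & "]")
        Next
        For Each part As String In viaSplit
            Console.WriteLine("[" & part & "]")
        Next
    End Sub

End Module
```

Both options keep the spaces that surrounded the tag, so call Trim() on each piece if those are not wanted.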

rutledgj

ASKER

These examples still seem to just be splitting based on a space or punctuation.  I have a text file that contains a bunch of records, each record starts off with the <NEW RECORD> tag. So I need to read in all the text after each new record tag into an array.  
kaufmed

...even with the change I proposed above?
rutledgj

ASKER

I tried this:

Dim SplitData() As String
 SplitData = Regex.Split(TextIn, "<NEW RECORD>")

It didn't split it. It put the entire contents into SplitData(0)
kaufmed

This is what I get:
untitled.PNG
rutledgj

ASKER

Here is a sample input if it helps
sampledoc.txt
kaufmed

P.S.

Remember that regex is case-sensitive by default. "<NEW RECORD>" is not the same as "<new record>". You can change this behavior if necessary.
ASKER CERTIFIED SOLUTION
kaufmed

[Solution text not available.]
rutledgj

ASKER

What does (?i) do? It worked.
kaufmed

"What does (?i) do? It worked."

It makes the pattern following it case-insensitive. You can alternatively do:

Dim result() As String = Regex.Split(MyString, "<NEW RECORD>", RegexOptions.IgnoreCase)

but my preference is the more succinct version. Use the version that makes the most sense to you  = )
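
To illustrate, here is a small sketch showing that the two forms give the same result. The input string and the names in it are made up for the example and are not taken from the attached sample file.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module IgnoreCaseSketch

    Sub Main()
        ' Hypothetical input with the tag written in mixed case.
        Dim TextIn As String = "<new record>first<New Record>second<NEW RECORD>third"

        ' Inline option: (?i) turns on case-insensitive matching for the rest of the pattern.
        Dim inlineResult As String() = Regex.Split(TextIn, "(?i)<NEW RECORD>")

        ' Same effect: pass RegexOptions.IgnoreCase as the third argument.
        Dim optionResult As String() = Regex.Split(TextIn, "<NEW RECORD>", RegexOptions.IgnoreCase)

        ' Both arrays hold 4 elements: Regex.Split returns an empty first element
        ' when the input starts with the delimiter, followed by the three records.
        Console.WriteLine(inlineResult.Length)
        Console.WriteLine(optionResult.Length)
    End Sub

End Module
```

Since each record in the file starts with the tag, expect that empty first element and skip it when processing the records.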