Parse text file - regex or something else?

Hello! I have a text file I would like to parse, and I plan to insert the data into a database.
Here is some sample data:
<MEMO>12/24 RIVERVIEW    FL 8010I441922

1. Loop through the text and locate the <STMTTRN> </STMTTRN> blocks
2. Extract the data and set variables, e.g. TRNTYPE = POS, DTPOSTED = 20051227170000
Then I can insert into the db.

Fernando Soto (Retired) commented:
Hi JRockFL;

The following code should do what you want.

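Imports System.IO
Imports System.Text.RegularExpressions

        Dim TRNTYPE As String
        Dim DTPOSTED As String

        ' Each <STMTTRN> block yields one match with two named groups.
        ' The DTPOSTED and </STMTTRN> parts of the pattern follow the
        ' walkthrough given later in this thread.
        Dim pattern As String = "<STMTTRN>.*?<TRNTYPE>(?<TRNTYPE>.*?)\n" & _
            ".*?<DTPOSTED>(?<DTPOSTED>.*?)\n" & _
            ".*?</STMTTRN>"
        ' Singleline lets . match new lines, so .*? can span several lines.
        Dim re As New Regex(pattern, _
            RegexOptions.Compiled Or RegexOptions.Singleline)
        Dim input As String

        ' Using closes the file even if reading fails.
        Using sr As New StreamReader("C:\Temp\InputData.dat")
            input = sr.ReadToEnd()
        End Using

        For Each m As Match In re.Matches(input)
            ' Trim drops the carriage return left over from CRLF line endings.
            TRNTYPE = m.Groups("TRNTYPE").Value.Trim()
            DTPOSTED = m.Groups("DTPOSTED").Value.Trim()
            ' Do what you need to do and write to Database
        Next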

I hope that this is of some help.
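Since the plan is to insert the values into a database, the DTPOSTED string will probably need to become a real date first. Here is a minimal sketch, assuming the value is always 14 digits in yyyyMMddHHmmss form like the 20051227170000 sample in the question. The module name and the output format are only for illustration.

```vbnet
Imports System
Imports System.Globalization

Module DtPostedExample
    Sub Main()
        ' Sample value taken from the question.
        Dim DTPOSTED As String = "20051227170000"

        ' ParseExact throws a FormatException if the text does not match the format exactly.
        Dim posted As DateTime = DateTime.ParseExact( _
            DTPOSTED, "yyyyMMddHHmmss", CultureInfo.InvariantCulture)

        Console.WriteLine(posted.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture))
        ' prints 2005-12-27 17:00:00
    End Sub
End Module
```

If some records in the real file carry extra characters after the 14 digits, this sketch will throw, so DateTime.TryParseExact may be a safer choice there.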


Hey JRockFL; does the FL in JRockFL stand for Florida?
JRockFL (Author) commented:
Hey Fernando

Thank you for the reply, that is exactly what I am looking for. I figured I needed a regex. I just found an article on CodeProject that goes into regex and what all the symbols mean. Are there any good reference web sites?

Yes, I'm in Florida, just outside of Tampa.
Fernando Soto (Retired) commented:
Hi JRockFL;

I use this site when I want to test a pattern: [link missing]. I use the Microsoft documentation as my reference, because not all regex flavors are the same: there is Unix, the POSIX standard, and of course Microsoft. The documentation web site is: [link missing]

You could also download a program called The Regulator, a regex pattern tester, at this link: [link missing]

BTW I live in Apopka, just north of Orlando.

Good Luck

JRockFL (Author) commented:
Thanks for the links! I will check them out.
I need a more basic example to understand this...
Dim pattern As String = "<STMTTRN>.*?<TRNTYPE>(?<TRNTYPE>.*?)\n" & _

How would you write it to pull out the word Apopka?


Cool! You enjoying this nice weather too? It got cold yesterday!

Fernando Soto (Retired) commented:

This Regex pattern starts by looking for the literal characters <STMTTRN>. The next symbol is a ., a Regex meta-character that stands for any single character (including a new line here, because the code sets RegexOptions.Singleline). The next symbol, *, stands for 0 or more repetitions of the symbol before it. The ? after it makes the repetition lazy: the Regex engine takes the fewest characters it can before it finds the literal characters <TRNTYPE>.

The next set of symbols, (?<TRNTYPE>.*?), is a named capture group, which is written as (?<TheNameOfTheCapture>The characters to capture). The \n is the Regex escape for the new line character, so the capture ends at the end of the line. Then we look for the next piece of info, <DTPOSTED>, capture its value in another named capture group, and match another new line character. Then we search until we find the end of the block, which is </STMTTRN>.

If there are more characters in the input string, the engine starts again from the beginning of the pattern to look for the next match.
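To see what the ? after .* changes, here is a small sketch that runs the lazy and the greedy version of the TRNTYPE capture against the same text. The two-line input and the module name are made up for the demonstration. Only the pattern fragment comes from the thread.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module LazyVersusGreedy
    Sub Main()
        ' Hypothetical input: two tags, each followed by a single new line (vbLf).
        Dim input As String = "<TRNTYPE>POS" & vbLf & "<TRNTYPE>ATM" & vbLf

        ' Lazy: .*? stops at the first new line after the tag.
        Dim lazyMatch As Match = Regex.Match( _
            input, "<TRNTYPE>(?<TRNTYPE>.*?)\n", RegexOptions.Singleline)

        ' Greedy: .* runs on to the last new line, because Singleline
        ' lets . match new line characters as well.
        Dim greedyMatch As Match = Regex.Match( _
            input, "<TRNTYPE>(?<TRNTYPE>.*)\n", RegexOptions.Singleline)

        Console.WriteLine(lazyMatch.Groups("TRNTYPE").Value)           ' prints POS
        Console.WriteLine(greedyMatch.Groups("TRNTYPE").Value.Length)  ' prints 16
    End Sub
End Module
```

The greedy capture is 16 characters long because it swallows the first new line and the whole second tag. This is why the pattern uses .*? everywhere.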

The pattern string would be "<city>(?<City>\w+)</city>"

In this pattern we search for the literal string <city>. When we find it, the named group captures the word characters that follow: \w stands for one word character and + means 1 or more of them. In .NET, \w matches letters, digits, and the underscore; with RegexOptions.ECMAScript it is exactly the set [a-zA-Z_0-9]. When the engine hits a non-word character, it checks that what follows is </city>.
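Put into code, the city pattern could be tried out like this. It is a minimal sketch: the input string "<city>Apopka</city>" is assumed, since the exact sample line is not shown in the thread, and the module name is only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module CityExample
    Sub Main()
        ' Assumed sample input for the demonstration.
        Dim input As String = "<city>Apopka</city>"

        ' Regex.Match returns the first match; Success is False when nothing matches.
        Dim m As Match = Regex.Match(input, "<city>(?<City>\w+)</city>")

        If m.Success Then
            Console.WriteLine(m.Groups("City").Value) ' prints Apopka
        End If
    End Sub
End Module
```

Note that \w+ does not match spaces or periods, so a city such as "St. Petersburg" would not match at all. For names like that, a capture such as (?<City>[^<]+) would take everything up to the closing tag instead.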

There you have it.

JRockFL (Author) commented:
That was perfect!! Thank you.
Fernando Soto (Retired) commented:
No problem.
JRockFL (Author) commented:
Question has a verified solution.