gr8life
Regular Expression Help
I need help adding a regular expression to the Regex line to find the value “Not in Database”, plus some error checking. Columns 5 and 8 are where the value may be located. The file is tab (vbTab) delimited.
Here is what I have so far:
Imports System.IO
Imports System.Text.RegularExpressions
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
Try
Dim re As New Regex( )
Dim input As String
Dim output As String
Dim sr As New StreamReader("C:\resource\input.txt")
input = sr.ReadToEnd()
sr.Close()
If input <> output Then
Dim sw As New StreamWriter("C:\resource\NIDB.txt")
sw.Write(output)
sw.Close()
End If
Thanks for your time and expertise,
Gr8life
ASKER
After reading the reply from razorback041, I was wondering whether his approach is more efficient for processing large data sets, up to 2 GB in size.
The "Not in Database" value can occur on multiple lines, twice on one line, or not on any line.
Sample data:
cat cat cat Not in Database cat cat Not in Database cat
dog dog dog Not in Database dog dog dog dog
rat rat rat rat rat rat rat rat
I added extra tabs to illustrate the columns better. Also the input file has more columns than I have posted here.
The error check I was referring to is this: if the document doesn’t contain any occurrences of “Not in Database”, show a message box saying “No Matches”.
Thank you for taking the time to read this post,
Gr8life
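For files approaching 2 GB, reading the whole file with ReadToEnd may be costly in memory, so one alternative is to stream the file line by line. The following is only a minimal sketch under stated assumptions, not the approach from any reply in this thread: it assumes "columns 5 and 8" are 1-based (zero-based indexes 4 and 7), that the cell holds exactly the text "Not in Database" apart from surrounding whitespace, and it uses Console.WriteLine in place of the message box so that it runs as a console program. The names StreamingFilter, CopyMatchingLines, and IsHit are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module StreamingFilter
    ' Copies every line whose 5th or 8th tab-separated column is
    ' "Not in Database" from reader to writer.
    ' Returns the number of lines copied.
    Function CopyMatchingLines(ByVal reader As TextReader, ByVal writer As TextWriter) As Integer
        Dim count As Integer = 0
        Dim line As String = reader.ReadLine()
        While line IsNot Nothing
            Dim cols() As String = line.Split(ControlChars.Tab)
            ' Assumption: columns 5 and 8 are 1-based, so indexes 4 and 7.
            If IsHit(cols, 4) OrElse IsHit(cols, 7) Then
                writer.WriteLine(line)
                count += 1
            End If
            line = reader.ReadLine()
        End While
        Return count
    End Function

    ' True when the column exists and holds the phrase.
    Private Function IsHit(ByVal cols() As String, ByVal index As Integer) As Boolean
        Return index < cols.Length AndAlso cols(index).Trim() = "Not in Database"
    End Function

    Sub Main()
        ' Paths are the ones used in the question.
        Using sr As New StreamReader("C:\resource\input.txt"), sw As New StreamWriter("C:\resource\NIDB.txt")
            If CopyMatchingLines(sr, sw) = 0 Then
                ' In the button handler this would be MessageBox.Show("No Matches").
                Console.WriteLine("No Matches")
            End If
        End Using
    End Sub
End Module
```

Only one line is held in memory at a time. Note that this sketch creates an empty output file when nothing matches.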
Hi gr8life;
The following code reads the input data into the variable input, searches it for "Not in Database" with the Regex object, and then writes every line that contains the phrase to a file.
Imports System.Text.RegularExpressions
Imports System.IO
Dim re As New Regex("^.*?Not\sin\sDatabase.*?$", RegexOptions.Multiline)
Dim sr As New StreamReader("C:\resource\input.txt")
Dim input As String = sr.ReadToEnd()
sr.Close()
Dim mc As MatchCollection
Dim sw As New StreamWriter("C:\resource\input_NotInDB.txt")
mc = re.Matches(input)
For Each m As Match In mc
' If you have any matches in the input data the whole line that it
' appears in will be one of the matches from the match collection.
sw.Write(m.Value)
Next
sw.Close()
Fernando
ASKER
FernandoSoto, I tried the code you posted and it works great; however, it writes all the data to one line. Is there a way to write the data on several lines, like this:
cat cat cat Not in Database cat cat Not in Database cat
dog dog dog Not in Database dog dog dog dog
instead of:
cat cat cat Not in Database cat cat Not in Database cat dog dog dog Not in Database dog dog dog dog
Thanks for your time,
Gr8life
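One possible way to get one matched line per output line is sketched below. It is not necessarily what the accepted answer did. It assumes the input file uses Windows CRLF line endings: in that case each match value ends in a carriage return without a line feed, because "$" in Multiline mode stops before the line feed only, and sw.Write adds no terminator of its own. The sketch trims the trailing carriage return and uses WriteLine instead. The names WriteMatchesPerLine, MatchingLines, and WriteLines are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module WriteMatchesPerLine
    ' Returns each input line containing "Not in Database",
    ' with any trailing carriage return stripped.
    Function MatchingLines(ByVal input As String) As List(Of String)
        Dim re As New Regex("^.*?Not\sin\sDatabase.*?$", RegexOptions.Multiline)
        Dim lines As New List(Of String)
        For Each m As Match In re.Matches(input)
            ' With CRLF input the match value still ends in vbCr.
            lines.Add(m.Value.TrimEnd(ControlChars.Cr))
        Next
        Return lines
    End Function

    ' Writes one matched line per output line.
    Sub WriteLines(ByVal writer As TextWriter, ByVal lines As List(Of String))
        For Each line As String In lines
            writer.WriteLine(line)
        Next
    End Sub

    Sub Main()
        Dim input As String = "cat" & vbTab & "Not in Database" & vbCrLf & "rat" & vbTab & "rat" & vbCrLf & "dog" & vbTab & "Not in Database" & vbCrLf
        ' Console.Out stands in for the StreamWriter used in the thread.
        WriteLines(Console.Out, MatchingLines(input))
    End Sub
End Module
```

Passing a StreamWriter instead of Console.Out sends the same lines to a file.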
ASKER CERTIFIED SOLUTION
In fact, because of the way the pattern is set up, if you omit RegexOptions.Multiline when creating the Regex object, the ^ and $ anchors apply to the whole input as if it were a single line rather than to each line.
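The effect of RegexOptions.Multiline on this pattern can be checked with a small console sketch. The sample text and the names MultilineDemo and CountMatches are made up for illustration. Without the option, "^" anchors only at the start of the input and "." cannot cross a line feed, so a multi-line input yields no per-line matches.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module MultilineDemo
    ' Counts matches of the thread's pattern under the given options.
    Function CountMatches(ByVal input As String, ByVal options As RegexOptions) As Integer
        Dim re As New Regex("^.*?Not\sin\sDatabase.*?$", options)
        Return re.Matches(input).Count
    End Function

    Sub Main()
        ' Three lines; the first and last contain the phrase.
        Dim input As String = "dog" & vbTab & "Not in Database" & vbLf & "rat" & vbTab & "rat" & vbLf & "cat" & vbTab & "Not in Database"

        ' With Multiline, ^ and $ anchor at every line.
        Console.WriteLine(CountMatches(input, RegexOptions.Multiline)) ' prints 2

        ' Without it, they anchor to the whole input only.
        Console.WriteLine(CountMatches(input, RegexOptions.None))      ' prints 0
    End Sub
End Module
```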
ASKER
It works great!
Thank you very much,
Gr8life
I'm not sure what you want. In your statement 'find the value "Not in Database" and error checking', do you mean that if column 5 has a value, it will contain the phrase "Not in Database"? And if column 5 contains this phrase, do you then want the value in that column?
Can this happen on multiple lines in the file?
You define two string variables:
Dim input As String
Dim output As String
You assign a value to input and later compare it to output, but you never assign a value to output.
Let me know; I would be glad to help with a Regex pattern.
Fernando