LD147

Cleaning up text file with Visual Basic .net

I have a text file which contains the following:

<start>
-------------------------------------------------------
ID Num: BF00000000  Readings: 0000  Records: 000
Interval: 1 hour
------------------------------------------------------
Date & Time, Sample, Packet, ID, Reading, Status
------------------------------------------------------
ID Num: BF00000000  Readings: 0001  Records: 000
Interval: 1 hour
------------------------------------------------------
Date & Time, Sample, Packet, ID, Reading, Status
------------------------------------------------------
ID Num: CA07268139  Readings: 0004  Records: 001
Interval: 1 hour
------------------------------------------------------
Date & Time, Sample, Packet, ID, Reading, Status
2010-02-08 10:02, 0001, 001, 0007268139, 00000972, 063
2010-02-08 11:02, 0002, 001, 0007268139, 00000972, 063
2010-02-08 12:02, 0003, 001, 0007268139, 00000972, 063
2010-02-08 13:02, 0004, 001, 0007268139, 00000972, 063
2010-02-08 14:02, 0005, 001, 0007268139, 00000000, 000
2010-02-08 15:02, 0006, 001, 0007268139, 00000000, 000
2010-02-08 16:02, 0007, 001, 0007268139, 00000000, 000
2010-02-08 17:02, 0008, 001, 0007268139, 00000000, 000
<end>

Basically, any line that begins with a date needs to be kept.  All other information has to be discarded, so I will end up with a text file that looks like this:

2010-02-08 10:02, 0001, 001, 0007268139, 00000972, 063
2010-02-08 11:02, 0002, 001, 0007268139, 00000972, 063
2010-02-08 12:02, 0003, 001, 0007268139, 00000972, 063
2010-02-08 13:02, 0004, 001, 0007268139, 00000972, 063
2010-02-08 14:02, 0005, 001, 0007268139, 00000000, 000
2010-02-08 15:02, 0006, 001, 0007268139, 00000000, 000
2010-02-08 16:02, 0007, 001, 0007268139, 00000000, 000
2010-02-08 17:02, 0008, 001, 0007268139, 00000000, 000

What's the best way to do this?  I tried removing the junk lines with the attached code but it doesn't seem to work.  This is while reading the file line by line.  I've only included what I deem to be the relevant code (that does the cleaning).  I don't need to keep the header lines either.  Thanks a bunch!


            If Microsoft.VisualBasic.Left(ioLine, 7) = "ID Num:" Then
                ioLine = ""
            End If
            If Microsoft.VisualBasic.Left(ioLine, 3) = "---" Then
                ioLine = ""
            End If
            If Microsoft.VisualBasic.Left(ioLine, 4) = "Date" Then
                ioLine = ""
            End If


.NET Programming, Visual Basic Classic


8/22/2022 - Mon
hes

Can you just check for the year 2010?

If Microsoft.VisualBasic.Left(ioLine, 4) = "2010" Then
    ' write it back out
End If
LD147

ASKER
Well, until December 31, I can look just for 2010.  Next year will be different ;)  I guess I could do something like If Microsoft.VisualBasic.Left(ioLine, 4) = "2010" Or Microsoft.VisualBasic.Left(ioLine, 4) = "2011" Or Microsoft.VisualBasic.Left(ioLine, 4) = "2012", etc., and just put a few years on, but I'm sure there's a more elegant way to do it.
ASKER CERTIFIED SOLUTION
13598

hes

Microsoft.VisualBasic.Left(ioLine, 4) = Date.Now.Year.ToString
13598

That would only work if the program runs over the file the same day.
Based on the sample file I didn't think it did but maybe it does.
What I mean is that if the program runs on January 1st over data from the previous day, December 31st, then checking for the current year won't work.
If that is the case and you don't want to just check for numeric, then you could check for the current or previous year:
Mid(ioLine, 1, 4) = Now.Year.ToString() OrElse Mid(ioLine, 1, 4) = (Now.Year - 1).ToString()
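
As a sketch of a check that does not depend on which year it is, one option is to try parsing the timestamp at the start of the line instead of comparing against specific years. The helper name StartsWithTimestamp is hypothetical, and the 16-character "yyyy-MM-dd HH:mm" format is an assumption taken from the sample lines in the question:

```vbnet
Imports System
Imports System.Globalization

Module DateLineCheck
    ' Hypothetical helper: True when the line begins with a timestamp such as
    ' "2010-02-08 10:02" (format assumed from the sample data in the question).
    Function StartsWithTimestamp(ByVal line As String) As Boolean
        If line Is Nothing OrElse line.Length < 16 Then Return False
        Dim parsed As DateTime
        Return DateTime.TryParseExact(line.Substring(0, 16), "yyyy-MM-dd HH:mm", _
                                      CultureInfo.InvariantCulture, _
                                      DateTimeStyles.None, parsed)
    End Function

    Sub Main()
        ' A data line: prints True
        Console.WriteLine(StartsWithTimestamp("2010-02-08 10:02, 0001, 001, 0007268139, 00000972, 063"))
        ' Header and separator lines: both print False
        Console.WriteLine(StartsWithTimestamp("ID Num: BF00000000  Readings: 0000  Records: 000"))
        Console.WriteLine(StartsWithTimestamp("------------------------------------------------------"))
    End Sub
End Module
```

Inside the read loop this would replace the year comparison, e.g. If StartsWithTimestamp(ioLine) Then ... It is stricter than a numeric test on the first four characters, so it may reject lines if the logger ever changes its date format.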
LD147

ASKER
That seems to work wonderfully well!  Thanks a ton.
LD147

ASKER
13598:  your solution from earlier works well.  Only one thing: I always end up with a row of hyphens at the top (it seems to be the very first row in the file), but I'll figure out how to remove it.  Thanks :)
13598

Without the code it would be hard to help. Maybe you can step through your code and see where, how and why the first line is being written.
LD147

ASKER
The other dotted lines are removed, no problem, just not the first line.  I always end up with this:

------------------------------------------------------
2010-02-08 10:02, 0001, 001, 0007268139, 00000972, 063
2010-02-08 11:02, 0002, 001, 0007268139, 00000972, 063
2010-02-08 12:02, 0003, 001, 0007268139, 00000972, 063
2010-02-08 13:02, 0004, 001, 0007268139, 00000972, 063
2010-02-08 14:02, 0005, 001, 0007268139, 00000000, 000
2010-02-08 15:02, 0006, 001, 0007268139, 00000000, 000
2010-02-08 16:02, 0007, 001, 0007268139, 00000000, 000
2010-02-08 17:02, 0008, 001, 0007268139, 00000000, 000

Not a biggie, although if you come up with a solution before me, feel free to post it ;)
' Load log file, clean, and show...
        Dim ioFile As New StreamReader("C:\probe\LOG_FILE.CSV")
        Dim ioLine As String ' Going to hold one line at a time
        Dim ioLines As String ' Going to hold whole file
        ioLine = ioFile.ReadLine
        ioLines = ioLine

        While Not ioLine = ""
            ioLine = ioFile.ReadLine

            If IsNumeric(Mid(ioLine, 1, 4)) Then ' only keep lines beginning with numbers (ie, the date)
                ioLines = ioLines & vbCrLf & ioLine
            End If

        End While

        txtMain.Text = ioLines ' show clean log file in window


13598

Without knowing the rest of your code, I would just make sure everything is initialized and there is no garbage left over. Try this (you can never do too much cleaning): give your string variables an empty value in the declaration. Also, you are skipping the check on your very first line. You read it outside your While loop and then read the next line without analyzing the first one:
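        Dim ioFile As New StreamReader("C:\probe\LOG_FILE.CSV")
        Dim ioLine As String = ""          ' Going to hold one line at a time
        Dim ioLines As String = ""         ' Going to hold whole file
        ioLine = ioFile.ReadLine           ' First line; it is tested in the loop like every other line
        ' Note: ioLines is no longer seeded with the first line, which is what kept the row of hyphens

        While Not ioLine = ""              ' Stops at end of file (or at the first blank line)

            If IsNumeric(Mid(ioLine, 1, 4)) Then ' only keep lines beginning with numbers (ie, the date)
                If ioLines <> "" Then ioLines = ioLines & vbCrLf ' separator between kept lines only
                ioLines = ioLines & ioLine
            End If
            ioLine = ioFile.ReadLine       ' Read the next line at the end of the loop

        End While
        ioFile.Close()
        txtMain.Clear()                    ' Assumes txtMain is a TextBox
        txtMain.Text = ioLines ' show clean log file in window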
 
LD147

ASKER
Many thanks.  It's ok now :)
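
For anyone landing on this thread later, here is a minimal self-contained sketch of the same idea that writes the cleaned lines back to disk, as the original question asked. It keeps the thread's test (first four characters numeric) and loops until ReadLine returns Nothing, so a blank line in the middle of the log would not stop processing early. The module name, the function name KeepDataLines, and the output path C:\probe\LOG_CLEAN.CSV are made up for illustration; only the input path comes from the code above.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic

Module CleanLog
    ' Hypothetical helper: returns only the lines whose first four
    ' characters are numeric (the year at the start of a data line).
    Function KeepDataLines(ByVal inputPath As String) As List(Of String)
        Dim kept As New List(Of String)
        Using reader As New StreamReader(inputPath)
            Dim line As String = reader.ReadLine()
            ' ReadLine returns Nothing only at end of file, so blank lines
            ' inside the log do not end the loop.
            While line IsNot Nothing
                If line.Length >= 4 AndAlso IsNumeric(line.Substring(0, 4)) Then
                    kept.Add(line)
                End If
                line = reader.ReadLine()
            End While
        End Using ' closes the file even if an exception occurs
        Return kept
    End Function

    Sub Main()
        Dim kept As List(Of String) = KeepDataLines("C:\probe\LOG_FILE.CSV")
        ' Output path is an assumption for this sketch
        File.WriteAllLines("C:\probe\LOG_CLEAN.CSV", kept.ToArray())
    End Sub
End Module
```

To show the result in a text box instead of writing a file, something like txtMain.Text = String.Join(vbCrLf, kept.ToArray()) should work, assuming txtMain is a multiline TextBox as in the code above.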