Brothernod
asked on
Parse text file, trim each line, remove carriage returns, add CRLF after delimiter character?
I believe this is a fairly simple question, but unfortunately I have no experience working with text files so I'm asking for help rather than figuring it out myself.
I need to read in a text file (it's going to be 5 megs up to a couple hundred).
I need to do the following with it
1) I need to remove the trailing spaces from each line.
2) Delete all carriage returns and line feeds.
3) Add a Carriage Return/ Line Feed after any instance of "</END??>" where the ?? can be anything.
I believe each line is around 1000 bytes. Speed doesn't really matter.
I mention the byte count since I think this is a large file and should use something like a string buffer no? Not really familiar with anything other than ordinary strings.
I'll gladly bump up the point value for a quick quality solution.
here ya go
Dim source, target
source = "myfile.txt"
target = "myfile2.txt"
Dim fso, f, f2
Dim x, y
Const ForReading = 1, ForWriting = 2
Set fso = CreateObject("Scripting.FileSystemObject")
If fso.FileExists(source) Then
    Dim inputLine, outputLine
    Set f = fso.OpenTextFile(source, ForReading, False)
    Set f2 = fso.OpenTextFile(target, ForWriting, True)
    While Not f.AtEndOfStream
        inputLine = f.ReadLine
        outputLine = parseme(inputLine)
        f2.Write outputLine
    Wend
    f.Close
    f2.Close
    Set f2 = Nothing
    Set f = Nothing
    MsgBox "Done"
Else
    MsgBox source, vbOKOnly, "Source File Not Found"
End If
Set fso = Nothing
Function parseme(strtext)
    Dim x, y
    ' remove trailing spaces
    parseme = RTrim(strtext)
    ' remove carriage returns
    parseme = Replace(parseme, vbCr, "")
    ' remove line feeds
    parseme = Replace(parseme, vbLf, "")
    ' find "</END??>" and add vbCrLf after its closing ">"
    x = InStr(parseme, "</END") ' 0 if no tag is present
    ' loop to handle multiple "</END" tags on one line
    Do Until x = 0
        y = InStr(x, parseme, ">")
        If y = 0 Then Exit Do ' unterminated tag: leave the rest untouched
        ' keep everything through the ">", insert the break, then append the remainder
        parseme = Left(parseme, y) & vbCrLf & Mid(parseme, y + 1)
        x = InStr(y, parseme, "</END")
    Loop
End Function
Hmm, I wrote this as a simple VBScript, but I'm sure it won't need much changing to work in VB.
Just name the file something.vbs and edit the source and target file names,
or set them up as arguments or something.
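For anyone who wants the same line-by-line approach in VB.NET, here is a minimal sketch, not a tested drop-in. It assumes the tag is always "</END" followed by exactly two characters and ">", and that a tag never straddles two input lines. The module name `ReflowFile` and the helper `CleanLine` are made up for illustration; the file names are the ones from the script above.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module ReflowFile
    ' Assumed tag shape: "</END", any two characters, then ">".
    Private ReadOnly EndTag As New Regex("</END..>")

    ' Hypothetical helper: trim trailing spaces, then append CRLF after every end tag.
    Function CleanLine(ByVal line As String) As String
        ' "$&" in a .NET replacement pattern stands for the whole match.
        Return EndTag.Replace(line.TrimEnd(" "c), "$&" & vbCrLf)
    End Function

    Sub Main()
        Dim source As String = "myfile.txt"
        Dim target As String = "myfile2.txt"

        If Not File.Exists(source) Then
            Console.WriteLine("Source file not found: " & source)
            Return
        End If

        ' Stream one line at a time so the whole file never sits in memory.
        Using reader As New StreamReader(source)
            Using writer As New StreamWriter(target)
                Dim line As String = reader.ReadLine()
                While line IsNot Nothing
                    ' ReadLine drops the CR/LF, and Write does not add one back,
                    ' so the only line breaks in the output follow the end tags.
                    writer.Write(CleanLine(line))
                    line = reader.ReadLine()
                End While
            End Using
        End Using
    End Sub
End Module
```

Because each line is written out as soon as it is cleaned, there is no need for a string buffer even with a file of a couple hundred megs.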
ASKER
I won't lie, I was really hoping for a nice clean vb.net example.
BUT
I needed this quick, and weelio did provide a quick complete solution.
Thank you.
ASKER
It's in VBA not VB.NET but I guess it was close enough since I'm in a hurry :)
Dim line as String
To remove new lines and CRs, use:
line = line.Replace(Chr(10).ToString(), "")
line = line.Replace(Chr(13).ToString(), "")
Then remove trailing spaces
line = line.TrimEnd()
about the </END??> issue, you can do it with an algorithm:
Dim i As Integer = 0
Do
    i = line.IndexOf("</END", i)
    If i < 0 Then Exit Do
    i = line.IndexOf(">", i)
    If i < 0 Then Exit Do
    line = line.Insert(i + 1, Chr(13) & Chr(10))
Loop While True
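The search-and-insert loop above could be wrapped into a function along these lines. This is only a sketch: the names `EndTagBreaks` and `AddBreaksAfterEndTags` and the sample string are invented for illustration, and it assumes the first ">" after "</END" is the one that closes the tag.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module EndTagBreaks
    ' Hypothetical helper: inserts CRLF after the ">" that closes each "</END...>" tag.
    Function AddBreaksAfterEndTags(ByVal line As String) As String
        Dim i As Integer = 0
        Do
            ' Find the next end tag at or after position i.
            i = line.IndexOf("</END", i, StringComparison.Ordinal)
            If i < 0 Then Exit Do
            ' Find the ">" that closes it.
            i = line.IndexOf(">"c, i)
            If i < 0 Then Exit Do
            line = line.Insert(i + 1, vbCrLf)
            ' Resume the search just past the inserted line break.
            i += 1 + vbCrLf.Length
        Loop
        Return line
    End Function

    Sub Main()
        ' Made-up sample line with two tags and trailing spaces.
        Dim sample As String = "a</END01>b</END02>   "
        Console.Write(AddBreaksAfterEndTags(sample.TrimEnd(" "c)))
    End Sub
End Module
```

Moving `i` past the inserted break is not strictly required for correctness, since the next search for "</END" would skip over the CRLF anyway, but it makes it obvious that the loop always advances.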