Solved

Modifying lines in text file

Posted on 2007-03-23
Last Modified: 2010-04-16
Hi,

In my text file (it's rather big) I'd like to make some modifications based on certain conditions.
I have 2 scenarios (I need a simple script for each one separately):
1) Delete a certain character (let's say the tab character or the digit "9") from a line if the length of the line is less than 200, and save the output into a new text file.
2) If the length of the line is less than 200, append the following line to it (again, save the output into a new text file).
I'd prefer using VBScript or something else that would work with huge text files.

Thanks.
Question by:janime
43 Comments
 
LVL 22

Expert Comment

by:WMIF
ID: 18783979
something like this should get you going.

set fs = Server.CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\test.txt",1,false)
do until f.atendofstream
  curline = f.readline
  if len(curline) < 200 then
    curline = replace(curline, vbtab, "")
    curline = replace(curline, "9", "")
  end if
loop
 
LVL 22

Expert Comment

by:WMIF
ID: 18783984
oops, forgot the writing to the new file.

set fs = Server.CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\test.txt",1,false)
set f2 = fs.CreateTextFile("c:\testoutput.txt",true) '<< output must be a different file than the one being read
do until f.atendofstream
  curline = f.readline
  if len(curline) < 200 then
    curline = replace(curline, vbtab, "")
    curline = replace(curline, "9", "")
  end if
  f2.writeline(curline)
loop
 
LVL 13

Expert Comment

by:rettiseert
ID: 18784007
    Dim fso
    Dim txtStream
    Dim fileIn
    Dim fileOut
    Dim LineIn
    Dim LineOut
    Dim txtStreamOut
    Dim txtOut
   
    fileIn = "c:\test.txt"
    fileOut = "c:\testoutput.txt"
   
    Const ForWriting = 2
   
    Set fso = CreateObject("Scripting.FileSystemObject")
   
    Set txtStream = fso.OpenTextFile(fileIn)

    Do While Not (txtStream.atEndOfStream)
       
        LineIn = txtStream.ReadLine
       
        LineOut = LineIn
       
        If Len(LineIn) > 200 Then
            LineOut = Replace(LineOut, vbTab, "")   'Delete tab
            LineOut = Replace(LineOut, "9", "")     'Delete the digit 9
        End If
       
        txtOut = txtOut + LineOut + vbCrLf
       
    Loop
   
    Set txtStream = Nothing
   
    Set txtStreamOut = fso.OpenTextFile(fileOut, ForWriting, True)

    txtStreamOut.WriteLine txtOut
   
    Set txtStreamOut = Nothing
 
LVL 13

Expert Comment

by:rettiseert
ID: 18784008
oops... a little late
 
LVL 13

Expert Comment

by:rettiseert
ID: 18784021
and in my code you should change

If Len(LineIn) > 200 Then

to

If Len(LineIn) < 200 Then

 
LVL 22

Expert Comment

by:WMIF
ID: 18784052
for the other you will need a recursive function if i understand what you are trying to do.  that is, i think you are trying to make lines at least 200 characters long.  so you would want to add the next line if the current line is less than 200.  then you would like to keep adding lines until the line is longer than 200.  is that true?

set fs = Server.CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\test.txt",1,false)
set f2 = fs.CreateTextFile("c:\testoutput.txt",true) '<< output must be a different file than the one being read
function getnextline(curline)
  curline = curline & f.readline
  if len(curline) < 200 then getnextline(curline)
  getnextline = curline
end function
do until f.atendofstream
  curline = f.readline
  if len(curline) < 200 then curline = getnextline(curline)
  f2.writeline(curline)
loop
f.close
f2.close
 

Author Comment

by:janime
ID: 18784214
Thanks guys, I'm testing the first script. WMIF, I'm getting an error regarding "Server.CreateObject..."  I wanna run it on my local computer..
 
LVL 22

Expert Comment

by:WMIF
ID: 18784222
sorry, i have a habit with asp code.  almost the same as vbscript alone, just with the "server." stuff. :)  if you remove that it should run fine.
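
As a quick check that the object can be created outside ASP, something like the following could be run locally. This is only a sketch: the file name check.vbs is hypothetical, and it assumes the script is started with Windows Script Host (cscript or wscript).

```vbscript
' Save as check.vbs (hypothetical name) and run from a command prompt:
'   cscript //nologo check.vbs
' Under Windows Script Host there is no "Server" object, so plain CreateObject is used.
Set fs = CreateObject("Scripting.FileSystemObject")

' Report whether the input file from the scripts above is visible to the script.
If fs.FileExists("c:\test.txt") Then
  WScript.Echo "found c:\test.txt"
Else
  WScript.Echo "c:\test.txt not found"
End If
```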
 

Author Comment

by:janime
ID: 18784226
By the way, how do I represent the end-of-line code / carriage return (in hex it's 0D0A)?
 

Author Comment

by:janime
ID: 18784245
I will narrow it down to that "carriage return". I'd like to delete it from the shorter line and thus append the following line to it. Yes, this way even the shortest line would contain at least 200 characters..
 
LVL 22

Expert Comment

by:WMIF
ID: 18784383
in vbscript you can use a reserved keyword to represent each, or together.  though i believe that using the readline method reads the entire line minus the cr and lf at the end.  is this not the case with the code above?

vbcrlf
vbcr
vblf
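
To illustrate what those constants hold, here is a small sketch (run with cscript; the sample string is made up for the demonstration):

```vbscript
' vbCr is character 13 (hex 0D), vbLf is character 10 (hex 0A),
' and vbCrLf is the two of them together.
WScript.Echo Hex(Asc(vbCr))    ' D
WScript.Echo Hex(Asc(vbLf))    ' A
WScript.Echo Len(vbCrLf)       ' 2

' Removing a line break that is embedded in a string:
sample = "first part" & vbCrLf & "second part"   ' made-up sample text
joined = Replace(sample, vbCrLf, " ")
WScript.Echo joined            ' first part second part
```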
 

Author Comment

by:janime
ID: 18784623
Yes, you're right.  The readline method skips the CR at the end,  it uses CR to recognize the end of the line. So the script doesn't work in that case. I need something which would read entire line - with CR at the end - so I can delete CR if line is less than 200 characters long.
 
LVL 22

Expert Comment

by:WMIF
ID: 18785594
did you try my second script?  you dont need the cr because you are reading one line at a time.  you know how long it is.
 

Author Comment

by:janime
ID: 18786099
Thank you, WMIF, the second script works for this case, so I guess we have both cases covered. Great!
 

Author Comment

by:janime
ID: 18786167
Ooops, it looks like the script doesn't work properly if the following line starts with a "tab" character.  OR maybe it is something else, as when I change the length to 270 (this is the length of the following line) I'm getting this error: "Input past end of the file". (Line 7, Char 8 - which is getnextline = curline)..
 
LVL 22

Expert Comment

by:WMIF
ID: 18793993
just a bit more checking will fix that.

set fs = CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\test.txt",1,false)
set f2 = fs.CreateTextFile("c:\testoutput.txt",true)
function getnextline(curline)
  if not f.atendofstream then curline = curline & f.readline '<< check for end of file before getting line
  if len(curline) < 200 then getnextline(curline)
  getnextline = curline
end function
do until f.atendofstream
  curline = f.readline
  if len(curline) < 200 then curline = getnextline(curline)
  f2.writeline(curline)
loop
f.close
f2.close
 

Author Comment

by:janime
ID: 18796627
And here is the last condition. I found out that in my text file there are lines like this (for illustration I'm using 80 characters):

A   Y   N   2006   05   Market Stock US  -10125.68   John Smith   A   11:34:55AM  USD  N
A   Y   N   2006   06   Inventory Reconciliation Stock   2725.68   Lisa
Bernard   A   10:22:43AM  

USD  N
A   Y   N   2006   04   Investment Fonds  -95125.25  Theodore Johnson   A   4:04:12PM  USD  N

In this case the bottom line appends to "USD  N" which is not correct. Before "USD N" there is an empty line..

The shifted line should look like this:
A   Y   N   2006   06   Inventory Reconciliation Stock   2725.68   Lisa Bernard   A   10:22:43AM  USD  N


Thank you..


 
LVL 22

Expert Comment

by:WMIF
ID: 18796682
i think i might have found the problem.  i dont think that it was the extra space causing the problem, just happened to work out there.  try this:

set fs = CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\test.txt",1,false)
set f2 = fs.CreateTextFile("c:\testoutput.txt",true)
function getnextline(curline)
  if not f.atendofstream then curline = curline & f.readline
  if len(curline) < 200 then curline = getnextline(curline) '<< not setting the result of function to curline
  getnextline = curline
end function
do until f.atendofstream
  curline = f.readline
  if len(curline) < 200 then curline = getnextline(curline)
  f2.writeline(curline)
loop
f.close
f2.close
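
Why the missing assignment mattered can be shown with a tiny stand-alone sketch. The function AddMark and the variables are made up for the demonstration. In VBScript, parameters are ByRef by default, but a statement call written as name(arg) passes the parenthesised argument by value and throws the return value away, so nothing comes back to the caller.

```vbscript
Function AddMark(s)          ' s is ByRef by default
  s = s & "!"
  AddMark = s
End Function

Dim a, b
a = "x"
AddMark(a)                   ' statement call: (a) is evaluated as an expression, result discarded
WScript.Echo a               ' x

b = AddMark(a)               ' result is captured; a is also changed through the ByRef parameter
WScript.Echo a & " " & b     ' x! x!
```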
 

Author Comment

by:janime
ID: 18797263
Hmmm, looks like it works except when the line starts with a tab character (\t) - all the lines starting with a "tab" were not shifted. That's strange. Can you see why?
 

Author Comment

by:janime
ID: 18797293
Or I do not know. Can I send you the sample file somehow?
 
LVL 22

Expert Comment

by:WMIF
ID: 18800735
you can upload the file to www.ee-stuff.com and associate it with the question id.

that is strange with the tab character not working.  let me know when it is up there.
 

Author Comment

by:janime
ID: 18804587
Thank you,  here is the file..

https://filedb.experts-exchange.com/incoming/ee-stuff/3002-test_sample.zip 

The "cut-off" line position is probably greater than 200, you'll have to find out..
 
LVL 22

Expert Comment

by:WMIF
ID: 18809640
i just ran the script against that file, and it appears to be working just fine.  the only thing that i think needs to be added is a space in between the lines so the values dont get run together.

what line number were you having a problem with?


if you want to add a space between the values:
set fs = CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\scripts\test\test_sample.txt",1,false)
set f2 = fs.CreateTextFile("c:\scripts\test\test.txt",true)
function getnextline(curline)
  if not f.atendofstream then curline = curline & " " & f.readline
  if len(curline) < 200 then curline = getnextline(curline)
function to curline
  getnextline = curline
end function
do until f.atendofstream
  curline = f.readline
  if len(curline) < 200 then curline = getnextline(curline)
  f2.writeline(curline)
loop
f.close
f2.close
 

Author Comment

by:janime
ID: 18813016
I'm getting "Syntax error" on line 7 char 1. I probably sent you a wrong file. I tried to upload a new one - no success. I spent one hour trying to get various file sizes (txt or zip), 500KB ~ 3 MB, through; unfortunately it did not work. It drives me crazy!  What an EE improvement!!! Do you know about any other file exchange site? Thank you.
 
LVL 22

Expert Comment

by:WMIF
ID: 18817870
i dont know of any other sites myself.  i see what the problem is on line 7.  it was part of the comment that i made earlier and it didnt get deleted properly.  you can remove all of line 7: "function to curline".
 

Author Comment

by:janime
ID: 18821159
Try to run your script on this file http://www.ducted.com/files/jtest2.zip 

Thanks.
 
LVL 22

Expert Comment

by:WMIF
ID: 18824303
i dont get any errors with that file either.
 

Author Comment

by:janime
ID: 18827320
Please have a closer look at lines 116  or 235 (after running your script they are still shifted).
 
LVL 22

Expert Comment

by:WMIF
ID: 18837597
ah, try this now.  the space was in the wrong place.

set fs = CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\scripts\test\jtest2.txt",1,false)
set f2 = fs.CreateTextFile("c:\scripts\test\test.txt",true)
function getnextline(curline)
  if not f.atendofstream then curline = curline & f.readline & " "
  if len(curline) < 200 then curline = getnextline(curline)
  getnextline = curline
end function
do until f.atendofstream
  curline = f.readline
  if len(curline) < 200 then curline = getnextline(curline)
  f2.writeline(curline)
loop
f.close
f2.close

 

Author Comment

by:janime
ID: 18840367
Thanks, but nothing's changed - those 2 lines are still shifted in the final test.txt file. I took a closer look - it's because the length of those 2 lines in the original file jtest2.txt is 259. But when I change the condition in the script to "<260" I'm getting an 'Out of memory: "f.atendofstream"' error. Hmmm, I'll keep trying to play around with it but for now it still doesn't work the way I need it.
 
LVL 22

Expert Comment

by:WMIF
ID: 18847441
i had to expand the if statement to include the next couple lines.  everything seems to be working fine now.

set fs = CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\scripts\test\jtest2.txt",1,false)
set f2 = fs.CreateTextFile("c:\scripts\test\test.txt",true)
function getnextline(curline)
  if not f.atendofstream then
    curline = curline & f.readline & " "
    if len(curline) < 350 then curline = getnextline(curline)
    getnextline = curline
  else
    getnextline = curline '<< at end of file, return what has been collected so far
  end if
end function
do until f.atendofstream
  curline = f.readline
  if len(curline) < 350 then curline = getnextline(curline)
  f2.writeline(curline)
loop
f.close
f2.close
 

Author Comment

by:janime
ID: 18870690
Still getting those 2 lines shifted... (59 and 119 in test.txt)....
 
LVL 22

Expert Comment

by:WMIF
ID: 18877140
can you explain further what you mean by it being shifted.  running the code i pasted above, i cannot find any problems with the output.  i need to know exactly what you are seeing to be able to fix this.
 

Author Comment

by:janime
ID: 18878731
this is the output I'm getting:
 http://www.ducted.com/files/test.zip  

see the "shifted" line:
http://www.ducted.com/files/test.txt_output.JPG

I'm using the last script posted with jtest2.txt file..

 
LVL 22

Expert Comment

by:WMIF
ID: 18878834
i have run the script several more times and come up with the same result.  i have even opened the test.zip file you posted and get no problems.  i looked at your screen shot and see what you are talking about, but i cant duplicate it.  i suspect that it has something to do with textpad since i am using notepad to view my results.

im working with a hex editor now to see whats going on.
 
LVL 22

Expert Comment

by:WMIF
ID: 18878901
alright, it looks like the readline method is grabbing the line and stripping the LF character, but it is leaving the CR character.  im not sure why its leaving that, but this function fixes that.  notepad is a bit forgiving i guess where textpad breaks the line at either LF or CR.

set fs = CreateObject("Scripting.FileSystemObject")
set f = fs.OpenTextFile("c:\scripts\test\jtest2.txt",1,false)
set f2 = fs.CreateTextFile("c:\scripts\test\test.txt",true)
function getnextline(curline)
  if not f.atendofstream then
    curline = curline & replace(f.readline,vbcr,"") & " "
    if len(curline) < 350 then curline = getnextline(curline)
    getnextline = curline
  else
    getnextline = curline '<< at end of file, return what has been collected so far
  end if
end function
do until f.atendofstream
  curline = replace(f.readline,vbcr,"")
  if len(curline) < 350 then curline = getnextline(curline)
  f2.writeline(curline)
loop
f.close
f2.close
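
To see which lines still carry a stray CR after readline, a small diagnostic along these lines might help. It is only a sketch: it reuses the input path from the script above and keeps its own line counter.

```vbscript
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.OpenTextFile("c:\scripts\test\jtest2.txt", 1, False)

lineNo = 0
strayCount = 0
Do Until f.AtEndOfStream
  curline = f.ReadLine
  lineNo = lineNo + 1
  ' InStr returns 0 when the line holds no CR character
  If InStr(curline, vbCr) > 0 Then
    strayCount = strayCount + 1
    WScript.Echo "stray CR on line " & lineNo & " (length " & Len(curline) & ")"
  End If
Loop
f.Close

WScript.Echo strayCount & " line(s) with a stray CR"
```

Run it with cscript so the messages go to the console instead of popping up one dialog per line.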
 

Author Comment

by:janime
ID: 18879779
I just want to point out - it doesn't matter if you're using notepad or textpad - when loading data into my database, the columns get shifted (exactly on the same spot as shown in my picture). Anyways, I'm gonna give it a try. Thanks.
 

Author Comment

by:janime
ID: 18879817
This is the output I'm getting now:

http://www.ducted.com/files/test_Apr9.zip

the very first line "Header" has the next data line attached to it. I did not check any further - basically - it doesn't work again..:/
 
LVL 22

Expert Comment

by:WMIF
ID: 18883400
its probably because the line length is set to 350.  you had suggested to increase the length further up in this thread.  is this the length that you need?
 

Author Comment

by:janime
ID: 18887401
I put  line length  < 280 then < 250 etc and  it does the same. Why don't you try? Just use that "jtest2.txt" I sent you. Thanks.
 
LVL 22

Accepted Solution

by:
WMIF earned 500 total points
ID: 18894379
i got that line to be on its own by setting the size to 220.  that is how many characters are in that line which appears to be a column header list.
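
For reference, the same joining logic can also be written with a plain loop instead of recursion, which avoids deep call chains when many short lines follow each other. This is a minimal sketch, not a tested replacement: it assumes the file paths used earlier in the thread, the 220 threshold from this answer, and a single space between joined fragments (the separator is an assumption; the scripts above place the space differently).

```vbscript
Option Explicit

Const ForReading = 1
Const MIN_LEN = 220   ' threshold from this answer; adjust for your data
Const SEP = " "       ' assumption: separator placed between joined fragments

Dim fs, fin, fout, curline
Set fs = CreateObject("Scripting.FileSystemObject")
Set fin = fs.OpenTextFile("c:\scripts\test\jtest2.txt", ForReading, False)
Set fout = fs.CreateTextFile("c:\scripts\test\test.txt", True)

Do Until fin.AtEndOfStream
  ' strip any CR that ReadLine leaves behind
  curline = Replace(fin.ReadLine, vbCr, "")
  ' keep appending the following lines until the line is long enough
  ' or the input is exhausted
  Do While Len(curline) < MIN_LEN And Not fin.AtEndOfStream
    curline = curline & SEP & Replace(fin.ReadLine, vbCr, "")
  Loop
  fout.WriteLine curline
Loop

fin.Close
fout.Close
```

Only one assembled line is held in memory at a time, so this should cope with large files. Note that an empty line in the input is still joined and contributes one extra separator.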
 

Author Comment

by:janime
ID: 18895651
Thanks WMIF!!! Now it works!
Points well deserved! Thanks a lot for your effort.
 
LVL 22

Expert Comment

by:WMIF
ID: 18897722
glad we finally got it working.  :)
