[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 611
  • Last Modified:

Batch text find & replace in pipe-delimited text file

I have a customer with a point of sale system that uses text files transferred to vendors to send data back and forth.  Long story short, we need to change the data in one column of several (like 30) text files from one number to another on every line (record) in each file.  But, it's only for the one column...the number that we're changing from may be in another column in the file and that would need to stay as is.  Oh, and the column number in question is different for each file.  i.e., for file1, it may be column1.  for file2, it may be column4.  I was looking at the following utility:
         http://stahlworks.com/dev/swiss-file-knife.html
but frankly, I got lost pretty quickly...maybe I didn't dig deep enough into the help for each different function, but there were so many functions and I didn't really know which ones I would need...

Any help would be greatly appreciated!!

Thanks,
Bob
0
breichard
Asked:
breichard
  • 11
  • 10
  • 10
2 Solutions
 
breichardAuthor Commented:
Oh, by the way, this all needs to be automated, as it needs to happen daily for all 30 files.  We already have the mechanisms for sending the files and the like.  I am comfortable with the XP scheduler, so if this can run in a batch environment on XP, that would be perfect.
Thanks!
0
 
ReneGeCommented:
Hi there. I am confused about what you need here.

-Could you please give sample data of before and after?
-How the new data in the colune to be changed is defined?
-Will it always be the same column?
-Also, are all these files in the same folder?
-Once the column is modified, should we delete the source (before) file backit up?

Cheers,
Rene
0
 
paultomasiCommented:
To change data in a specific column as you described, there needs to be something unique about the data in that column or, it would help if the columns had column-headers (I guess that's not the case here though) so that a prgram can identify the column by itself.

So, to automate the process, a program would need to distinguish the two columns from the others.

Can you attach a file and state which column needs to be changed and which column to take values from?
0
 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

 
paultomasiCommented:
ReneGe

Oops! Seems we're both on this together...


breichard

ReneGe has asked a good question. Let me expand on that quaestion. Are the same columns constant for each file? For example:

    If say, File-A requires the 1st column to be updated from the 4th column, will File-A always follow this pattern?

    Similarly, if say, File-B requires the 2nd column to be updated from the 3rd column, will File-B always follow that pattern?

    Etc...

Also, are filenames always the same for each customer?
0
 
breichardAuthor Commented:
Hi, sorry, seems I did a pretty lousy job of describing this.  Below is a single line from a single file.  (Sorry, I'm having a hard time finding a file that is generic enough to post...I will keep looking.)  
   
      03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012

They all look like this format, pipe-delimited.  In all cases, we want to change 218 to 117.  In this particular file, that's in the 6th field.  It could randomly be in other fields, though (some of these files have a lot more fields to the right).  So, all lines in this file will always have the 218 in the 6th position.  I will look through the files more and if nothing else, I will scrub one for posting.  

Thanks!
Bob
0
 
ReneGeCommented:
@Paul: Team work is good :)
0
 
ReneGeCommented:
So in other words, you could have:
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012|1124
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42013

And the 6th column, would always have 218 that must be changed to 117.

Correct?
0
 
ReneGeCommented:
And what about my other questions?
0
 
paultomasiCommented:
This will process all .TXT files in the current folder. It will replace all occurances of '|218|' with '|217|' (however, this is NOT restricted to a particular column). Best to try it on test files first to see how it goes.
@echo off
setlocal enabledelayedexpansion

for %%a in (*.txt) do (
  (for /f "tokens=* usebackq" %%b in ("%%a") do (
    set line=%%b
    echo !line:^|218^|=^|217^|!
  ))>"%%~fa.tmp"
  move /y "%%~fa.tmp" "%%~fa"
)

Open in new window

NOTE: If '|218|' is likely to occur elsewhere in a file (other than the target column) then it too will be replaced with '|217|'.

ReneGe, does this meet with your approval?
0
 
breichardAuthor Commented:
Sorry for the poor response time, had someone show up at my office here just as you all started to reply.  Am working on replying to your questions now, will post shortly...
0
 
breichardAuthor Commented:
Ok, I just got a call from this customer and he got a call from his vendor.  The vendor found a way to deal with this on his side and we don't need to proceed.  So thank you both, and I apologize that we didn't really get to see this through.

Thanks again,
Bob
0
 
ReneGeCommented:
@Paul:
You don't need my approval :)

About your footer note, may I suggest a different approach?

@echo off
setlocal enabledelayedexpansion
set folder=C:\data
for /f "tokens=* usebackq" %%A in (%folder%\*.txt) do (
   (for /f "tokens=1,2,3,4,5,6* delims=^|" %%a in ("%%A") do (
      echo ^|%%a^|%%b^|%%c^|%%d^|%%e^|%%f^|%%g))>"%%~fa.tmp"
   move /y "%%~fa.tmp" "%%~fa"
)

Do you think that would work?

Cheers,
Rene
0
 
ReneGeCommented:
@breichard
And what if we finnish it just for the fun of it?
0
 
paultomasiCommented:
ReneGe, I don't think there's any point continuing with this... Good try though.
0
 
paultomasiCommented:
breichard

Any chance of points for the working solution above? I did deliver before the job was canceled... I don't mind sharing some with ReneGe
0
 
ReneGeCommented:
@breichard:
Paul helps me a lot, so I don't you giving all the points to him!

Cheers,
Rene
0
 
breichardAuthor Commented:
Rene, Paul, I would absolutely love to finish this just for the academic side of it.  The down side is I'm out of time today and I'm travelling tomorrow, so I won't be available until Thursday.  (There is also a real chance that this customer's vendor will come back and say "oops, mine doesn't work" so it may make sense to do this anyway.  But, I'd just like to know how.)  If both or either of you is amenable to helping me work this on Thursday or beyond, I'd love to do so.

As for points, I'm happy to award however you two decide...even if we do not proceed to solution.

Thanks,
Bob
0
 
paultomasiCommented:
Bob

Thank you for your frank reply. I appreciate you're on a timescale.

I say leave this question open for now and come back to us when you are sure how you want to proceed. I'm certain ReneGe will agree this course of action.
0
 
ReneGeCommented:
It's all fine with me!!

It was short, but fun :)

Cheers,
Rene
0
 
ReneGeCommented:
For what it's worth...

@ECHO OFF
SETLOCAL EnableDelayedExpansion
SET folder=C:\data
FOR /f "tokens=* usebackq" %%A in (%folder%\*.txt) DO (
   (FOR /f "tokens=1,2,3,4,5,6* delims=^|" %%a in ("%%A") DO (
      SET Nb=218
      IF %%f == 218 SET Nb=117
      ECHO ^|%%a^|%%b^|%%c^|%%d^|%%e^|!Nb!^|%%g))>"%%~fa.tmp"
      MOVE /y "%%~fa.tmp" "%%~fa"
)
0
 
breichardAuthor Commented:
Paul, Rene, thanks for your efforts, but I'm afraid my schedule is not going to allow me to see this through any time soon.<br /><br />Thanks,<br />Bob
0
 
paultomasiCommented:
IMPORTANT NOTE

I have just realised a data error in my code http:#37774004.

I have been changing the target data from '218' to '217' instead of '117'.

Line 7 of my code above:

    echo !line:^|218^|=^|217^|!

should be changed to:

    echo !line:^|218^|=^|117^|!

It will then work correctly for every file WITHOUT the need to manually check which column to change and WITHOUT the need to amend the batch file code accordingly each time. The batch file is designed to be fully automatic.

The error is fixed in the following listing:
@echo off
setlocal enabledelayedexpansion

for %%a in (*.txt) do (
  (for /f "tokens=* usebackq" %%b in ("%%a") do (
    set line=%%b
    echo !line:^|218^|=^|117^|!
  ))>"%%~fa.tmp"
  move /y "%%~fa.tmp" "%%~fa"
)

Open in new window

I suspect this may be the reason why the solution I gave earlier did not meet the asker's needs.

I am annoyed I did not identify this error much sooner. Apologies for the inconvenience.
0
 
breichardAuthor Commented:
Paul, the only issue here would have been that I would have need to specify which column to replace the data in per file.  i.e., file1 would have needed the data in column4 replaced (only) while file2 would have needed data in column7 replaced (only), etc.

Actually, the other issue would have been that the file names all start with the same name but end in different extensions.  But, I never told you that, as I was hoping to control this all file by file and not do a whole directory at a time...and I basically had zero time to deal with this at the time I got tasked to do so.  That's why I never really gave you full information to work with.

Anyway, thanks again for your efforts.
Bob
0
 
paultomasiCommented:
breichard

Thank you for replying.

My understanding is:

    1) Only one column in each file needs to be changed - the 'target' column.

    2) The target column differs from file to file.

    3) The target column consists of one or more rows containing '218' - the 'target' data.

    4) The target data needs to be replaced with '117'.

    5) There might be other occurances of '218' anywhere else in the file. These should not be altered.

The whole process should be automated:

    1) Process each file

    2) Identify and locate the target column

    3) Amend the target data
0
 
breichardAuthor Commented:
Hi Paul.  I think that pretty much wraps it up.  I would modify the automation part a bit.  I would prefer to control the automation myself.  i.e., I would like to have a script file that would accept the following parameters: filename to process, column target data is in, target data itself, data to replace target data with.

I realize the last two parameters are probably over the top a bit, but if we're designing something, we may as well design it to make it functional for more than just one use, ey?  Heck, maybe a last parameter should also be the delimeter character used in the file.

Your thoughts?  If you feel like actually working on this, then I will open up a new question if you want, and move this over there...?

Thanks,
Bob
0
 
breichardAuthor Commented:
PS, one point of clarification.  I would run the above script over and over via another script; calling it with different sets of parameters.  I'm not sure if that was clear.  That's what I meant by controlling the automation part myself...
0
 
ReneGeCommented:
@breichard

Paul is working hard here and I feel he is bringing a valued contribution.

Feel free to reopen this question so points are shared as they should be.

Cheers,
Rene

PS: Sorry for all my syntax errors (I'm sure I have). I need to sleep... :)
0
 
breichardAuthor Commented:
@ReneGe

*Can* I re-open a question?  Or did you mean open a new one?

Thanks!
Bob
0
 
paultomasiCommented:
ReneGe
No need to re-open the question. My concern is not the points.

breichard
I missed your previous comment. This sheds a whole new light on the matter.

I will respond later (I have to go out for a while).
0
 
paultomasiCommented:
breichard

No need to open a new question.

Where you state:
...parameters: filename to process, column target data is in, target data itself, data to replace target data with.
Are you certain you need to pass 'target data itself'? If you are passing 'column target data is in' then there would be no real need to pass this unless there is mixed data in the target column.


Also, where you state:
...the delimeter character used in the file.
Are you anticipating the "|" ('pipe' symbol) may not always be the delimiting character in every file? Or is there some other reason why you might want to pass this as a parameter?

Please clarify.
0
 
breichardAuthor Commented:
Paul,

No *need* to pass 'target data itself'.  Like I said, I was just thinking that if you're going to write this, you may as well make it as fliexible as possible so that it could be used in other scenarios.  The same thing applies to the delimeter character used.  My thought was simply to make it flexible.  It if makes the script that much harder to write, then it's not a necessary addition.  (Of course, none of this is strictly necessary at this point...you understand that, right?)

Thanks,
Bob
0

Featured Post

Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

  • 11
  • 10
  • 10
Tackle projects and never again get stuck behind a technical roadblock.
Join Now