Link to home
Start Free TrialLog in
Avatar of breichard
breichard

asked on

Batch text find & replace in pipe-delimited text file

I have a customer with a point of sale system that uses text files transferred to vendors to send data back and forth.  Long story short, we need to change the data in one column of several (like 30) text files from one number to another on every line (record) in each file.  But, it's only for the one column...the number that we're changing from may be in another column in the file and that would need to stay as is.  Oh, and the column number in question is different for each file.  i.e., for file1, it may be column1.  for file2, it may be column4.  I was looking at the following utility:
         http://stahlworks.com/dev/swiss-file-knife.html
but frankly, I got lost pretty quickly...maybe I didn't dig deep enough into the help for each different function, but there were so many functions and I didn't really know which ones I would need...

Any help would be greatly appreciated!!

Thanks,
Bob
Avatar of breichard
breichard

ASKER

Oh, by the way, this all needs to be automated, as it needs to happen daily for all 30 files.  We already have the mechanisms for sending the files and the like.  I am comfortable with the XP scheduler, so if this can run in a batch environment on XP, that would be perfect.
Thanks!
Hi there. I am confused about what you need here.

-Could you please give sample data of before and after?
-How the new data in the colune to be changed is defined?
-Will it always be the same column?
-Also, are all these files in the same folder?
-Once the column is modified, should we delete the source (before) file backit up?

Cheers,
Rene
To change data in a specific column as you described, there needs to be something unique about the data in that column or, it would help if the columns had column-headers (I guess that's not the case here though) so that a prgram can identify the column by itself.

So, to automate the process, a program would need to distinguish the two columns from the others.

Can you attach a file and state which column needs to be changed and which column to take values from?
ReneGe

Oops! Seems we're both on this together...


breichard

ReneGe has asked a good question. Let me expand on that quaestion. Are the same columns constant for each file? For example:

    If say, File-A requires the 1st column to be updated from the 4th column, will File-A always follow this pattern?

    Similarly, if say, File-B requires the 2nd column to be updated from the 3rd column, will File-B always follow that pattern?

    Etc...

Also, are filenames always the same for each customer?
Hi, sorry, seems I did a pretty lousy job of describing this.  Below is a single line from a single file.  (Sorry, I'm having a hard time finding a file that is generic enough to post...I will keep looking.)  
   
      03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012

They all look like this format, pipe-delimited.  In all cases, we want to change 218 to 117.  In this particular file, that's in the 6th field.  It could randomly be in other fields, though (some of these files have a lot more fields to the right).  So, all lines in this file will always have the 218 in the 6th position.  I will look through the files more and if nothing else, I will scrub one for posting.  

Thanks!
Bob
@Paul: Team work is good :)
So in other words, you could have:
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012|1124
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42013

And the 6th column, would always have 218 that must be changed to 117.

Correct?
And what about my other questions?
SOLUTION
Avatar of Paul Tomasi
Paul Tomasi
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Sorry for the poor response time, had someone show up at my office here just as you all started to reply.  Am working on replying to your questions now, will post shortly...
Ok, I just got a call from this customer and he got a call from his vendor.  The vendor found a way to deal with this on his side and we don't need to proceed.  So thank you both, and I apologize that we didn't really get to see this through.

Thanks again,
Bob
@Paul:
You don't need my approval :)

About your footer note, may I suggest a different approach?

@echo off
setlocal enabledelayedexpansion
set folder=C:\data
for /f "tokens=* usebackq" %%A in (%folder%\*.txt) do (
   (for /f "tokens=1,2,3,4,5,6* delims=^|" %%a in ("%%A") do (
      echo ^|%%a^|%%b^|%%c^|%%d^|%%e^|%%f^|%%g))>"%%~fa.tmp"
   move /y "%%~fa.tmp" "%%~fa"
)

Do you think that would work?

Cheers,
Rene
@breichard
And what if we finnish it just for the fun of it?
ReneGe, I don't think there's any point continuing with this... Good try though.
breichard

Any chance of points for the working solution above? I did deliver before the job was canceled... I don't mind sharing some with ReneGe
@breichard:
Paul helps me a lot, so I don't you giving all the points to him!

Cheers,
Rene
Rene, Paul, I would absolutely love to finish this just for the academic side of it.  The down side is I'm out of time today and I'm travelling tomorrow, so I won't be available until Thursday.  (There is also a real chance that this customer's vendor will come back and say "oops, mine doesn't work" so it may make sense to do this anyway.  But, I'd just like to know how.)  If both or either of you is amenable to helping me work this on Thursday or beyond, I'd love to do so.

As for points, I'm happy to award however you two decide...even if we do not proceed to solution.

Thanks,
Bob
Bob

Thank you for your frank reply. I appreciate you're on a timescale.

I say leave this question open for now and come back to us when you are sure how you want to proceed. I'm certain ReneGe will agree this course of action.
It's all fine with me!!

It was short, but fun :)

Cheers,
Rene
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Paul, Rene, thanks for your efforts, but I'm afraid my schedule is not going to allow me to see this through any time soon.<br /><br />Thanks,<br />Bob
IMPORTANT NOTE

I have just realised a data error in my code http:#37774004.

I have been changing the target data from '218' to '217' instead of '117'.

Line 7 of my code above:

    echo !line:^|218^|=^|217^|!

should be changed to:

    echo !line:^|218^|=^|117^|!

It will then work correctly for every file WITHOUT the need to manually check which column to change and WITHOUT the need to amend the batch file code accordingly each time. The batch file is designed to be fully automatic.

The error is fixed in the following listing:
@echo off
setlocal enabledelayedexpansion

for %%a in (*.txt) do (
  (for /f "tokens=* usebackq" %%b in ("%%a") do (
    set line=%%b
    echo !line:^|218^|=^|117^|!
  ))>"%%~fa.tmp"
  move /y "%%~fa.tmp" "%%~fa"
)

Open in new window

I suspect this may be the reason why the solution I gave earlier did not meet the asker's needs.

I am annoyed I did not identify this error much sooner. Apologies for the inconvenience.
Paul, the only issue here would have been that I would have need to specify which column to replace the data in per file.  i.e., file1 would have needed the data in column4 replaced (only) while file2 would have needed data in column7 replaced (only), etc.

Actually, the other issue would have been that the file names all start with the same name but end in different extensions.  But, I never told you that, as I was hoping to control this all file by file and not do a whole directory at a time...and I basically had zero time to deal with this at the time I got tasked to do so.  That's why I never really gave you full information to work with.

Anyway, thanks again for your efforts.
Bob
breichard

Thank you for replying.

My understanding is:

    1) Only one column in each file needs to be changed - the 'target' column.

    2) The target column differs from file to file.

    3) The target column consists of one or more rows containing '218' - the 'target' data.

    4) The target data needs to be replaced with '117'.

    5) There might be other occurances of '218' anywhere else in the file. These should not be altered.

The whole process should be automated:

    1) Process each file

    2) Identify and locate the target column

    3) Amend the target data
Hi Paul.  I think that pretty much wraps it up.  I would modify the automation part a bit.  I would prefer to control the automation myself.  i.e., I would like to have a script file that would accept the following parameters: filename to process, column target data is in, target data itself, data to replace target data with.

I realize the last two parameters are probably over the top a bit, but if we're designing something, we may as well design it to make it functional for more than just one use, ey?  Heck, maybe a last parameter should also be the delimeter character used in the file.

Your thoughts?  If you feel like actually working on this, then I will open up a new question if you want, and move this over there...?

Thanks,
Bob
PS, one point of clarification.  I would run the above script over and over via another script; calling it with different sets of parameters.  I'm not sure if that was clear.  That's what I meant by controlling the automation part myself...
@breichard

Paul is working hard here and I feel he is bringing a valued contribution.

Feel free to reopen this question so points are shared as they should be.

Cheers,
Rene

PS: Sorry for all my syntax errors (I'm sure I have). I need to sleep... :)
@ReneGe

*Can* I re-open a question?  Or did you mean open a new one?

Thanks!
Bob
ReneGe
No need to re-open the question. My concern is not the points.

breichard
I missed your previous comment. This sheds a whole new light on the matter.

I will respond later (I have to go out for a while).
breichard

No need to open a new question.

Where you state:
...parameters: filename to process, column target data is in, target data itself, data to replace target data with.
Are you certain you need to pass 'target data itself'? If you are passing 'column target data is in' then there would be no real need to pass this unless there is mixed data in the target column.


Also, where you state:
...the delimeter character used in the file.
Are you anticipating the "|" ('pipe' symbol) may not always be the delimiting character in every file? Or is there some other reason why you might want to pass this as a parameter?

Please clarify.
Paul,

No *need* to pass 'target data itself'.  Like I said, I was just thinking that if you're going to write this, you may as well make it as fliexible as possible so that it could be used in other scenarios.  The same thing applies to the delimeter character used.  My thought was simply to make it flexible.  It if makes the script that much harder to write, then it's not a necessary addition.  (Of course, none of this is strictly necessary at this point...you understand that, right?)

Thanks,
Bob
Hi ,

I have a similar requirement wherein I have to replace a value at 10th position of every line in the pipe delimited file. I used the solution provided by  ReneGe. I have one problem though.. my file has null values at certain places because of which the code doesnt work properly. Could you please help me out?
Hi Rini,

Please create a related question and share its link here.

https://www.experts-exchange.com/askQuestion.jsp?linkedType=10&linkedId=27651028

Cheers