Solved

Batch text find & replace in pipe-delimited text file

Posted on 2012-03-27
31
553 Views
Last Modified: 2012-04-07
I have a customer with a point of sale system that uses text files transferred to vendors to send data back and forth.  Long story short, we need to change the data in one column of several (like 30) text files from one number to another on every line (record) in each file.  But, it's only for the one column...the number that we're changing from may be in another column in the file and that would need to stay as is.  Oh, and the column number in question is different for each file.  i.e., for file1, it may be column1.  for file2, it may be column4.  I was looking at the following utility:
         http://stahlworks.com/dev/swiss-file-knife.html
but frankly, I got lost pretty quickly...maybe I didn't dig deep enough into the help for each different function, but there were so many functions and I didn't really know which ones I would need...

Any help would be greatly appreciated!!

Thanks,
Bob
0
Comment
Question by:breichard
  • 11
  • 10
  • 10
31 Comments
 

Author Comment

by:breichard
ID: 37773603
Oh, by the way, this all needs to be automated, as it needs to happen daily for all 30 files.  We already have the mechanisms for sending the files and the like.  I am comfortable with the XP scheduler, so if this can run in a batch environment on XP, that would be perfect.
Thanks!
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37773684
Hi there. I am confused about what you need here.

-Could you please give sample data of before and after?
-How the new data in the colune to be changed is defined?
-Will it always be the same column?
-Also, are all these files in the same folder?
-Once the column is modified, should we delete the source (before) file backit up?

Cheers,
Rene
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37773690
To change data in a specific column as you described, there needs to be something unique about the data in that column or, it would help if the columns had column-headers (I guess that's not the case here though) so that a prgram can identify the column by itself.

So, to automate the process, a program would need to distinguish the two columns from the others.

Can you attach a file and state which column needs to be changed and which column to take values from?
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37773753
ReneGe

Oops! Seems we're both on this together...


breichard

ReneGe has asked a good question. Let me expand on that quaestion. Are the same columns constant for each file? For example:

    If say, File-A requires the 1st column to be updated from the 4th column, will File-A always follow this pattern?

    Similarly, if say, File-B requires the 2nd column to be updated from the 3rd column, will File-B always follow that pattern?

    Etc...

Also, are filenames always the same for each customer?
0
 

Author Comment

by:breichard
ID: 37773777
Hi, sorry, seems I did a pretty lousy job of describing this.  Below is a single line from a single file.  (Sorry, I'm having a hard time finding a file that is generic enough to post...I will keep looking.)  
   
      03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012

They all look like this format, pipe-delimited.  In all cases, we want to change 218 to 117.  In this particular file, that's in the 6th field.  It could randomly be in other fields, though (some of these files have a lot more fields to the right).  So, all lines in this file will always have the 218 in the 6th position.  I will look through the files more and if nothing else, I will scrub one for posting.  

Thanks!
Bob
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37773788
@Paul: Team work is good :)
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37773816
So in other words, you could have:
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42012|132012|1124
03/20/2012|3|13|1|4|218|2012|1|04:00:00 03/20/2012|42013

And the 6th column, would always have 218 that must be changed to 117.

Correct?
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37773822
And what about my other questions?
0
 
LVL 11

Assisted Solution

by:paultomasi
paultomasi earned 250 total points
ID: 37774004
This will process all .TXT files in the current folder. It will replace all occurances of '|218|' with '|217|' (however, this is NOT restricted to a particular column). Best to try it on test files first to see how it goes.
@echo off
setlocal enabledelayedexpansion

for %%a in (*.txt) do (
  (for /f "tokens=* usebackq" %%b in ("%%a") do (
    set line=%%b
    echo !line:^|218^|=^|217^|!
  ))>"%%~fa.tmp"
  move /y "%%~fa.tmp" "%%~fa"
)

Open in new window

NOTE: If '|218|' is likely to occur elsewhere in a file (other than the target column) then it too will be replaced with '|217|'.

ReneGe, does this meet with your approval?
0
 

Author Comment

by:breichard
ID: 37774083
Sorry for the poor response time, had someone show up at my office here just as you all started to reply.  Am working on replying to your questions now, will post shortly...
0
 

Author Comment

by:breichard
ID: 37774123
Ok, I just got a call from this customer and he got a call from his vendor.  The vendor found a way to deal with this on his side and we don't need to proceed.  So thank you both, and I apologize that we didn't really get to see this through.

Thanks again,
Bob
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37774169
@Paul:
You don't need my approval :)

About your footer note, may I suggest a different approach?

@echo off
setlocal enabledelayedexpansion
set folder=C:\data
for /f "tokens=* usebackq" %%A in (%folder%\*.txt) do (
   (for /f "tokens=1,2,3,4,5,6* delims=^|" %%a in ("%%A") do (
      echo ^|%%a^|%%b^|%%c^|%%d^|%%e^|%%f^|%%g))>"%%~fa.tmp"
   move /y "%%~fa.tmp" "%%~fa"
)

Do you think that would work?

Cheers,
Rene
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37774174
@breichard
And what if we finnish it just for the fun of it?
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37774346
ReneGe, I don't think there's any point continuing with this... Good try though.
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37774361
breichard

Any chance of points for the working solution above? I did deliver before the job was canceled... I don't mind sharing some with ReneGe
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 
LVL 10

Expert Comment

by:ReneGe
ID: 37774383
@breichard:
Paul helps me a lot, so I don't you giving all the points to him!

Cheers,
Rene
0
 

Author Comment

by:breichard
ID: 37774472
Rene, Paul, I would absolutely love to finish this just for the academic side of it.  The down side is I'm out of time today and I'm travelling tomorrow, so I won't be available until Thursday.  (There is also a real chance that this customer's vendor will come back and say "oops, mine doesn't work" so it may make sense to do this anyway.  But, I'd just like to know how.)  If both or either of you is amenable to helping me work this on Thursday or beyond, I'd love to do so.

As for points, I'm happy to award however you two decide...even if we do not proceed to solution.

Thanks,
Bob
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37774492
Bob

Thank you for your frank reply. I appreciate you're on a timescale.

I say leave this question open for now and come back to us when you are sure how you want to proceed. I'm certain ReneGe will agree this course of action.
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37774518
It's all fine with me!!

It was short, but fun :)

Cheers,
Rene
0
 
LVL 10

Accepted Solution

by:
ReneGe earned 250 total points
ID: 37774776
For what it's worth...

@ECHO OFF
SETLOCAL EnableDelayedExpansion
SET folder=C:\data
FOR /f "tokens=* usebackq" %%A in (%folder%\*.txt) DO (
   (FOR /f "tokens=1,2,3,4,5,6* delims=^|" %%a in ("%%A") DO (
      SET Nb=218
      IF %%f == 218 SET Nb=117
      ECHO ^|%%a^|%%b^|%%c^|%%d^|%%e^|!Nb!^|%%g))>"%%~fa.tmp"
      MOVE /y "%%~fa.tmp" "%%~fa"
)
0
 

Author Closing Comment

by:breichard
ID: 37792313
Paul, Rene, thanks for your efforts, but I'm afraid my schedule is not going to allow me to see this through any time soon.<br /><br />Thanks,<br />Bob
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37792459
IMPORTANT NOTE

I have just realised a data error in my code http:#37774004.

I have been changing the target data from '218' to '217' instead of '117'.

Line 7 of my code above:

    echo !line:^|218^|=^|217^|!

should be changed to:

    echo !line:^|218^|=^|117^|!

It will then work correctly for every file WITHOUT the need to manually check which column to change and WITHOUT the need to amend the batch file code accordingly each time. The batch file is designed to be fully automatic.

The error is fixed in the following listing:
@echo off
setlocal enabledelayedexpansion

for %%a in (*.txt) do (
  (for /f "tokens=* usebackq" %%b in ("%%a") do (
    set line=%%b
    echo !line:^|218^|=^|117^|!
  ))>"%%~fa.tmp"
  move /y "%%~fa.tmp" "%%~fa"
)

Open in new window

I suspect this may be the reason why the solution I gave earlier did not meet the asker's needs.

I am annoyed I did not identify this error much sooner. Apologies for the inconvenience.
0
 

Author Comment

by:breichard
ID: 37793225
Paul, the only issue here would have been that I would have need to specify which column to replace the data in per file.  i.e., file1 would have needed the data in column4 replaced (only) while file2 would have needed data in column7 replaced (only), etc.

Actually, the other issue would have been that the file names all start with the same name but end in different extensions.  But, I never told you that, as I was hoping to control this all file by file and not do a whole directory at a time...and I basically had zero time to deal with this at the time I got tasked to do so.  That's why I never really gave you full information to work with.

Anyway, thanks again for your efforts.
Bob
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37814114
breichard

Thank you for replying.

My understanding is:

    1) Only one column in each file needs to be changed - the 'target' column.

    2) The target column differs from file to file.

    3) The target column consists of one or more rows containing '218' - the 'target' data.

    4) The target data needs to be replaced with '117'.

    5) There might be other occurances of '218' anywhere else in the file. These should not be altered.

The whole process should be automated:

    1) Process each file

    2) Identify and locate the target column

    3) Amend the target data
0
 

Author Comment

by:breichard
ID: 37814323
Hi Paul.  I think that pretty much wraps it up.  I would modify the automation part a bit.  I would prefer to control the automation myself.  i.e., I would like to have a script file that would accept the following parameters: filename to process, column target data is in, target data itself, data to replace target data with.

I realize the last two parameters are probably over the top a bit, but if we're designing something, we may as well design it to make it functional for more than just one use, ey?  Heck, maybe a last parameter should also be the delimeter character used in the file.

Your thoughts?  If you feel like actually working on this, then I will open up a new question if you want, and move this over there...?

Thanks,
Bob
0
 

Author Comment

by:breichard
ID: 37814326
PS, one point of clarification.  I would run the above script over and over via another script; calling it with different sets of parameters.  I'm not sure if that was clear.  That's what I meant by controlling the automation part myself...
0
 
LVL 10

Expert Comment

by:ReneGe
ID: 37814385
@breichard

Paul is working hard here and I feel he is bringing a valued contribution.

Feel free to reopen this question so points are shared as they should be.

Cheers,
Rene

PS: Sorry for all my syntax errors (I'm sure I have). I need to sleep... :)
0
 

Author Comment

by:breichard
ID: 37816023
@ReneGe

*Can* I re-open a question?  Or did you mean open a new one?

Thanks!
Bob
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37816122
ReneGe
No need to re-open the question. My concern is not the points.

breichard
I missed your previous comment. This sheds a whole new light on the matter.

I will respond later (I have to go out for a while).
0
 
LVL 11

Expert Comment

by:paultomasi
ID: 37819070
breichard

No need to open a new question.

Where you state:
...parameters: filename to process, column target data is in, target data itself, data to replace target data with.
Are you certain you need to pass 'target data itself'? If you are passing 'column target data is in' then there would be no real need to pass this unless there is mixed data in the target column.


Also, where you state:
...the delimeter character used in the file.
Are you anticipating the "|" ('pipe' symbol) may not always be the delimiting character in every file? Or is there some other reason why you might want to pass this as a parameter?

Please clarify.
0
 

Author Comment

by:breichard
ID: 37819495
Paul,

No *need* to pass 'target data itself'.  Like I said, I was just thinking that if you're going to write this, you may as well make it as fliexible as possible so that it could be used in other scenarios.  The same thing applies to the delimeter character used.  My thought was simply to make it flexible.  It if makes the script that much harder to write, then it's not a necessary addition.  (Of course, none of this is strictly necessary at this point...you understand that, right?)

Thanks,
Bob
0

Featured Post

Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

Suggested Solutions

The following is a collection of cases for strange behaviour when using advanced techniques in DOS batch files. You should have some basic experience in batch "programming", as I'm assuming some knowledge and not further explain the basics. For some…
Introduction: Recently, I got a requirement to zip all files individually with batch file script in Windows OS. I don't know much about scripting, but I searched Google and found a lot of examples and websites to complete my task. Finally, I was ab…
Internet Business Fax to Email Made Easy - With eFax Corporate (http://www.enterprise.efax.com), you'll receive a dedicated online fax number, which is used the same way as a typical analog fax number. You'll receive secure faxes in your email, fr…
Illustrator's Shape Builder tool will let you combine shapes visually and interactively. This video shows the Mac version, but the tool works the same way in Windows. To follow along with this video, you can draw your own shapes or download the file…

747 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now