wiestassoc

Unix script to break up file

Need help with a Unix script.

1) Need to split a file by fixed text and use part of the data to name each file.

Source file A

$$$$|DEF|ADDD
HDR|12345678|444|rhrhrh|hghghghg
LINE|33333|444444|ghththg
LINE|THHE|rrr|5555
LINE|TEHEHE|5555|urjurjr
HDR|234567890|444|rhrhrh|hghghghg
LINE|33333|444444|ghththg
LINE|THHE|rrr|5555
LINE|TEHEHE|5555|urjurjr

2) $$$$ starts new file
3) 2nd field in HDR is part of file name

Here is the expected output:

File B  name TESTTfile_12345678_20180130.txt

HDR|12345678|444|rhrhrh|hghghghg
LINE|33333|444444|ghththg
LINE|THHE|rrr|5555
LINE|TEHEHE|5555|urjurjr

File C  name TESTTfile_234567890_20180130.txt

HDR|234567890|444|rhrhrh|hghghghg
LINE|33333|444444|ghththg
LINE|THHE|rrr|5555
LINE|TEHEHE|5555|urjurjr

Then the original file is saved to /archive.

thanks!
ASKER CERTIFIED SOLUTION
Bill Prew

wiestassoc

ASKER

Gents, thanks so much. I will give them a try.
There has been a change to the requirements.

Can anyone adjust a solution to do the following?

1) Replace any "&" in the file with "H"
2) Replace any "#" in the file with "I" (letter I)
3) Break the file at each $$$$$$$$ line
4) Use the second field of the $$$$$$$$ line as part of the file name
5) Remove the $$$$$$$$ line from the new file
6) Place each new file into a new directory

Parameters

$INDIR = /data/out/
$OUTDIR = /data/newout


Input file:

Name sad.123.sad

$$$$$$$$|I231_0081788682|
HEADER|INV|20180224|20180224||0004165036|0004165036|0081788682|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788682|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788682|20|180182C02|20180118
&EADER|DELV|20180224|20180224||0004165036|0004165036|0081788682|||||||||||
#TEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180211C01|20180121
#TEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180503C02|20180219
#TEM|900001|20|87QS5930150|1381|9999|200|CS|0081788682|20|180181C01|20180118
#TEM|900002|20|87QS5930150|1381|9999|100|CS|0081788682|20|180182C02|20180118
$$$$$$$$|I231_0081788684|
HEADER|INV|20180224|20180224||0004165036|0004165036|0081788684|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788684|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788684|20|180182C02|20180118
&EADER|DELV|20180224|20180224||0004165036|0004165036|0081788684|||||||||||
#TEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180211C01|20180121
#TEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180503C02|20180219
#TEM|900001|20|87QS5930150|1381|9999|200|CS|0081788684|20|180181C01|20180118
#TEM|900002|20|87QS5930150|1381|9999|100|CS|0081788684|20|180182C02|20180118
$$$$$$$$|I266_0081788699|
HEADER|INV|20180224|20180224||0004165036|0004165036|0081788699|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788699|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788699|20|180182C02|20180118
&EADER|DELV|20180224|20180224||0004165036|0004165036|0081788699|||||||||||
#TEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180211C01|20180121
#TEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180503C02|20180219
#TEM|900001|20|87QS5930150|1381|9999|200|CS|0081788699|20|180181C01|20180118
#TEM|900002|20|87QS5930150|1381|9999|100|CS|0081788699|20|180182C02|20180118

Break into files (the sample above yields 3):

1) File should be named: I231_0081788682_02252018_051500.txt

HEADER|INV|20180224|20180224||0004165036|0004165036|0081788682|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788682|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788682|20|180182C02|20180118
HEADER|DELV|20180224|20180224||0004165036|0004165036|0081788682|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788682|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788682|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788682|20|180182C02|20180118

2) File should be named: I231_0081788684_02252018_051500.txt

HEADER|INV|20180224|20180224||0004165036|0004165036|0081788684|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788684|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788684|20|180182C02|20180118
HEADER|DELV|20180224|20180224||0004165036|0004165036|0081788684|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788684|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788684|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788684|20|180182C02|20180118

3) File should be named: I266_0081788699_02252018_051500.txt

HEADER|INV|20180224|20180224||0004165036|0004165036|0081788699|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788699|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788699|20|180182C02|20180118
HEADER|DELV|20180224|20180224||0004165036|0004165036|0081788699|||||||||||
ITEM|900001|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180211C01|20180121
ITEM|900002|10|4L7S6101367|1381|9999|100|CS|0081788699|10|180503C02|20180219
ITEM|900001|20|87QS5930150|1381|9999|200|CS|0081788699|20|180181C01|20180118
ITEM|900002|20|87QS5930150|1381|9999|100|CS|0081788699|20|180182C02|20180118
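
The six requirements above could be sketched roughly like this with bash and awk. This is only an illustration, not one of the posted solutions: the function name split_by_marker and its two arguments are made up for the example, and the timestamp format (MMDDYYYY_HHMMSS) is a guess based on the sample name I231_0081788682_02252018_051500.txt.

```shell
#!/bin/bash
# Sketch only: split_by_marker, its arguments, and the timestamp
# format are assumptions for illustration.

split_by_marker() {
    local infile=$1 outdir=$2
    local stamp
    stamp=$(date +%m%d%Y_%H%M%S)    # assumed format, e.g. 02252018_051500
    mkdir -p "$outdir" || return 1

    awk -F'|' -v outdir="$outdir" -v stamp="$stamp" '
        /^\$\$\$\$\$\$\$\$\|/ {
            # Marker line: close the previous output file and start a new
            # one named after the second field. The marker is not written.
            if (out != "") close(out)
            out = outdir "/" $2 "_" stamp ".txt"
            next
        }
        out != "" {
            gsub(/&/, "H")    # requirement 1
            gsub(/#/, "I")    # requirement 2
            print > out
        }
    ' "$infile"
}
```

With the parameters given above it would be called as split_by_marker "$INDIR/sad.123.sad" "$OUTDIR". Lines that appear before the first $$$$$$$$ marker are dropped, and if the same second field showed up twice in one input file, the later section would overwrite the earlier one.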
Weeks later...

Hi wiestassoc,
I see you've now opened a new question for this new requirement.  That was the right thing to do, because it's quite different to the original requirement, which has already been answered.

Going back to your original request, here's a shell script which uses Perl to do the main processing, and is very similar to Abhimanyu's beautifully concise shell/awk answer:

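#!/bin/bash
# Today's date (YYYYMMDD), exported so the Perl one-liner can read it via %ENV
export DATE=$(date +%Y%m%d)
# On each HDR line, redirect output to a file named after the second field
perl -pe 'open STDOUT, ">>TESTFILE_$1_$ENV{DATE}.txt" if /^HDR\|(.+?)\|/' "$@"
# Archive the original input file(s); quoting "$@" keeps names with spaces intact
mv "$@" /archive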

If you put that in a script named script1.sh, and give it execute permission, you could run it like this:
./script1.sh input_file(s)
As you can see, it can process more than one file per execution.  For example, if you want to process all files starting with "A", you could do this:
./script1.sh A*
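
For comparison, here is a rough awk-based sketch of the same split for the original requirement. It is not the awk answer mentioned earlier (that one isn't shown here), just an illustration. The function name split_by_hdr and the archive-directory argument are made up for the example. Unlike the Perl one-liner, it silently skips any lines before the first HDR (such as the $$$$ line) rather than printing them to the terminal, and it only archives a file if awk succeeded.

```shell
#!/bin/bash
# Sketch only: split_by_hdr and its archive-directory argument are
# assumptions for illustration.

split_by_hdr() {
    local archive=$1; shift
    local today f
    today=$(date +%Y%m%d)
    mkdir -p "$archive" || return 1

    for f in "$@"; do
        awk -F'|' -v today="$today" '
            /^HDR\|/ {
                # Each HDR line starts a new output file named after field 2.
                if (out != "") close(out)
                out = "TESTFILE_" $2 "_" today ".txt"
            }
            out != "" { print >> out }    # lines before the first HDR are skipped
        ' "$f" && mv -- "$f" "$archive"/
    done
}
```

It could be called as split_by_hdr /archive A* and, like the script above, writes the TESTFILE_ files into the current directory, appending if a file of that name already exists.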