AWK script to count and then replace

Posted on 2008-11-10
Last Modified: 2012-05-05
Hi All,

I am looking for a simple awk script that can count the segment delimiters in a file and then write that count into the UNT segment.

For example, here is a sample file:

UNA:+.?*'
UNB+UNOB:4+0050:ZZZ+DBH+20081105:1803+0811051803'
UNG+IFTMIN+0050+3995+20081105:1803+0811051803+UN+D:99AEBL02'
UNH+0811051803+IFTMIN:99A:UNEBL02'
BGM+705+C00005738+5'
CTA+MS+:EnterpriseBatchProcessor'
DTM+137:200811031729:203'
FTX+AEA++9'
CNT+11:22:PCE'
CNT+7:22500:KGM'
CNT+15:0.053:MTQ'
DOC+710++EI+2+0'
LOC+57+DEHAM:::Hamburg'
RFF+FF:C00005738'
TDT+20+125+1+13:OCEAN VESSEL+00009999:172+++:::COSCO LONG BEACH'
DTM+133:20081105:102'
LOC+9+DEHAM:::Hamburg'
LOC+12+SGSIN:::Singapore'
NAD+CN+00004225+11 JOO KOON CIRCLE:JURONG:11 JOO KOON CIRCLE:SINGAPORE+ALFA LAVAL SINGAPORE PTE LTD++++629043'
NAD+CZ+00007306+ALTENAER STRASSE 72-76-58675 HEMER:ALTENAER STRASSE 72-76-58675 HEMER:HH+MATRIX VERTRIEBSSERVICE GMBH'
CTA+IC+:EnterpriseBatchProcessor'
NAD+N1+00012726+Amselstrasse::Amselstrasse:Hamburg:HH+Hanjin Shipping Co. Ltd. Schiffsmak++++20457'
GID+1+22:PA:::PACKET'
FTX+AAA+++MACHINERY PARTS'
MEA+WT+AAE+KGM:22500'
MEA+VOL+AAE+MTQ:0.053'
RFF+BN:HJSHAM123456'
PCI++ALFA LAVAL SINGAPORE'
SGP+HJCU8521230+22'
EQD+CN+HJCU8521230+42G0'
TMD+3:FCL/FCL+FCL'
SEL+123963+SH'
RFF+BN:SDEHAG0000023'
UNT+31+0811051803'
UNE+1+0811051803'
UNZ+1+0811051803'

We need to count the segment delimiters, i.e. the ' symbol, from the UNH segment to the UNT segment, and then replace the 31 in UNT+31 with the exact count.

Please help. I can provide any other information you need.

Regards
Karan
Question by:Pankaj_Sachdeva
 
LVL 85

Expert Comment

by:ozo
ID: 22928072
awk '/^UNH/,/^UNT/{if( index($0,"'"'"'") ){c++}};{sub(/^UNT\+[0-9]+/,"UNT+" c);print}'
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928102
Hi, I tried to run the above awk script on the file below:

UNB+UNOB:4+0050:ZZZ+DBH+20081110:1129+0811101129'
UNG+IFTMIN+0050+3995+20081110:1129+0811101129+UN+D:99A:DEBL02'
UNH+0811101129+IFTMIN:D:99A:UN:DEBL02'
BGM+705+C00005752+9'
CTA+MS+:EnterpriseBatchProcessor'
DTM+137:200811041510:203'
FTX+AEA++5'
CNT+11:0:PCE'
CNT+7:700:KGM'
CNT+15:5:MTQ'
LOC+57+DEBRE:::Bremen'
RFF+FF:C00005752'
TDT+20+VO0411086+1+13:OCEAN VESSEL+00009999:172+++:::ANL EXPLORER'
DTM+133:20081106:102'
LOC+9+DEBRE:::Bremen'
LOC+12+CAZZZ:::DUMMY PORT'
NAD+CN+00001142+DSV AIR & SEA INC.:100 Walnut Avenue:Suite 405:Clark:NJ+++++07066'
NAD+CZ+00001062+DSV AIR & SEA GMBH:Schlachte 15/18:LKW ABTEILUNG:Bremen:HB+++++28195'
CTA+IC+:EnterpriseBatchProcessor'
NAD+N1+00012726+Hanjin Shipping Co. Ltd. Schiffsmak:Amselstrasse::Hamburg:HH+++++20457'
GID+1+0:PA:::PACKET'
PIA+5+041106:CC:169'
FTX+AAA+++desc0411086'
MEA+WT+AAE+KGM:700'
MEA+VOL+AAE+MTQ:5'
PCI++M&M0411086'
SGP+CONT0411060+0'
EQD+CN+CONT0411060+2060'
TMD+2:LCL/LCL+LCL'
SEL+seal041108+SH'
UNT+30+0811101129'
UNE+1+0811101129'
UNZ+1+0811101129'

The UNT count is given as 30 here. However, if you count the lines from the UNH segment to the UNT segment that end with the segment delimiter ', there are 29.

I ran your awk script and the output was unchanged; it didn't change the UNT count to 29.

Thanks for your help..

Regards

Karan
0
 
LVL 85

Expert Comment

by:ozo
ID: 22928139
I was counting the ' at the end of UNT+30+0811101129'
If you don't want to count it, you can reverse the order of the substitution and the count:
awk '{sub(/^UNT\+[0-9]+/,"UNT+" c);print};/^UNH/,/^UNT/{if( index($0,"'"'"'") ){c++}}'
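The two orderings differ only in whether the delimiter on the UNT line itself is included in the count. Below is a minimal sketch of both, with the quote passed in through an awk variable instead of the `"'"'"'` shell quoting. The function names `count_inclusive` and `count_exclusive` are hypothetical and do not come from the thread. Both variants rely on `^UNT` matching at the start of a line, so they assume one segment per line.

```shell
# Hypothetical wrapper names; the awk logic mirrors the two one-liners above.
# q holds the segment delimiter so the awk program needs no embedded quotes.

count_inclusive() {
    # Count first, then substitute: the UNT line is already counted.
    awk -v q="'" '
        /^UNH/,/^UNT/ { if (index($0, q)) c++ }
        { sub(/^UNT\+[0-9]+/, "UNT+" c); print }
    ' "$1"
}

count_exclusive() {
    # Substitute first, then count: the UNT line is not yet counted.
    awk -v q="'" '
        { sub(/^UNT\+[0-9]+/, "UNT+" c); print }
        /^UNH/,/^UNT/ { if (index($0, q)) c++ }
    ' "$1"
}
```

On a file where every segment is on its own line, `count_inclusive file` writes the number of segments from UNH through UNT, and `count_exclusive file` writes one less. On a file that is all on one line, neither changes anything, because the line does not start with UNT.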
0

 

Author Comment

by:Pankaj_Sachdeva
ID: 22928148
I am also attaching a sample file that is not newline-terminated. The whole data comes in a single line, with ' as the segment delimiter.

Thanks

Karan
DBHA.005M3C.IFTMIN.000901115.txt
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928157
Thanks ozo, but I tried both solutions on the file attached above and they still give me a UNT count of 30 in the output, whereas it should be 29.

I need to count the number of segment delimiters ' from the UNH segment to the UNT segment.

Thanks for your help

Karan
0
 
LVL 85

Expert Comment

by:ozo
ID: 22928238
I count 33 in that file
awk '{match($0,/UNH.*UNT/);u=substr($0,RSTART,RLENGTH);sub(/'"'UNT\+[0-9]+/,"'"'"'UNT+"'" gsub(/'"'"'/,FS,$0));print}' < DBHA.005M3C.IFTMIN.000901115.txt
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928255
Count the number of segment delimiters, i.e. the ' symbol, in that file, starting from the UNH segment through the UNT segment.

Please see below:

UNH+0811101129+IFTMIN:D:99A:UN:DEBL02'
BGM+705+C00005752+9'
CTA+MS+:EnterpriseBatchProcessor'
DTM+137:200811041510:203'
FTX+AEA++5'
CNT+11:0:PCE'
CNT+7:700:KGM'
CNT+15:5:MTQ'
LOC+57+DEBRE:::Bremen'
RFF+FF:C00005752'
TDT+20+VO0411086+1+13:OCEAN VESSEL+00009999:172+++:::ANL EXPLORER'
DTM+133:20081106:102'
LOC+9+DEBRE:::Bremen'
LOC+12+CAZZZ:::DUMMY PORT'
NAD+CN+00001142+DSV AIR & SEA INC.:100 Walnut Avenue:Suite 405:Clark:NJ+++++07066'
NAD+CZ+00001062+DSV AIR & SEA GMBH:Schlachte 15/18:LKW ABTEILUNG:Bremen:HB+++++28195'
CTA+IC+:EnterpriseBatchProcessor'
NAD+N1+00012726+Hanjin Shipping Co. Ltd. Schiffsmak:Amselstrasse::Hamburg:HH+++++20457'
GID+1+0:PA:::PACKET'
PIA+5+041106:CC:169'
FTX+AAA+++desc0411086'
MEA+WT+AAE+KGM:700'
MEA+VOL+AAE+MTQ:5'
PCI++M&M0411086'
SGP+CONT0411060+0'
EQD+CN+CONT0411060+2060'
TMD+2:LCL/LCL+LCL'
SEL+seal041108+SH'
UNT+30+0811101129'

The count of the segment delimiter ' comes to 29 if we count manually.

Thanks

Karan
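A manual count like this can be cross-checked from the shell. The sketch below is only an illustration: `count_segments` is a hypothetical helper name, and it assumes the data contains no released (escaped) apostrophes such as `?'`. It strips line breaks, turns every ' into a newline, and counts the lines from UNH through UNT, so it gives the same answer whether the file has one segment per line or is all on one line.

```shell
# Hypothetical helper: count segment delimiters from UNH through UNT inclusive.
# Assumption: no escaped apostrophes (?') occur inside the data.
count_segments() {
    tr -d '\r\n' < "$1" |      # drop existing line breaks
        tr "'" '\n' |          # one segment per line
        awk '/^UNH/,/^UNT/' |  # keep only UNH through UNT
        wc -l |                # one line per delimiter
        tr -d ' '              # some wc versions pad with spaces
}
```

Run as `count_segments inputfilename`; it prints a single number.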
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928274
And after substituting the value of the UNT counter, the rest of the file should come out exactly as it was. Only the UNT counter should change, nothing else.
Please help, it's really urgent for me.
Thanks
0
 
LVL 85

Expert Comment

by:ozo
ID: 22928300
I get 29 if we count like http:#a22928072 and 28 if we count like http:#a22928139
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928310
Can you tell me how you are running this awk script? I am running it as below:
awk '/^UNH/,/^UNT/{if( index($0,"'"'"'") ){c++}};{sub(/^UNT\+[0-9]+/,"UNT+" c);print}'  inputfilename
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928320
I'm getting a UNT counter of 30 for both. Can you please help? :(
0
 
LVL 85

Expert Comment

by:ozo
ID: 22928342
cat >  inputfilename <<ENDHERE
UNH+0811101129+IFTMIN:D:99A:UN:DEBL02'
BGM+705+C00005752+9'
CTA+MS+:EnterpriseBatchProcessor'
DTM+137:200811041510:203'
FTX+AEA++5'
CNT+11:0:PCE'
CNT+7:700:KGM'
CNT+15:5:MTQ'
LOC+57+DEBRE:::Bremen'
RFF+FF:C00005752'
TDT+20+VO0411086+1+13:OCEAN VESSEL+00009999:172+++:::ANL EXPLORER'
DTM+133:20081106:102'
LOC+9+DEBRE:::Bremen'
LOC+12+CAZZZ:::DUMMY PORT'
NAD+CN+00001142+DSV AIR & SEA INC.:100 Walnut Avenue:Suite 405:Clark:NJ+++++07066'
NAD+CZ+00001062+DSV AIR & SEA GMBH:Schlachte 15/18:LKW ABTEILUNG:Bremen:HB+++++28195'
CTA+IC+:EnterpriseBatchProcessor'
NAD+N1+00012726+Hanjin Shipping Co. Ltd. Schiffsmak:Amselstrasse::Hamburg:HH+++++20457'
GID+1+0:PA:::PACKET'
PIA+5+041106:CC:169'
FTX+AAA+++desc0411086'
MEA+WT+AAE+KGM:700'
MEA+VOL+AAE+MTQ:5'
PCI++M&M0411086'
SGP+CONT0411060+0'
EQD+CN+CONT0411060+2060'
TMD+2:LCL/LCL+LCL'
SEL+seal041108+SH'
UNT+30+0811101129'
ENDHERE
awk '/^UNH/,/^UNT/{if( index($0,"'"'"'") ){c++}};{sub(/^UNT\+[0-9]+/,"UNT+" c);print}'  inputfilename
UNH+0811101129+IFTMIN:D:99A:UN:DEBL02'
BGM+705+C00005752+9'
CTA+MS+:EnterpriseBatchProcessor'
DTM+137:200811041510:203'
FTX+AEA++5'
CNT+11:0:PCE'
CNT+7:700:KGM'
CNT+15:5:MTQ'
LOC+57+DEBRE:::Bremen'
RFF+FF:C00005752'
TDT+20+VO0411086+1+13:OCEAN VESSEL+00009999:172+++:::ANL EXPLORER'
DTM+133:20081106:102'
LOC+9+DEBRE:::Bremen'
LOC+12+CAZZZ:::DUMMY PORT'
NAD+CN+00001142+DSV AIR & SEA INC.:100 Walnut Avenue:Suite 405:Clark:NJ+++++07066'
NAD+CZ+00001062+DSV AIR & SEA GMBH:Schlachte 15/18:LKW ABTEILUNG:Bremen:HB+++++28195'
CTA+IC+:EnterpriseBatchProcessor'
NAD+N1+00012726+Hanjin Shipping Co. Ltd. Schiffsmak:Amselstrasse::Hamburg:HH+++++20457'
GID+1+0:PA:::PACKET'
PIA+5+041106:CC:169'
FTX+AAA+++desc0411086'
MEA+WT+AAE+KGM:700'
MEA+VOL+AAE+MTQ:5'
PCI++M&M0411086'
SGP+CONT0411060+0'
EQD+CN+CONT0411060+2060'
TMD+2:LCL/LCL+LCL'
SEL+seal041108+SH'
UNT+29+0811101129'
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928434
I didn't get what you meant by your last point. Can you tell me how you are running the awk script on the input file?
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928446
See below: I ran the first awk script you created on the input file and it gave me this result:
/dfds03/editst6/pankaj >{sub(/^UNT\+[0-9]+/,"UNT+" c);print}' DBHA.005M3C.IFTMIN.000901115                                <
UNB+UNOB:4+0050:ZZZ+DBH+20081110:1129+0811101129'UNG+IFTMIN+0050+3995+20081110:1129+0811101129+UN+D:99A:DEBL02'UNH+0811101129+IFTMIN:D:99A:UN:DEBL02'BGM+705+C00005752+9'CTA+MS+:EnterpriseBatchProcessor'DTM+137:200811041510:203'FTX+AEA++5'CNT+11:0:PCE'CNT+7:700:KGM'CNT+15:5:MTQ'LOC+57+DEBRE:::Bremen'RFF+FF:C00005752'TDT+20+VO0411086+1+13:OCEAN VESSEL+00009999:172+++:::ANL EXPLORER'DTM+133:20081106:102'LOC+9+DEBRE:::Bremen'LOC+12+CAZZZ:::DUMMY PORT'NAD+CN+00001142+DSV AIR & SEA INC.:100 Walnut Avenue:Suite 405:Clark:NJ+++++07066'NAD+CZ+00001062+DSV AIR & SEA GMBH:Schlachte 15/18:LKW ABTEILUNG:Bremen:HB+++++28195'CTA+IC+:EnterpriseBatchProcessor'NAD+N1+00012726+Hanjin Shipping Co. Ltd. Schiffsmak:Amselstrasse::Hamburg:HH+++++20457'GID+1+0:PA:::PACKET'PIA+5+041106:CC:169'FTX+AAA+++desc0411086'MEA+WT+AAE+KGM:700'MEA+VOL+AAE+MTQ:5'PCI++M&M0411086'SGP+CONT0411060+0'EQD+CN+CONT0411060+2060'TMD+2:LCL/LCL+LCL'SEL+seal041108+SH'UNT+30+0811101129'UNE+1+0811101129'UNZ+1+0811101129'
0
 
LVL 85

Expert Comment

by:ozo
ID: 22928638
http:#a22928342 creates inputfilename,
runs awk on inputfilename,
and shows the result

what was the input file and command used to produce http:#a22928446 ?
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928674
The command used to produce the output in http:#a22928446 was:
awk '/^UNH/,/^UNT/{if( index($0,"'"'"'") ){c++}};{sub(/^UNT\+[0-9]+/,"UNT+" c);print}' DBHA.005M3C.IFTMIN.000901115.txt
The input file has been attached for your reference.
Karan

DBHA.005M3C.IFTMIN.000901115.txt
0
 
LVL 85

Accepted Solution

by:
ozo earned 2000 total points
ID: 22928722
As I said in http:#a22928238
for a file all on one line, running
awk '{match($0,/UNH.*UNT/);u=substr($0,RSTART,RLENGTH);sub(/'"'UNT\+[0-9]+/,"'"'"'UNT+"'" gsub(/'"'"'/,FS,u));print}' < DBHA.005M3C.IFTMIN.000901115.txt
produces
UNB+UNOB:4+0050:ZZZ+DBH+20081110:1129+0811101129'UNG+IFTMIN+0050+3995+20081110:1129+0811101129+UN+D:99A:DEBL02'UNH+0811101129+IFTMIN:D:99A:UN:DEBL02'BGM+705+C00005752+9'CTA+MS+:EnterpriseBatchProcessor'DTM+137:200811041510:203'FTX+AEA++5'CNT+11:0:PCE'CNT+7:700:KGM'CNT+15:5:MTQ'LOC+57+DEBRE:::Bremen'RFF+FF:C00005752'TDT+20+VO0411086+1+13:OCEAN VESSEL+00009999:172+++:::ANL EXPLORER'DTM+133:20081106:102'LOC+9+DEBRE:::Bremen'LOC+12+CAZZZ:::DUMMY PORT'NAD+CN+00001142+DSV AIR & SEA INC.:100 Walnut Avenue:Suite 405:Clark:NJ+++++07066'NAD+CZ+00001062+DSV AIR & SEA GMBH:Schlachte 15/18:LKW ABTEILUNG:Bremen:HB+++++28195'CTA+IC+:EnterpriseBatchProcessor'NAD+N1+00012726+Hanjin Shipping Co. Ltd. Schiffsmak:Amselstrasse::Hamburg:HH+++++20457'GID+1+0:PA:::PACKET'PIA+5+041106:CC:169'FTX+AAA+++desc0411086'MEA+WT+AAE+KGM:700'MEA+VOL+AAE+MTQ:5'PCI++M&M0411086'SGP+CONT0411060+0'EQD+CN+CONT0411060+2060'TMD+2:LCL/LCL+LCL'SEL+seal041108+SH'UNT+28+0811101129'UNE+1+0811101129'UNZ+1+0811101129'
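For readers who find the quoting in that one-liner hard to follow, here is a sketch of the same idea spread over several lines. The function name `recount_single_line` and the variable `q` are assumptions for illustration. The count comes from the return value of `gsub`, which is the number of replacements it made on the extracted UNH...UNT text. Replacing each ' with itself keeps the copy unchanged; the thread's version replaces with FS, which makes no difference to the count.

```shell
# Hypothetical helper: readable form of the single-line approach above.
# q holds the segment delimiter so no quote juggling is needed.
recount_single_line() {
    awk -v q="'" '
    {
        if (match($0, /UNH.*UNT/)) {
            u = substr($0, RSTART, RLENGTH)    # text from UNH up to the last UNT tag
            n = gsub(q, q, u)                  # number of delimiters inside that text
            sub(q "UNT\\+[0-9]+", q "UNT+" n)  # rewrite only the counter after UNT+
        }
        print
    }' "$1"
}
```

Because the extracted text stops at the letters UNT, the delimiter that closes the UNT segment is not part of the count. If that delimiter should be included, using `n + 1` in the replacement is one option.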
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22928846
If you convert the file to one segment per line and then count the ' between the UNH and UNT segments, it is 29.
Can you please check that manually? The correct UNT counter should be 29.
0
 
LVL 85

Expert Comment

by:ozo
ID: 22928942
Here's how I count
UNH+0811101129+IFTMIN:D:99A:UN:DEBL02'<1>BGM+705+C00005752+9'<2>CTA+MS+:EnterpriseBatchProcessor'<3>DTM+137:200811041510:203'<4>FTX+AEA++5'<5>CNT+11:0:PCE'<6>CNT+7:700:KGM'<7>CNT+15:5:MTQ'<8>LOC+57+DEBRE:::Bremen'<9>RFF+FF:C00005752'<10>TDT+20+VO0411086+1+13:OCEAN VESSEL+00009999:172+++:::ANL EXPLORER'<11>DTM+133:20081106:102'<12>LOC+9+DEBRE:::Bremen'<13>LOC+12+CAZZZ:::DUMMY PORT'<14>NAD+CN+00001142+DSV AIR & SEA INC.:100 Walnut Avenue:Suite 405:Clark:NJ+++++07066'<15>NAD+CZ+00001062+DSV AIR & SEA GMBH:Schlachte 15/18:LKW ABTEILUNG:Bremen:HB+++++28195'<16>CTA+IC+:EnterpriseBatchProcessor'<17>NAD+N1+00012726+Hanjin Shipping Co. Ltd. Schiffsmak:Amselstrasse::Hamburg:HH+++++20457'<18>GID+1+0:PA:::PACKET'<19>PIA+5+041106:CC:169'<20>FTX+AAA+++desc0411086'<21>MEA+WT+AAE+KGM:700'<22>MEA+VOL+AAE+MTQ:5'<23>PCI++M&M0411086'<24>SGP+CONT0411060+0'<25>EQD+CN+CONT0411060+2060'<26>TMD+2:LCL/LCL+LCL'<27>SEL+seal041108+SH'<28>UNT
0
 

Author Comment

by:Pankaj_Sachdeva
ID: 22929034
We need to count the UNT segment delimiter as well. That will make it 29. All the segment delimiters, i.e. ', need to be counted from UNH to the end of UNT, including the UNT one as well.
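One way to get this inclusive count on a single-line file is to let awk treat the ' itself as the record separator, so each segment becomes one record. The following is a sketch under stated assumptions, not the accepted solution: `fix_unt` is a hypothetical name, the input is assumed to end with a delimiter (optionally followed by a newline), to contain no empty segments, and to contain no released apostrophes (`?'`).

```shell
# Hypothetical helper: recount UNH..UNT inclusive, one awk record per segment.
# Assumptions: no escaped apostrophes (?'), no empty segments,
# input ends with a delimiter optionally followed by a line break.
fix_unt() {
    awk -v q="'" '
        BEGIN { RS = q; ORS = q }                  # read and write segments
        /^[\r\n]*UNH\+/ { inmsg = 1; n = 0 }       # a message starts at UNH
        inmsg { n++ }                              # count every segment in it
        inmsg && /^[\r\n]*UNT\+/ {                 # UNT is counted too
            sub(/UNT\+[0-9]+/, "UNT+" n)
            inmsg = 0
        }
        /^[\r\n]*$/ { printf "%s", $0; next }      # trailing line break, no delimiter
        { print }
    ' "$@"
}
```

Usage would be `fix_unt DBHA.005M3C.IFTMIN.000901115.txt > fixed.txt`. Leading line breaks are tolerated in each record, so the same function also handles files with one segment per line, and everything other than the UNT counter is passed through unchanged.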
 