I need a Regex for SED string replace across multiple ascii files

NAEDI2
I need SED to replace a string across multiple ASCII text files based on a regex pattern.  These files are descriptions of EDI file formats with records described in this format:

E2EDPT2 {  DELIMITER="\x0a" }:
    E2EDPT2_SEGNAM		# Segment E2EDPT2
    E2EDPT2_MANDT
    E2EDPT2_DOCNUM
    E2EDPT2_SEGNUM
    E2EDPT2_PSGNUM
    E2EDPT2_HLEVEL
    E2EDPT2_TDLINE
    E2EDPT2_TDFORMAT
    rest[0:"^\x0a"]*
;

E2EDPT2_SEGNAM: STRINGA { LENGTH=30, DEFAULT="E2EDPT2001                    " };
E2EDPT2_MANDT: NSTRINGA { LENGTH=3 };
E2EDPT2_DOCNUM: STRINGA { LENGTH=16 };
E2EDPT2_SEGNUM: STRINGA { LENGTH=6 };
E2EDPT2_PSGNUM: STRINGA { LENGTH=6 };
E2EDPT2_HLEVEL: STRINGA { LENGTH=2 };
E2EDPT2_TDLINE: STRINGA { LENGTH=70, MISSVALUE="                                                                      " };
E2EDPT2_TDFORMAT: STRINGA { LENGTH=2, MISSVALUE="  " };


What I need to do is replace occurrences of fields like "E2EDPT2_SEGNUM" above with plain "SEGNUM".  Every record type has this field, and I want to drop the record-name part, leaving just the field name SEGNUM.  The record-name part always has a fixed length and the form "E2EDXXX_".  What is the correct regular expression to get this done?
Bill Prew, IT / Software Engineering Consultant
Top Expert 2016

Commented:
Do you want to remove all the "E2EDPT2_" in front of any field name, or only the one field "E2EDPT2_SEGNUM"?


»bp
Bill Prew, IT / Software Engineering Consultant
Top Expert 2016

Commented:
If you just want that one specific field "E2EDPT2_SEGNUM" then this should do that:

sed s/E2ED..._SEGNUM/SEGNUM/g test.txt>new.txt


»bp
NAEDI2, SP. EDI Business Analyst

Author

Commented:
It is not just that one specific field I need replaced, and that's the complication.  I want to remove every E2ED***_ prefix, where the "***" is a wildcard for any value.

E2EDK03_SEGNUM
E2EDKA1_SEGNUM
E2EDP01_SEGNUM
E2EDP02_SEGNUM
E2EDPT1_SEGNUM
E2EDPT2_SEGNUM
etc..
Bill Prew, IT / Software Engineering Consultant
Top Expert 2016

Commented:
Try this then:

sed s/E2ED..._//g test.txt>new.txt


»bp
NAEDI2, SP. EDI Business Analyst

Author

Commented:
We are close but we removed the prefix in places we didn't want to.  I think this image of the before and after diff will clarify everything.
2018-11-06_151926.pdf

I need to lose the prefix from only the _SEGNUM field.

Thanks again.
Bill Prew, IT / Software Engineering Consultant
Top Expert 2016

Commented:
That is what my first suggestion did (the s/E2ED..._SEGNUM/SEGNUM/g command above).


»bp
NAEDI2, SP. EDI Business Analyst

Author

Commented:
There are many different prefixes, and I need to lose them only on the _SEGNUM field.  All the other fields keep their E2EDXXX_ prefix.  The prefix varies, which is what makes it tricky.

 E2EDK03_SEGNUM
 E2EDKA1_SEGNUM
 E2EDP01_SEGNUM
 E2EDP02_SEGNUM
 E2EDPT1_SEGNUM
 E2EDPT2_SEGNUM
Bill Prew, IT / Software Engineering Consultant
Top Expert 2016

Commented:
Right, that's what it does.  Did you try it?


»bp
murugesandins, Shell_script Automation /bin/bash /bin/bash.exe /bin/ksh /bin/mksh.exe AIX C C++ CYGWIN_NT HP-UX Linux MINGW32 MINGW64 SunOS Windows_NT

Commented:
@NAEDI2

1)
>> sed s/E2ED..._//g test.txt
This command replaces
/E2ED..._/ => "E2ED" followed by any three characters (...) followed by an underscore
with
// => the empty string.
It therefore strips the prefix from every field, not only from SEGNUM:
a) E2EDP02_SEGNUM becomes SEGNUM
b) E2EDPT2_OTHERSTRING becomes OTHERSTRING
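
To make the difference between the two substitutions concrete, here is a minimal sketch that runs both on a three-line sample. The sample lines and the variable names `sample`, `broad` and `narrow` are made up for illustration; only the two sed expressions come from this thread.

```shell
# Hypothetical three-line sample; the field names mirror the ones in this thread.
sample='E2EDK01_DOCNUM
E2EDK01_SEGNUM
E2EDPT2_SEGNUM'

# Broad pattern: strips the E2EDxxx_ prefix from every field.
broad=$(printf '%s\n' "$sample" | sed 's/E2ED..._//g')

# Narrow pattern: strips the prefix only where SEGNUM follows it.
narrow=$(printf '%s\n' "$sample" | sed 's/E2ED..._SEGNUM/SEGNUM/g')

printf 'broad:\n%s\n\nnarrow:\n%s\n' "$broad" "$narrow"
```

With the broad pattern the DOCNUM line loses its prefix as well; with the narrow pattern it is left alone.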
Input file:
$ /bin/cat 2018-11-06_151926.txt
        E2EDK01 { DELIMITER="\x0a" }:
                E2EDK01_SEGNAM
                E2EDK01_MANDT
                E2EDK01_DOCNUM
                E2EDK01_SEGNUM
                E2EDK01_PSGNUM
                E2EDK01_HLEVEL
                E2EDK01_ACTION
                E2EDK01_KZABS
                E2EDK01_CURCY
                E2EDK01_HWAER
                E2EDK01_WKURS
                E2EDK01_ZTERM
                E2EDK01_KUNDEUINR
                E2EDK01_EIGENUINR
                E2EDK01_BSART
                E2EDK01_BELNR
                E2EDK01_NTGEW
                E2EDK01_BRGEW
                E2EDK01_GEWEI
                E2EDK01_FKART_RL
                E2EDK03_SEGNUM
                E2EDKA1_SEGNUM
                E2EDP01_SEGNUM
                E2EDP02_SEGNUM
                E2EDPT1_SEGNUM
                E2EDPT2_SEGNUM
                E2EDPT2_SEGNUM NEXT_STRING at same line E2EDK03_SEGNUM
                E2EDPT2_EXCEPTION


2)
You can use the following command to replace E2EDXXX_SEGNUM with SEGNUM.
Here the output is written to the terminal; the input file is not modified.
$ /bin/sed "s/E2ED..._SEGNUM/SEGNUM/g;"  2018-11-06_151926.txt
        E2EDK01 { DELIMITER="\x0a" }:
                E2EDK01_SEGNAM
                E2EDK01_MANDT
                E2EDK01_DOCNUM
                SEGNUM
                E2EDK01_PSGNUM
                E2EDK01_HLEVEL
                E2EDK01_ACTION
                E2EDK01_KZABS
                E2EDK01_CURCY
                E2EDK01_HWAER
                E2EDK01_WKURS
                E2EDK01_ZTERM
                E2EDK01_KUNDEUINR
                E2EDK01_EIGENUINR
                E2EDK01_BSART
                E2EDK01_BELNR
                E2EDK01_NTGEW
                E2EDK01_BRGEW
                E2EDK01_GEWEI
                E2EDK01_FKART_RL
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM NEXT_STRING at same line SEGNUM
                E2EDPT2_EXCEPTION
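
If you want the pattern to be stricter than "any three characters", one possible variant (an assumption on my part, not something the question requires) is to allow only uppercase letters and digits in the three variable positions. The second sample line below is invented purely to show the difference.

```shell
# Hypothetical input: the second line is not a real segment name,
# but the looser "..." pattern would still match it.
sample='E2EDP01_SEGNUM
E2ED - _SEGNUM'

# Strict: exactly three uppercase letters or digits between E2ED and _SEGNUM.
strict=$(printf '%s\n' "$sample" | sed 's/E2ED[[:upper:][:digit:]]\{3\}_SEGNUM/SEGNUM/g')

# Loose: any three characters, as used above.
loose=$(printf '%s\n' "$sample" | sed 's/E2ED..._SEGNUM/SEGNUM/g')

printf 'strict:\n%s\n\nloose:\n%s\n' "$strict" "$loose"
```

For files that only contain well-formed segment names, both patterns should give the same result, so the simpler "..." form is usually enough.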

3)
You can take a backup of the original file:
$ echo /bin/cp -i 2018-11-06_151926.txt 2018-11-06_151926$(/usr/bin/stat 2018-11-06_151926.txt | /bin/grep Change | /bin/sed "s/Change: //;s/\-/_/g;s/ /_/g;s/:/_/g;s/\./_/;s/\+//;")".txt"
/bin/cp -i 2018-11-06_151926.txt 2018-11-06_1519262018_11_07_14_10_10_176655100_0530.txt
$ /bin/cp -i 2018-11-06_151926.txt 2018-11-06_1519262018_11_07_14_10_10_176655100_0530.txt

We can then update the original file in place:
$ /bin/sed -i "s/E2ED..._SEGNUM/SEGNUM/g;"  2018-11-06_151926.txt
$ echo Here goes related updates:
Here goes related updates:
$ /usr/bin/diff 2018-11-06_1519262018_11_07_14_10_10_176655100_0530.txt 2018-11-06_151926.txt
5c5
<               E2EDK01_SEGNUM
---
>               SEGNUM
22,28c22,28
<               E2EDK03_SEGNUM
<               E2EDKA1_SEGNUM
<               E2EDP01_SEGNUM
<               E2EDP02_SEGNUM
<               E2EDPT1_SEGNUM
<               E2EDPT2_SEGNUM
<               E2EDPT2_SEGNUM NEXT_STRING at same line E2EDK03_SEGNUM
---
>               SEGNUM
>               SEGNUM
>               SEGNUM
>               SEGNUM
>               SEGNUM
>               SEGNUM
>               SEGNUM NEXT_STRING at same line SEGNUM
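
Since the question asks about multiple files, the backup, in-place edit and diff steps can be combined in a loop. This is only a sketch: the directory and the file names `orders.txt` and `texts.txt` are hypothetical, and it assumes a sed that supports `-i` with an attached backup suffix (GNU and BSD sed both accept `-i.bak`, but `-i` is not part of POSIX).

```shell
# Work in a throwaway directory so the sketch does not touch real files.
workdir=$(mktemp -d)
printf 'E2EDK01_DOCNUM\nE2EDK01_SEGNUM\n' > "$workdir/orders.txt"
printf 'E2EDPT2_SEGNUM\nE2EDPT2_TDLINE\n' > "$workdir/texts.txt"

# Edit every .txt file in place; -i.bak keeps each original as <name>.bak.
for f in "$workdir"/*.txt; do
    sed -i.bak 's/E2ED..._SEGNUM/SEGNUM/g' "$f"
done

# Show what changed in each file. diff exits non-zero when the files
# differ, so its status is ignored here.
for f in "$workdir"/*.txt; do
    diff "$f.bak" "$f" || true
done
```

Once the diffs look right, the `.bak` files can be removed; if something went wrong, copy them back over the edited files.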

Original file content:
$ /bin/cat 2018-11-06_1519262018_11_07_14_10_10_176655100_0530.txt
        E2EDK01 { DELIMITER="\x0a" }:
                E2EDK01_SEGNAM
                E2EDK01_MANDT
                E2EDK01_DOCNUM
                E2EDK01_SEGNUM
                E2EDK01_PSGNUM
                E2EDK01_HLEVEL
                E2EDK01_ACTION
                E2EDK01_KZABS
                E2EDK01_CURCY
                E2EDK01_HWAER
                E2EDK01_WKURS
                E2EDK01_ZTERM
                E2EDK01_KUNDEUINR
                E2EDK01_EIGENUINR
                E2EDK01_BSART
                E2EDK01_BELNR
                E2EDK01_NTGEW
                E2EDK01_BRGEW
                E2EDK01_GEWEI
                E2EDK01_FKART_RL
                E2EDK03_SEGNUM
                E2EDKA1_SEGNUM
                E2EDP01_SEGNUM
                E2EDP02_SEGNUM
                E2EDPT1_SEGNUM
                E2EDPT2_SEGNUM
                E2EDPT2_SEGNUM NEXT_STRING at same line E2EDK03_SEGNUM
                E2EDPT2_EXCEPTION

/bin/sed -i
The -i option makes /bin/sed edit the input file in place.
Updated file content as a result:
$ /bin/cat 2018-11-06_151926.txt
        E2EDK01 { DELIMITER="\x0a" }:
                E2EDK01_SEGNAM
                E2EDK01_MANDT
                E2EDK01_DOCNUM
                SEGNUM
                E2EDK01_PSGNUM
                E2EDK01_HLEVEL
                E2EDK01_ACTION
                E2EDK01_KZABS
                E2EDK01_CURCY
                E2EDK01_HWAER
                E2EDK01_WKURS
                E2EDK01_ZTERM
                E2EDK01_KUNDEUINR
                E2EDK01_EIGENUINR
                E2EDK01_BSART
                E2EDK01_BELNR
                E2EDK01_NTGEW
                E2EDK01_BRGEW
                E2EDK01_GEWEI
                E2EDK01_FKART_RL
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM
                SEGNUM NEXT_STRING at same line SEGNUM
                E2EDPT2_EXCEPTION

NAEDI2, SP. EDI Business Analyst

Author

Commented:
I must have screwed up running it somehow.  I went back to look at the wrong results I got the first time, and all the prefixes were gone.  But I tried it again and of course it works.  Sorry to waste your time with the extra work, but thanks for the answer.
Bill Prew, IT / Software Engineering Consultant
Top Expert 2016
Commented:
Great, glad it helped.


»bp
