Unix: reformat a file so that it can be awk-ed

Dear Experts,

I need a file in an irregular format fixed so that I can awk it.
The original file looks like below:

CD0E0100_H000 0002D0508F CD0E0100_H0003010101013MSPROXMLT 01JTN VJ 00M 00A 00NNT_TH56CEXPN
FT0Y3600_H000 0013D0508F FT0Y3600_H0003120112013MSPROXMLT 01JTN VJ 01M 01A 01NNT_TH56CEXPO
MSPROXMLH 000 0002D0508F MSPROXMLH 0003010101013MSUPDSEOT 01JTN VJ 01M 01A 01NN O
MSPROXMLH 000 0003D0508F MSPROXMLH 0003020102013VSRFUMVTT 01JTN VJ 01M 01A 01NN O
MSPROXMLH 000 0004D0508F MSPROXMLH 0003030103013FT0Y6600_T01JTN VJ 01M 01A 01NNP_FT56CEXPO
MSPROXMLH 000 0005D0508F MSPROXMLH 0003040104013MSINSMSGT 01JTN V 00 00 00NN
PC0R3420_H000 0002D0508F PC0R3420_H0003010101013MSPROXMLT 01JTN V 00 00 00NNT_TH56CEXP
WO0A0020_H000 0002D0508F WO0A0020_H0003010101013MSPROXMLT 01JTN VJ 01M 01A 01NNT_TH56CEXPO
XGGENEXTH 000 0002D0508F XGGENEXTH 0003010101013MSPROXMLT 01JTN VJ 00M 00A 00NN O


It should be transformed to the format below...

CD0E0100_H000 0002D0508F CD0E0100_H0003010101013MSPROXMLT 01JTN VJ 00M 00A 00NNT_TH56CEXPN
FT0Y3600_H000 0013D0508F FT0Y3600_H0003120112013MSPROXMLT 01JTN VJ 01M 01A 01NNT_TH56CEXPO
MSPROXMLH_000 0002D0508F MSPROXMLH_0003010101013MSUPDSEOT 01JTN VJ 01M 01A 01NN O
MSPROXMLH_000 0003D0508F MSPROXMLH_0003020102013VSRFUMVTT 01JTN VJ 01M 01A 01NN O
MSPROXMLH_000 0004D0508F MSPROXMLH_003030103013FT0Y6600_T 01JTN VJ 01M 01A 01NNP_FT56CEXPO
MSPROXMLH_000 0005D0508F MSPROXMLH_0003040104013MSINSMSGT 01JTN V 00 00 00NN
PC0R3420_H000 0002D0508F PC0R3420_H0003010101013MSPROXMLT 01JTN V 00 00 00NNT_TH56CEXP
WO0A0020_H000 0002D0508F WO0A0020_H0003010101013MSPROXMLT 01JTN VJ 01M 01A 01NNT_TH56CEXPO
XGGENEXTH_000 0002D0508F XGGENEXTH_0003010101013MSPROXMLT 01JTN VJ 00M 00A 00NN O


It comes down to creating a first column of 13 characters, a second one of 10, a third one of 32, and a fourth one of 5.
Lines 1 and 2 represent how the rest of the lines need to look after the conversion.
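
One way to check a file against this layout is a small awk filter. The sketch below is only an illustration: the function name check_widths is made up, and it assumes the columns are separated by whitespace. It prints every line whose first four fields are not 13, 10, 32 and 5 characters wide, prefixed with its line number.

```shell
# Hypothetical helper: list lines whose first four whitespace-separated
# fields are not 13, 10, 32 and 5 characters wide.
check_widths() {
  awk 'length($1) != 13 || length($2) != 10 || length($3) != 32 || length($4) != 5 {
    print NR ": " $0
  }' "$@"
}
```

Run as `check_widths file`; lines that already look like lines 1 and 2 produce no output.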

Can you have a look please?
As always, many thanks and cheers.
Watnog

arnold commented:
I would forget awk and use perl.

Try the following:
cat file | perl -e 'while (<STDIN>) {
chomp();
$front=substr($_,0,16);
$back=substr($_,17,len ($_)-16);
$front=~ s/ /_/;
print "$front$back\n";
}'

The important part is to grab only the first group: the substring should not capture the space after the first element.

Double-check and adjust the substring offsets for front and back, so that the only space in the front part is the one that may occur inside the first element.

Watnog (Author) commented:
Thanks Arnold.

I get this error:

"Undefined subroutine &main::len called at -e line 4, <STDIN> line 1."

W.

arnold commented:
Try length.

The issue is that your last field is not uniform.

If it were uniform, the simple approach would be to strip the underscores and then reformat:

cat file | sed -e 's/_/ /g' | awk '{ print $1"_"$2" "$3" "$4"_"$5,$6,$7,$8,$9,$10,$11 }'
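
To see why the non-uniform last field matters, the pipeline above can be run on the first sample line from the question, whose last field contains an underscore. The variable names here are only for illustration.

```shell
# Run the sed/awk pipeline on one sample line whose last field
# contains an underscore.
line='CD0E0100_H000 0002D0508F CD0E0100_H0003010101013MSPROXMLT 01JTN VJ 00M 00A 00NNT_TH56CEXPN'
result=$(printf '%s\n' "$line" | sed -e 's/_/ /g' |
  awk '{ print $1"_"$2" "$3" "$4"_"$5,$6,$7,$8,$9,$10,$11 }')
printf '%s\n' "$result"
# prints: CD0E0100_H000 0002D0508F CD0E0100_H0003010101013MSPROXMLT 01JTN VJ 00M 00A 00NNT TH56CEXPN
```

The first and third columns come back intact, but the underscore in 00NNT_TH56CEXPN is lost, because sed strips every underscore and awk only puts two of them back.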

If you gave the rules, then using perl you could strip out all underscores, split on white space, and reassemble the line based on those rules.

Watnog (Author) commented:
Thank you Arnold. Glad you could help me with the perl solution as it gives me the best result.