unix script using sed or awk command on a fixed width file to replace data

This is kind of complex, but I have a fixed-width text file. Each line is a new record. For each record, there is a number in bytes 1-4 that determines what should go in bytes 10-13.

So, say, if I find 0001 then I want to replace bytes 10-13 with the word "hope".

Likewise, if I find 0002 then I want the word "life".

Make sense? The code below seems to do something similar, but I don't understand how it works or how I can adapt it to my situation. My guess is that I will have to add an IF statement to read bytes 1-4, compare them to a string, and, if it matches, call one of the commands below. Any help would be much appreciated.
echo '123 4567' | sed 's/^\([^ ]* *\)..\(.*\)/\1\2/'
 
echo '123 4567' | awk '$2=substr($2,3)'

Asked by: MeridianManagement
 
mahome commented:
I'm not sure exactly what you want. For an IF, look at the following example.


$ cat in.txt 
----------------
0001 first line
0002 second line
 
==================================
 
$ awk '$1 == "0001" {  $3 = "hope"; print  } $1 == "0002" { $3 = "life"; print }' in.txt
0001 first hope
0002 second life

 
Brian Utterback (Principle Software Engineer) commented:
For fixed-position records, you are much better off using perl than sed or awk. Here is a script that does what you want.
replace.txt
 
Maciej S (sysadmin) commented:
Assuming your data is in file.txt:
sed '/^0001/s/^\(.\{9\}\).\{4\}/\1hope/;/^0002/s/^\(.\{9\}\).\{4\}/\1life/' file.txt

 
Maciej S (sysadmin) commented:
A little explanation:
/^0001/ means that the following command (s/../../) is performed only on lines starting with 0001.
s/from/to/ is the substitute command: it replaces whatever matches "from" with "to".
^\(.\{9\}\).\{4\} means: match any nine characters from the start of the line (since you want to change the 10th, 11th, 12th and 13th), and then any 4 characters. The .\{9\} is within \( \) parentheses for later reference.
\1hope means: replace the matched string with the first referenced group (.\{9\}) followed by the string "hope". So the first 9 characters are left untouched, and the next 4 characters are replaced with "hope".
Then comes a semicolon, which separates commands in sed. After it is a similar command that matches every line starting with 0002 and changes characters 10-13 to "life".
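
To see the command explained above in action, here is a minimal sketch. The sample records are made up for illustration and assume the code sits in characters 1-4 (counted from 1) with the target field in characters 10-13.

```shell
# Hypothetical sample records: code in columns 1-4, target field in columns 10-13.
sample='0001 abc desc rest
0002 abc desc rest
0009 abc desc rest'

# Keep the first 9 characters (\1) and replace the next 4, depending on the prefix.
out=$(printf '%s\n' "$sample" |
  sed '/^0001/s/^\(.\{9\}\).\{4\}/\1hope/;/^0002/s/^\(.\{9\}\).\{4\}/\1life/')

printf '%s\n' "$out"
```

Lines whose prefix matches neither address (0009 here) pass through unchanged, because each s command only runs on the lines its address selects.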
 
MeridianManagement (author) commented:
blu,

I was unable to get your script to work. I know sh and ksh shell scripting is nearly universal on Unix, but I don't know whether all my clients will have Perl, unless you know otherwise.

oklit,

Your command sort of worked; however, the record doesn't start with 0001. When I say bytes 1-4, I'm skipping byte 0, which has another value in it. In other words, the line doesn't start with 0001.

See the examples below for more info.
in.txt
 
10001          desc   bunchofotherdataiwantpreserved
10002          desc   bunchofotherdataiwantpreserved
10001          desc   bunchofotherdataiwantpreserved
10009          desc   bunchofotherdataiwantpreserved
10007          desc   bunchofotherdataiwantpreserved
10001          desc   bunchofotherdataiwantpreserved
 
out.txt
 
10001          hope   bunchofotherdataiwantpreserved
10002          life   bunchofotherdataiwantpreserved
10001          hope   bunchofotherdataiwantpreserved
10009          desc   bunchofotherdataiwantpreserved
10007          desc   bunchofotherdataiwantpreserved
10001          hope   bunchofotherdataiwantpreserved

 
MeridianManagement (author) commented:
In other words, oklit,

if the sed can be adapted to read positions 1-4 and check for 0001 there, instead of checking positions 0-3, then your command works perfectly.
 
mahome commented:
According to your file:

$ cat if.awk
-------------
{ 
 if ($1 ~ /.0001/) {  $2 = "hope"; print  } 
 else if ($1 ~ /.0002/) { $2 = "life"; print }
 else { print }
}
============
$ awk -f if.awk in.txt > out.txt

 
Maciej S (sysadmin) commented:
Just replace ^0001 with ^.0001 (and change \{9\} to \{10\}, as I was numbering from 1, not from 0). The rest (\{4\}) is OK.
 
mahome commented:
Or use oklit's variant, which has the advantage of preserving spaces:

sed '/^.0001/s/^\(.\{15\}\).\{4\}/\1hope/;/^.0002/s/^\(.\{15\}\).\{4\}/\1life/' in.txt
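
For comparison, an awk version that also preserves spacing can be sketched with substr() instead of assigning to a field (assigning to $2 makes awk rebuild the record with single spaces). The offsets are an assumption taken from the in.txt sample: one leading byte, the four-digit code, and the target word in characters 16-19 counted from 1. The helper make_rec is hypothetical and only builds test records.

```shell
# Build fixed-width sample records: 15-char key column, 7-char word column, payload.
# (Layout inferred from the in.txt sample; adjust the widths for real data.)
make_rec() { printf '%-15s%-7s%s\n' "$1" "$2" "bunchofotherdataiwantpreserved"; }

sample=$( make_rec 10001 desc; make_rec 10002 desc; make_rec 10009 desc )

# substr($0, 2, 4) is the code; characters 16-19 hold the word to replace.
out=$(printf '%s\n' "$sample" | awk '
  { code = substr($0, 2, 4); word = "" }
  code == "0001" { word = "hope" }
  code == "0002" { word = "life" }
  word != ""     { $0 = substr($0, 1, 15) word substr($0, 20) }
  { print }
')

printf '%s\n' "$out"
```

Because the record is reassembled from substrings, every character outside positions 16-19 is copied through untouched.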

 
MeridianManagement (author) commented:
oklit,

Thank you! One very last question: if that 0001 number happens to start several bytes later, say 4 bytes over, would I just add more periods, or is there a shortcut?

This is just so I understand the command you wrote, in case I need to modify it for future use.
 
Maciej S (sysadmin) commented:
Exactly. A dot means any character. If you want to search for 0001 in "abcd0001efg", you may use "^....0001" (or, especially if you have more characters, "^.\{4\}0001", where 4 means 4 occurrences of any character).
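
As a sketch of that shortcut, the counts can be kept in shell variables so the same command adapts when the code or the target field moves. The variable names skip and keep, their values, and the sample line are hypothetical.

```shell
skip=4      # characters before the 4-digit code (hypothetical value)
keep=15     # characters to leave untouched before the 4-character field

# Sample line: "abcd0001efg" padded to 15 characters, then the field and a tail.
line=$(printf '%-15s%s' 'abcd0001efg' 'desc tail')

# Double quotes let the shell expand $skip and $keep inside the sed script.
out=$(printf '%s\n' "$line" |
  sed "/^.\{$skip\}0001/s/^\(.\{$keep\}\).\{4\}/\1hope/")

printf '%s\n' "$out"
```

With skip=4 the address becomes /^.\{4\}0001/, which is the form suggested above.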
 
MeridianManagement (author) commented:
Excellent, oklit, very very good, and thanks for the alternatives, mahome.
 
Brian Utterback (Principle Software Engineer) commented:
You have Solaris in the keywords, so I assumed that you are using Solaris. Solaris comes with perl installed in the core packages, and removing it breaks many system commands and is unsupported, so if you are using Solaris, it is safe to say that perl is installed.

Going the sed route seems cumbersome to me, particularly if you have more substitutions than the two you gave, e.g. 50.

I think the problem with my script may have been the same as everyone else's: I assumed you started counting from 1.

Try this one.
replace.txt
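
The attached script is not reproduced here. As a rough sketch of the same table-driven idea in awk rather than perl, the code-to-word pairs can live in a separate mapping file, so fifty substitutions need fifty lines of data instead of fifty sed commands. The file name map.txt, its two-column format, and the offsets (taken from the in.txt sample) are assumptions.

```shell
tmp=$(mktemp -d)

# Hypothetical mapping file: 4-digit code, then the replacement word.
printf '%s\n' '0001 hope' '0002 life' > "$tmp/map.txt"

# Sample records in the in.txt layout: 1 byte, 4-digit code, padding to 15, word.
printf '%-15s%-7s%s\n' 10001 desc payload 10002 desc payload 10009 desc payload > "$tmp/in.txt"

# The first file fills the lookup table; the second is rewritten by position.
out=$(awk '
  NR == FNR { map[$1 ""] = $2; next }     # $1 "" forces a string key
  {
    code = substr($0, 2, 4)
    if (code in map)                       # pad or truncate the word to 4 chars
      $0 = substr($0, 1, 15) sprintf("%-4.4s", map[code]) substr($0, 20)
    print
  }
' "$tmp/map.txt" "$tmp/in.txt")

printf '%s\n' "$out"
rm -rf "$tmp"
```

The sprintf("%-4.4s", ...) call keeps the record width fixed even if a replacement in the mapping file is shorter or longer than four characters.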
 
MeridianManagement (author) commented:
blu,

Thanks for the follow-up. Unfortunately, I had already awarded the points, but I definitely agree your code is more flexible. I just had concerns about Perl compatibility and the fact that the script bombed on me.