Solved

fixed width file problem

Posted on 2008-06-19
20
301 Views
Last Modified: 2013-12-06
how to do the following in perl or awk, if state is MN then insert value "C" into code field, if state is TX then insert "T" in code field.  if state doesn't match any critiera than put "O" into code filed.  the file is fixed width.
 
fname  lname   st         code
chris    bauer    MN            
john     smith     TX              
jim       brown    AL    
 
thanks,
0
Comment
Question by:bodi77
  • 9
  • 8
  • 2
  • +1
20 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 21821159
perl -pe '(2..0)&&s/((MN)|TX)?[^\n\w]*$/$1\t${[qw(C T O)]}[!$1+!$2]/' file
0
 

Author Comment

by:bodi77
ID: 21821276
it could be, but how does it know what fixed postion to look for the MN and TX?  This is a fixed width file.   it has to be certian postion.  
0
 
LVL 9

Expert Comment

by:ghostdog74
ID: 21821713
awk:
awk 'NR>1 && $3 == "MN"{ printf "%s\tC\n",$0;next }

     NR>1 && $3 == "TX"{ printf "%s\tO\n",$0;next }

     $3 !~ /MX|TX/ { print }

' file

Open in new window

0
 
LVL 84

Expert Comment

by:ozo
ID: 21824476
What position does it have to be in?
0
 
LVL 16

Expert Comment

by:Hanno Schröder
ID: 21829040
Even if the file is "fixed width", it's easier just to work on the colum number.
Therefore, the Perl and awk solutions simply take the first three words and
add a fourth (C, T or O).
The (default) field separators are "white spaces", blank(s) and tab(s)
0
 

Author Comment

by:bodi77
ID: 21829356
that's a problem because the data was just and exmaple, between some columns there is no space.  
so it would look like this
john            smith    mn  
christopherbauer    tx
bill     swcharinagormn
0
 
LVL 84

Expert Comment

by:ozo
ID: 21829561
perl -pe '(2..0)&&s/(?<=^.{20})(((MN)|TX)|..)/$1\t${[qw(C T O)]}[!$2+!$3]/i' file
0
 

Author Comment

by:bodi77
ID: 21829624
that works, but have another question about it.  
so I want MN, IA, WI  to = C
and TX, AL, NV to = T
and anything else to = O

I wrote an entire script that has 60 lines to acomplish what you are doing in one line.  can you explain what it is you are wrting?  thanks
0
 

Author Comment

by:bodi77
ID: 21829639
ozo,
I guess I only need to know a few things about your one liner.  

perl -pe '(2..0)&&s/(?<=^.{20})(((MN)|TX)|..)/$1\t${[qw(C T O)]}[!$2+!$3]/i' file
the 20 is the state field starts, where do you tell it how long the field is?
where is it that you are defining what postion to put the C, T , O

thanks,
0
 

Author Comment

by:bodi77
ID: 21829710
to clerify,  read postion 20, two characters.  If the states are MN, IA or WI  write to postion 30, one character, a "C".   if states are TX, AL, NV write into postion 30 a T
and any other vaule in position 20, two characters,  write and "O" into postion 30.
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 
LVL 84

Expert Comment

by:ozo
ID: 21830160
perl -pe '(2..0)&&s/(?<=^.{20})(((MN|IA|WI)|TX|AL|NV)|..)(.{8}).?/$1$4${[qw(C T O)]}[!$2+!$3]/i' file
Assuming there is already something in positions 22-29
0
 
LVL 84

Expert Comment

by:ozo
ID: 21830179
Also assuming that you don't want to add the letter to the first line
fname  lname   st         code
0
 

Author Comment

by:bodi77
ID: 21830648
first row was only for an exmaple, so yes I do want to do the first row in production.
and I will be using this for multiple files with different formats, the only thing I will need to change is the postion of the lookup and the postion of the insert.  I think the postion of the lookup is the 20 in your one liner.  but do you define the postion of the insert (postion 30)  there will actully be data after the postion 31.  sorry if I have not been clear.  

01234567890123456789012345678901234567890
christophersbauer          MNxxxxxxxxT9559552087
0
 
LVL 9

Expert Comment

by:ghostdog74
ID: 21831749

awk 'BEGIN{OFS=FS=""}

($19=="M" && $20=="N") || ( $19=="W" && $20=="I" ) || ( $19=="I" && $20=="A" ){

    $30="C"

}

($19=="T" && $20=="X") || ( $19=="A" && $20=="L" ) || ( $19=="N" && $20=="V" ){

    $30="T"

}

1' file

Open in new window

0
 
LVL 84

Expert Comment

by:ozo
ID: 21832545
then omit the (2..0)&&
perl -pe 'substr($_,30,1)=${[qw(O T C)]}[grep $_,substr($_,20,2)=~/((MN|IA|WI)|TX|AL|NV)/]' file
0
 

Author Comment

by:bodi77
ID: 21833387
ozo the logic works great but how can I get it to change the values in the file instead of displaying on the screen.  

thanks,
C
0
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 21833424
perl -i.bak -pe 'substr($_,30,1)=${[qw(O T C)]}[grep $_,substr($_,20,2)=~/((MN|IA|WI)|TX|AL|NV)/]' file
0
 

Author Comment

by:bodi77
ID: 21833701
ozo, that's amazing!!! thanks alot.  
0
 

Author Comment

by:bodi77
ID: 21853393
Ozo, the script worked great for a test file, but when I tried on a production file that has 300,000 records.  I recieved a out of memory error

and also will the script work a a ebsidec file?
0
 
LVL 84

Expert Comment

by:ozo
ID: 21853526
records are separated by a "\n" in ascii, which is a byte with value 0x0A which is RPT in ebcdic
If that character is not found, it will try to read the entire file as as single record
you could redefine the record separator and change the characters in the qw and // to the appropriate ebcdic characters, or see some of the considerations in
perldoc perlebcdic
if the records are fixed length, perhaps it doesn't have a special character to separate records,
in which case you might set
$/ = \lengthofrecord
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Email validation in proper way is  very important validation required in any web pages. This code is self explainable except that Regular Expression which I used for pattern matching. I originally published as a thread on my website : http://www…
In the distant past (last year) I hacked together a little toy that would allow a couple of Manager types to query, preview, and extract data from a number of MongoDB instances, to their tool of choice: Excel (http://dilbert.com/strips/comic/2007-08…
Learn how to navigate the file tree with the shell. Use pwd to print the current working directory: Use ls to list a directory's contents: Use cd to change to a new directory: Use wildcards instead of typing out long directory names: Use ../ to move…
In a previous video, we went over how to export a DynamoDB table into Amazon S3.  In this video, we show how to load the export from S3 into a DynamoDB table.

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

21 Experts available now in Live!

Get 1:1 Help Now