
Split fixed length file into multiple files based on value in a position

Hi
I want to split a fixed-width file into multiple files based on the value of a field in positions 3-8.

file
kip56802
tim45607
skz56802
sam45607

This has to be split into 2 files.
File 1 would be
kip56802
skz56802

file 2 would be
tim45607
sam45607

thanks

woolmilkporc

Do you mean position 4-8, counting from 1?

Are there only two distinct values in that position? Are these values always "56802" or "45607"?
I don't believe so, but if the above is true then try this:

awk '{if(substr($1,4,5)~"56802") print >"file1"; if(substr($1,4,5)~"45607") print >"file2"}' file

We could also create output files named according to that "4-8" value, here e.g. "56802.out" and "45607.out":

awk '{print > (substr($1,4,5) ".out")}' file

uco

ASKER

Yes, that is correct, I am counting from 1. There could be more than two distinct values in that position.

woolmilkporc

So could my second solution above be an option?

awk '{print > (substr($1,4,5) ".out")}' file

Using the example you posted the above will create a file named "56802.out" containing

kip56802
skz56802

and a file "45607.out" containing

tim45607
sam45607

The more distinct strings there are in that position, the more files will be created.

Basically we just take the string in positions 4-8 of each record plus a suffix ".out" to build output file names and write the appropriate records to the respective file.
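
Since every output file stays open until awk exits, an input with many distinct keys may hit the open-file limit in some awk implementations. A minimal sketch of one way around that, assuming the key sits in columns 4-8 and that appending is acceptable (so any .out files left over from an earlier run should be removed first). The scratch directory and the name input.txt are hypothetical, used only to make the example self-contained:

```shell
# Scratch directory and sample data (hypothetical names, for illustration only)
workdir=$(mktemp -d)
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > "$workdir/input.txt"

# Append each record to <key>.out and close the file after every write,
# so only one output file is open at any time.
awk -v dir="$workdir" '{
    out = dir "/" substr($0, 4, 5) ".out"   # key = columns 4-8, counting from 1
    print >> out                            # ">>" appends, so reopening is safe
    close(out)
}' "$workdir/input.txt"
```

Closing after every record is slower than keeping the files open, so it is probably only worth doing when the number of distinct keys is large.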

uco

ASKER

I am having a small issue here: sometimes the field is 5 characters long but contains spaces, like
kip568
skz568
How do I remove the spaces from the file name?

thanks

woolmilkporc

This replaces the spaces with underscores ("_"):

awk '{A=substr($0,4,5); gsub(" ","_",A); print >A".out"}' file

and this eliminates them:

awk '{A=substr($0,4,5); gsub(" ","",A); print >A".out"}' file
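
If the key field consists only of spaces, removing them leaves an empty key, and the record would go to a file named just ".out". A minimal sketch that guards against this, assuming a fallback name is acceptable. The name BLANK, the scratch directory and input.txt are hypothetical:

```shell
# Scratch directory and sample data (hypothetical names, for illustration only)
workdir=$(mktemp -d)
printf '%s\n' 'kip568  ' 'skz568  ' 'abc     ' > "$workdir/input.txt"

awk -v dir="$workdir" '{
    key = substr($0, 4, 5)            # columns 4-8, counting from 1
    gsub(/ /, "", key)                # drop the spaces from the key
    if (key == "") key = "BLANK"      # hypothetical fallback for an all-space key
    print > (dir "/" key ".out")      # parentheses keep the concatenation unambiguous
}' "$workdir/input.txt"
```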

uco

ASKER

Thanks a lot, it did work. Can we print to a different directory? I am trying to print to a different directory, but awk does not allow it. It says
fatal: division by zero attempted

awk '{A=substr($0,4,5); gsub(" ","",A); print > dirc/A".out"}' file

Also, I want to write the names of the created files to a file.

thanks

ASKER CERTIFIED SOLUTION

woolmilkporc
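
The text of the accepted answer is not included in this copy of the thread, so what follows is only a sketch of one way to address both requests, not the original answer. The error most likely arises because awk reads the unquoted dirc/A as the uninitialized variable dirc divided by A rather than as a path, and a non-numeric A counts as zero. Building the path as a quoted, parenthesized string avoids that. The directory name outdir, the list file created_files.txt and the variable names are hypothetical:

```shell
# Scratch directory and sample data (hypothetical names, for illustration only)
workdir=$(mktemp -d)
mkdir "$workdir/outdir"            # awk does not create the directory itself
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > "$workdir/input.txt"

awk -v dir="$workdir/outdir" -v list="$workdir/created_files.txt" '{
    key = substr($0, 4, 5)         # columns 4-8, counting from 1
    gsub(/ /, "", key)             # drop spaces from the key
    out = dir "/" key ".out"       # the path is a string, so "/" is not division
    if (!(out in seen)) {          # first record for this output file
        seen[out] = 1
        print out > list           # record each file name once
    }
    print > out
}' "$workdir/input.txt"
```

Passing the directory with -v keeps the awk program itself free of hard-coded paths.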


uco

ASKER

Simply excellent