uco
Split fixed length file into multiple files based on value in a position
Hi
I want to split a fixed-width file into multiple files based on a field in positions 3-8.
file
kip56802
tim45607
skz56802
sam45607
This has to be split into 2 files:
file 1 would be
kip56802
skz56802
file 2 would be
tim45607
sam45607
thanks
ASKER
Yes, that is correct: I am counting from 1. There could be more than two values in that position, and more than 2 distinct values.
So could my second solution above be an option?
awk '{print >substr($1,4,5)".out"}' file
Using the example you posted, the above will create a file named "56802.out" containing
kip56802
skz56802
and a file "45607.out" containing
tim45607
sam45607
The more distinct strings there are in that position, the more files will be created.
Basically, we take the string in positions 4-8 of each record, append the suffix ".out" to build the output file name, and write each record to its respective file.
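With many distinct values, some awk implementations may hit a limit on the number of files open at once. A minimal sketch of one way around that: append with `>>` and close each output file after writing. The temporary working directory and the recreated sample input are assumptions added only to make the snippet self-contained.

```shell
# Work in a scratch directory so nothing existing is overwritten (assumption)
work=$(mktemp -d) && cd "$work"

# Recreate the sample input from the question
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > file

awk '{
    out = substr($0, 4, 5) ".out"   # key in positions 4-8 plus the suffix
    print >> out                    # append, so reopening does not truncate
    close(out)                      # release the file handle right away
}' file
```

Closing after every record is slow on large inputs; it trades speed for never holding more than one output file open. Because `>>` appends, leftover output files from an earlier run should be removed first.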
ASKER
I am having a small issue here: sometimes the field is 5 characters long but contains spaces, like
kip568
skz568
How do I remove the spaces from the file name?
thanks
This replaces the spaces with underscores ("_"):
awk '{A=substr($0,4,5); gsub(" ","_",A); print >A".out"}' file
and this eliminates them:
awk '{A=substr($0,4,5); gsub(" ","",A); print >A".out"}' file
ASKER
Thanks a lot, it did work. Can we print to a different directory? I am trying to print to a different directory but it does not work; it says
fatal: division by zero attempted
awk '{A=substr($0,4,5); gsub(" ","",A); print > dirc/A".out"}' file
Also, I want to write the names of the created files to a file.
thanks
ASKER CERTIFIED SOLUTION
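The certified solution is not reproduced in this thread. Independently of it, here is a minimal sketch of one possible way to handle both requests. In the failing command, awk reads the unquoted `dirc/A` as a division of two variables rather than as part of a path; building the name by string concatenation avoids that. The directory name `outdir`, the list file `created_files.txt`, and the sample record with trailing spaces are assumptions for illustration.

```shell
# Work in a scratch directory (assumption, keeps the example self-contained)
work=$(mktemp -d) && cd "$work"

# Sample input; the third record has trailing spaces in the key field
printf '%s\n' 'kip56802' 'tim45607' 'skz568  ' 'sam45607' > file

mkdir -p outdir    # awk does not create the target directory itself

awk -v dir="outdir" -v list="created_files.txt" '{
    key = substr($0, 4, 5)        # positions 4-8
    gsub(/ /, "", key)            # drop spaces from the key
    out = dir "/" key ".out"      # path built as a string, "/" is quoted
    if (!(out in seen)) {         # first time this output file is used
        seen[out] = 1
        print out > list          # record the file name once
    }
    print > out                   # write the record itself
}' file
```

Passing the directory with `-v` keeps the awk program free of hard-coded paths; `created_files.txt` then lists each output file once, in order of first appearance.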
ASKER
Simply excellent
Are there only two distinct values in that position? Are these values always "56802" or "45607"?
I don't believe so, but if the above is true then try this:
awk '{if(substr($1,4,5)~"56802
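The command above is cut off in the source and is left as found. Purely as an illustration of the two-value case it introduces, and not a reconstruction of the original command, a sketch might look like this. The output names `file1` and `file2` follow the question's wording; the `else` branch and the scratch directory are assumptions.

```shell
# Work in a scratch directory (assumption, keeps the example self-contained)
work=$(mktemp -d) && cd "$work"

# Recreate the sample input from the question
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > file

awk '{
    if (substr($0, 4, 5) == "56802") {
        print > "file1"           # records whose key is 56802
    } else {
        print > "file2"           # everything else, assumed to be 45607
    }
}' file
```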
We could also create output files named according to that "4-8" value, here e.g. "56802.out" and "45607.out":
awk '{print >substr($1,4,5)".out"}' file