Solved

Split fixed length file into multiple files based on value in a position

Posted on 2014-03-10
Last Modified: 2014-03-13
Hi
I want to split a fixed-width file into multiple files based on the value of a field that occupies positions 3-8.

file
kip56802
tim45607
skz56802
sam45607

This has to be split into 2 files.
file 1 would be
kip56802
skz56802

file 2 would be
tim45607
sam45607

thanks
Question by:uco
LVL 68

Expert Comment

by:woolmilkporc
ID: 39919316
Do you mean position 4-8, counting from 1?

Are there only two distinct values in that position? Are these values always "56802" or "45607"?
I don't believe so, but if the above is true then try this:

awk '{k=substr($0,4,5); if(k=="56802") print >"file1"; else if(k=="45607") print >"file2"}' file

We could also create output files named according to that "4-8" value, here e.g. "56802.out" and "45607.out":

awk '{print > (substr($0,4,5) ".out")}' file

Author Comment

by:uco
ID: 39919567
Yes, that is correct, I am counting from 1. There could be more than two distinct values in that position.
LVL 68

Expert Comment

by:woolmilkporc
ID: 39919914
So could my second solution above be an option?

awk '{print > (substr($0,4,5) ".out")}' file

Using the example you posted the above will create a file named "56802.out" containing

kip56802
skz56802

and a file "45607.out" containing

tim45607
sam45607

The more distinct strings in that position the more files will be created.

Basically we just take the string in positions 4-8 of each record plus a suffix ".out" to build output file names and write the appropriate records to the respective file.
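As a self-contained sketch of this approach, the script below recreates the sample from the question in a scratch directory and splits it by positions 4-8. The scratch directory from mktemp, the file name input.txt and the variable names are assumptions made only for this illustration. The redirection target is wrapped in parentheses because an unparenthesized concatenation after ">" is ambiguous in some awk implementations.

```shell
#!/bin/sh
# Sketch: split a fixed-width file by the key in positions 4-8.
# workdir and input.txt are hypothetical names used only for this demo.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the sample data from the question.
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > input.txt

# substr($0, 4, 5) takes 5 characters starting at position 4 (1-based).
# The parentheses make the concatenated file name one unambiguous expression.
awk '{ key = substr($0, 4, 5); print > (key ".out") }' input.txt

# List the files that were created.
ls *.out
```

With a very large number of distinct keys, some awk implementations may run out of open files. In that case a possible workaround is to append with ">>" and call close() on the file name after each write.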

Author Comment

by:uco
ID: 39922469
I am having a small issue here: sometimes the field is 5 characters long but part of it is spaces, for example
kip568
skz568
How do I remove the spaces from the file name?

thanks
LVL 68

Expert Comment

by:woolmilkporc
ID: 39922998
This replaces the spaces with underscores ("_"):

awk '{A=substr($0,4,5); gsub(" ","_",A); print > (A ".out")}' file

and this eliminates them:

awk '{A=substr($0,4,5); gsub(" ","",A); print > (A ".out")}' file
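To compare the two variants side by side, here is a small sketch. It assumes the short keys are padded with trailing spaces up to position 8, which the sample in the question does not show explicitly. The scratch directory and the file name input.txt are likewise hypothetical.

```shell
#!/bin/sh
# Sketch: handle keys that contain spaces when building output file names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Assumed input: the 5-character key "568  " is padded with two trailing spaces.
printf '%s\n' 'kip568  ' 'skz568  ' 'tim45607' > input.txt

# Variant 1: replace each space in the key with an underscore.
awk '{ key = substr($0, 4, 5); gsub(" ", "_", key); print > (key ".out") }' input.txt

# Variant 2: drop the spaces from the key entirely.
awk '{ key = substr($0, 4, 5); gsub(" ", "", key); print > (key ".out") }' input.txt

# Show the files produced by both variants.
ls
```

Only the file name is changed here. The records themselves are written unmodified, including any padding spaces.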

Author Comment

by:uco
ID: 39925461
Thanks a lot, it worked. Can we print to a different directory? I am trying to write to a different directory, but awk does not allow it and reports:
fatal: division by zero attempted

awk '{A=substr($0,4,5); gsub(" ","",A); print > dirc/A".out"}' file

Also, I want to write the names of the created files to a file.

thanks
LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 39925891
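awk -v D="dirc/" -v S=".out" -v LS=".splitlist.txt" '
    # Key = positions 4-8 with spaces removed; count the records per output file
    {A=substr($0,4,5); gsub(" ","",A); F[D A S]+=1; print > (D A S)}
    # Write each output file name and its record count to the list file
    END {N=FILENAME; sub(/^.*\//, "", N); for(f in F) print f, "\tRecords:", F[f] > (D N LS)}
' file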

I used variables for the output file suffix and the directory path because we need them more than once.

The list containing the filenames and a record count per file is "dirc/file.splitlist.txt".
The suffix ".splitlist.txt" has its own variable for maintainability.
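The "division by zero" error in the earlier attempt comes from how awk reads an unquoted dirc/A: it is taken as the variable dirc divided by the variable A, not as a path. Whenever A has a numeric value of zero (for example an empty key, or one that does not start with a digit), awk aborts. The following is a minimal sketch of the string-based form, using a hypothetical directory outdir, a hypothetical list file splitlist.txt and the sample data from the question.

```shell
#!/bin/sh
# Sketch: write split files into another directory and list what was created.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir outdir    # awk does not create missing directories

printf '%s\n' kip56802 tim45607 skz56802 sam45607 > input.txt

# The directory is passed in as a string variable, so "/" is never parsed
# as the division operator.
awk -v dir="outdir/" '
    { key = substr($0, 4, 5); gsub(" ", "", key); count[dir key ".out"]++
      print > (dir key ".out") }
    END { for (f in count) print f, count[f] > (dir "splitlist.txt") }
' input.txt

# Show each created file with its record count (order is not guaranteed).
cat outdir/splitlist.txt
```

Passing the directory with -v keeps the path out of the awk program text, so it can be changed without editing the script.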

Author Closing Comment

by:uco
ID: 39926631
Simply excellent