Solved

Split fixed length file into multiple files based on value in a position

Posted on 2014-03-10
Last Modified: 2014-03-13
Hi
I want to split a fixed-width file into multiple files based on the value of a field in positions 3-8.

file
kip56802
tim45607
skz56802
sam45607

This has to be split into 2 files.
file 1 would be
kip56802
skz56802

file 2 would be
tim45607
sam45607

thanks
Question by:uco

Expert Comment

by:woolmilkporc
ID: 39919316
Do you mean position 4-8, counting from 1?

Are there only two distinct values in that position? Are these values always "56802" or "45607"?
I don't believe so, but if the above is true then try this:

awk 'substr($0,4,5)=="56802" {print >"file1"} substr($0,4,5)=="45607" {print >"file2"}' file

We could also create output files named according to that "4-8" value, here e.g. "56802.out" and "45607.out":

awk '{print > (substr($0,4,5) ".out")}' file
 

Author Comment

by:uco
ID: 39919567
Yes, that is correct, I am counting from 1. There could be more than two distinct values in that position.

Expert Comment

by:woolmilkporc
ID: 39919914
So could my second solution above be an option?

awk '{print > (substr($0,4,5) ".out")}' file

Using the example you posted the above will create a file named "56802.out" containing

kip56802
skz56802

and a file "45607.out" containing

tim45607
sam45607

The more distinct strings in that position the more files will be created.

Basically we just take the string in positions 4-8 of each record plus a suffix ".out" to build output file names and write the appropriate records to the respective file.
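
If there are a great many distinct values, some awk implementations may run out of open file handles. A possible variant, offered only as a sketch, appends to each output file and closes it again after every record. It assumes that no ".out" files are left over from an earlier run (because ">>" appends), and the scratch directory and the input name "sample.txt" are made up for illustration:

```shell
# Work in a scratch directory so nothing existing gets overwritten (assumption)
cd "$(mktemp -d)"

# Hypothetical sample input, matching the example in the question
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > sample.txt

awk '{
    out = substr($0, 4, 5) ".out"   # characters 4-8 plus the suffix
    print >> out                    # append, so reopening does not truncate
    close(out)                      # release the file handle right away
}' sample.txt

ls
```

Closing after every record is slower on large inputs, so it is probably only worth it when the number of distinct values is large.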

Author Comment

by:uco
ID: 39922469
I am having a small issue here: sometimes the field is 5 characters long but contains spaces, like
kip568
skz568
How do I remove the spaces from the file name?

thanks

Expert Comment

by:woolmilkporc
ID: 39922998
This replaces the spaces with underscores ("_"):

awk '{A=substr($0,4,5); gsub(" ","_",A); print > (A ".out")}' file

and this eliminates them:

awk '{A=substr($0,4,5); gsub(" ","",A); print > (A ".out")}' file
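
To check what the two variants do to the name before writing any files, something like the following sketch may help. It only prints the derived names. The sample records are hypothetical and assume that the short value "568" is padded to five characters with two trailing spaces:

```shell
# Hypothetical records: "568" padded with two trailing spaces, plus one full-width value
names=$(printf '%s\n' 'kip568  ' 'skz568  ' 'tim45607' |
awk '{
    A = substr($0, 4, 5)        # raw field, may contain spaces
    U = A; gsub(" ", "_", U)    # copy with spaces replaced by underscores
    E = A; gsub(" ", "", E)     # copy with spaces removed
    print "[" A "] -> " U ".out or " E ".out"
}')

printf '%s\n' "$names"
```

For the padded records this should show "568__.out" for the underscore variant and "568.out" for the variant that removes the spaces.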

Author Comment

by:uco
ID: 39925461
Thanks a lot, it did work. Can we print to a different directory? I am trying to do that, but it is not allowed. It says
fatal: division by zero attempted

awk '{A=substr($0,4,5); gsub(" ","",A); print > dirc/A".out"}' file

Also, I want to write the names of the created files to a file.

thanks

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 39925891
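awk -v D="dirc/" -v S=".out" -v LS=".splitlist.txt" '
    # Build the output name from positions 4-8 (spaces removed), count the record, write it
    {A=substr($0,4,5); gsub(" ","",A); F[D A S]+=1; print > (D A S)}
    # Write one line per output file, with its record count, to the list file
    END {N=FILENAME; sub(/^.*\//, "", N); for(f in F) print f, "\tRecords:", F[f] > (D N LS)}
' file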

I used variables for the output file suffix and the directory path because we need them more than once.

The list containing the filenames and a record count per file is "dirc/file.splitlist.txt".
The suffix ".splitlist.txt" has its own variable for maintainability.
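
The earlier attempt most likely failed because awk reads an unquoted dirc/A as the variable dirc divided by the variable A, not as a path, which is why the directory is passed in as a quoted string here. Note also that awk does not create the target directory, so it has to exist before the run. A sketch of a complete run on the sample data from the question, with a scratch directory added as an assumption:

```shell
# Work in a scratch directory (assumption, to keep the example self-contained)
cd "$(mktemp -d)"

# Sample input from the question, stored under the name "file"
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > file

# awk will not create the directory itself
mkdir -p dirc

awk -v D="dirc/" -v S=".out" -v LS=".splitlist.txt" '
    {A=substr($0,4,5); gsub(" ","",A); F[D A S]+=1; print > (D A S)}
    END {N=FILENAME; sub(/^.*\//, "", N); for(f in F) print f, "\tRecords:", F[f] > (D N LS)}
' file

# The order of "for (f in F)" is not defined, so sort the list for display
sort dirc/file.splitlist.txt
```

The list should contain one line for each of "dirc/45607.out" and "dirc/56802.out", each with a record count of 2.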

Author Closing Comment

by:uco
ID: 39926631
Simply excellent