Solved

Split fixed length file into multiple files based on value in a position

Posted on 2014-03-10
Last Modified: 2014-03-13
Hi,
I want to split a fixed-width file into multiple files based on a field that occupies positions 3-8.

file
kip56802
tim45607
skz56802
sam45607

This has to be split into 2 files.
file 1 would be
kip56802
skz56802

file 2 would be
tim45607
sam45607

thanks
Question by:uco
LVL 68

Expert Comment

by:woolmilkporc
ID: 39919316
Do you mean position 4-8, counting from 1?

Are there only two distinct values in that position? Are these values always "56802" or "45607"?
I don't believe so, but if the above is true then try this:

awk '{k=substr($0,4,5); if(k=="56802") print >"file1"; if(k=="45607") print >"file2"}' file

We could also create output files named according to that "4-8" value, here e.g. "56802.out" and "45607.out":

awk '{print >(substr($0,4,5) ".out")}' file

Author Comment

by:uco
ID: 39919567
Yes, that is correct, I am counting from 1. There could be more than two distinct values in that position.
LVL 68

Expert Comment

by:woolmilkporc
ID: 39919914
So could my second solution above be an option?

awk '{print >(substr($0,4,5) ".out")}' file

Using the example you posted the above will create a file named "56802.out" containing

kip56802
skz56802

and a file "45607.out" containing

tim45607
sam45607

The more distinct strings in that position the more files will be created.

Basically we just take the string in positions 4-8 of each record plus a suffix ".out" to build output file names and write the appropriate records to the respective file.
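As a minimal sketch of the approach described above, the following runs the one-liner against the sample data from the question inside a scratch directory. The scratch directory and the variable name workdir are assumptions for illustration only. The redirection target is wrapped in parentheses because some awk implementations do not treat an unparenthesized concatenation after ">" as a single file name.

```shell
# Work in a scratch directory so the generated files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input taken from the question.
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > file

# Characters 4-8 of each record, plus ".out", form the output file name.
# The parentheses make the whole concatenation the redirection target.
awk '{ print > (substr($0, 4, 5) ".out") }' file

# Show what was written to each output file.
cat 56802.out
cat 45607.out
```

With a very large number of distinct values, some awk implementations may hit a limit on simultaneously open output files; in that case switching to ">>" and calling close() on the target after each print is one possible workaround.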

Author Comment

by:uco
ID: 39922469
I am having a small issue here: sometimes the field is 5 characters long but contains spaces, like
kip568
skz568
How do I remove the spaces from the file name?

thanks
LVL 68

Expert Comment

by:woolmilkporc
ID: 39922998
This replaces the spaces with underscores ("_"):

awk '{A=substr($0,4,5); gsub(" ","_",A); print >A".out"}' file

and this eliminates them:

awk '{A=substr($0,4,5); gsub(" ","",A); print >A".out"}' file
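A small sketch of the space-stripping variant on made-up records, assuming the key is padded with trailing spaces. The sample records, the scratch directory and the fallback name "EMPTY" are hypothetical and not from the thread. The fallback is there because a key consisting only of spaces would otherwise produce a file named just ".out".

```shell
# Scratch directory for the demonstration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical records: two keys padded with trailing spaces,
# one full-length key, and one record whose key is all spaces.
printf '%s\n' 'kip568  ' 'skz568  ' 'tim45607' 'abc     ' > file

awk '{
    key = substr($0, 4, 5)          # fixed-width field, positions 4-8
    gsub(/ /, "", key)              # drop every space from the key
    if (key == "") key = "EMPTY"    # hypothetical fallback for blank keys
    print > (key ".out")
}' file

# List the files that were created.
ls
```

Only the file name is cleaned up; each record is written to its output file unchanged, including its padding.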

Author Comment

by:uco
ID: 39925461
Thanks a lot, it did work. Can we print to a different directory? I am trying to print to a different directory but it is not allowed; it says
fatal: division by zero attempted

awk '{A=substr($0,4,5); gsub(" ","",A); print > dirc/A".out"}' file

Also, I want to write the names of the created files to a file.

thanks
LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 39925891
awk -v D="dirc/" -v S=".out" -v LS=".splitlist.txt" '
    {A=substr($0,4,5); gsub(" ","",A); F[D A S]+=1; print >D A S}
    END {N=FILENAME; sub(/^.*\//, "", N); for(f in F) print f, "\tRecords:", F[f] >D N LS}
' file

I used variables for the output file suffix and the directory path because we need them more than once.

The list containing the filenames and a record count per file is "dirc/file.splitlist.txt".
The suffix ".splitlist.txt" has its own variable for maintainability.
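For reference, here is the same idea written out with longer variable names and comments, as a sketch run against the sample data from the question. The names dir, suffix, listsuffix, key, out and count, and the scratch directory, are chosen for illustration. It also shows why the earlier attempt failed: in `print > dirc/A".out"` the unquoted dirc is an (empty) awk variable and "/" is the division operator, so awk tries to compute dirc divided by A, which presumably fails as soon as A has a numeric value of zero. Passing the directory as a string avoids that. Note that awk does not create the directory; it has to exist beforehand.

```shell
# Scratch directory with the target subdirectory created up front.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir dirc

# Sample input taken from the question.
printf '%s\n' kip56802 tim45607 skz56802 sam45607 > file

awk -v dir="dirc/" -v suffix=".out" -v listsuffix=".splitlist.txt" '
    {
        key = substr($0, 4, 5)      # fixed-width field, positions 4-8
        gsub(/ /, "", key)          # remove spaces from the key
        out = dir key suffix        # full output path as one string
        count[out]++                # records written per output file
        print > out
    }
    END {
        name = FILENAME
        sub(/^.*\//, "", name)      # strip any leading path from the input name
        list = dir name listsuffix
        for (out in count)
            print out, "\tRecords:", count[out] > list
    }
' file

# The order of "for (out in count)" is unspecified, so sort for display.
sort dirc/file.splitlist.txt
```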

Author Closing Comment

by:uco
ID: 39926631
Simply excellent