Solved

Split a file in UNIX into separate files based on a delimiter character.

Posted on 2007-10-10
Last Modified: 2013-11-17
Hi Experts,
How would you split a file (input.txt) into separate files (output.txtn) based on the delimiter $ only?

input.txt -
{headertext}{moreheadertext}{id:somecontent;}{11{22}{33}}${headertext2}{moreheadertext}{id:somecontent2;}{11{22}{33}}${headertext3}{moreheadertext}{id:somecontent3;}{11{22}{33}}

output.txt1 -
{headertext}{moreheadertext}{id:somecontent;}{11{22}{33}}

output.txt2 -
{headertext2}{moreheadertext}{id:somecontent2;}{11{22}{33}}

output.txt3 -
{headertext3}{moreheadertext}{id:somecontent3;}{11{22}{33}}
Question by:PMGreensted
15 Comments
 
LVL 9

Expert Comment

by:ghostdog74
ID: 20048531
one way:

awk 'BEGIN{FS="$"}
 {
   for(i=1;i<=NF;i++){
        print $i > ("output-" i "-" NR ".txt")
   }
 }

' "file"
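For example, on a single-line input this writes one field per file (sample file name and contents assumed; the redirection target is parenthesised, which some awks require):

```shell
# assumed sample: three sections on one line, separated by $
printf '{a}${b}${c}\n' > file

# split each line on $ and write field i of line NR
# to output-i-NR.txt
awk 'BEGIN{FS="$"}
 {
   for(i=1;i<=NF;i++){
        print $i > ("output-" i "-" NR ".txt")
   }
 }' file
```

This produces output-1-1.txt, output-2-1.txt, and output-3-1.txt, each holding one section.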
 

Author Comment

by:PMGreensted
ID: 20049869
Thanks, that works.

Just one variation - if the source file has line breaks, an output file is created for each line as well as for each side of the $ delimiter.

To get round this I could remove all line breaks first like this:

tr -d '\n' < input.txt > input2.txt

awk 'BEGIN{FS="$"}
 {
   for(i=1;i<=NF;i++){
        print $i > ("output-" i "-" NR ".txt")
   }
 }

' "input2.txt"

But I would rather keep the line breaks if possible. I could replace the line breaks in the first place with:
   tr '\n' '~' < input.txt > input2.txt
then put them all back again for each output file:
   tr '~' '\n' < output-1-1.txt > output-1-1.txt2
but this seems a bit long-winded. Is there an easier way?
 
LVL 85

Expert Comment

by:ozo
ID: 20051234
perl -044pe 'chomp;open STDOUT,">output.txt$."'
LVL 85

Expert Comment

by:ozo
ID: 20051274
perl -044l12pe 'open STDOUT,">output.txt$."'
 

Author Comment

by:PMGreensted
ID: 20052618
Hi ozo,
Sorry I'm not too familiar with perl. What does perl -044l12pe 'open STDOUT,">output.txt$."' do?
 
LVL 85

Expert Comment

by:ozo
ID: 20052669
It splits the file into "lines" ending in '$' = '\044' and writes each line to a separate output.txtn, after removing the trailing $ and replacing it with '\n' = '\012'.
 
LVL 9

Expert Comment

by:ghostdog74
ID: 20053097
actually I am not sure what you mean.. please post a sample of that input.txt again, including the line breaks, so I can see what's going on. Then also state what you expect to see as output.
 

Author Comment

by:PMGreensted
ID: 20055055
OK, well let's say:

input.txt -
{headertext}{moreheadertext}{id:
somecontent
;}{11{22}{33}}${headertext2}{moreheadertext}{id:
somecontent2
;}{11{22}{33}}${headertext3}{moreheadertext}{id:
somecontent3
;}{11{22}{33}}

I'm trying to get the output like:

output.txt1 -
{headertext}{moreheadertext}{id:somecontent;}{11{22}{33}}

output.txt2 -
{headertext2}{moreheadertext}{id:somecontent2;}{11{22}{33}}

output.txt3 -
{headertext3}{moreheadertext}{id:somecontent3;}{11{22}{33}}

Your awk example worked for my single-line input.txt, but it breaks the input up into separate output files for each line if input.txt is a multi-line file.

Sorry, I know you got it right in the first place, but I then realised my input.txt may sometimes have multiple lines.
 
LVL 85

Expert Comment

by:ozo
ID: 20055116
So you want to strip newlines?

perl -044pe 's/[$\n]//g;open STDOUT,">output.txt$."'  input.txt
 
LVL 85

Expert Comment

by:ozo
ID: 20055121

perl -044pe 's/[\$\n]//g;open STDOUT,">output.txt$."'  input.txt
 
LVL 9

Accepted Solution

by:
ghostdog74 earned 1000 total points
ID: 20055293
how about this:

awk '
 {
     line=line$0
 }END{
     m=split(line,a,"$")
     for(i=1;i<=m;i++){
        print a[i] > ("output-" i "-" ++c ".txt")
     }
 }   ' "file"
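A quick way to try the accepted approach (sample input name and contents assumed; since every line is concatenated into one string, line breaks inside a section are dropped):

```shell
# assumed multi-line sample: two sections separated by $,
# each with embedded line breaks
printf '{h1}{id:\nX\n;}${h2}{id:\nY\n;}' > input.txt

# join all lines into one string, then split it on $ and
# write piece i to output-i-c.txt
awk '
 { line = line $0 }
 END {
     m = split(line, a, "$")
     for (i = 1; i <= m; i++) {
         fname = "output-" i "-" (++c) ".txt"
         print a[i] > fname
     }
 }' input.txt
```

Here that yields output-1-1.txt and output-2-2.txt, each containing one joined section.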
 
LVL 85

Assisted Solution

by:ozo
ozo earned 1000 total points
ID: 20055395
If you want an awk solution:
 awk 'BEGIN{RS="$"}{ gsub(/[$\n]/,""); print > ("output.txt" NR)}' input.txt
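For instance (sample input name and contents assumed; the gsub strips the newlines inside each section):

```shell
# assumed sample: two $-separated sections with embedded newlines
printf '{h1}{id:\nX\n;}${h2}{id:\nY\n;}' > input.txt

# treat $ as the record separator, strip any $ or newline left
# in the record, and write record NR to output.txtNR
awk 'BEGIN{RS="$"}{ gsub(/[$\n]/,""); print > ("output.txt" NR)}' input.txt
```

This writes the first joined section to output.txt1 and the second to output.txt2.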
 

Author Comment

by:PMGreensted
ID: 20056326
Thanks ghostdog74 and ozo,

I tested the following and it works fine:

for filename in *.txt
 do
  awk -v f="$filename" 'BEGIN{RS="$"}{ gsub(/[$]/,""); print > (f "-" NR)}' "$filename"
  rm "$filename"
 done
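A quick smoke test of a loop like this (sample file name assumed; the shell variable is passed into awk with -v, which avoids quote-splicing $filename into the awk program):

```shell
# assumed sample: two $-separated sections
printf '{a}${b}' > sample.txt

# split every .txt file into numbered pieces, then remove the original
for filename in *.txt
 do
  awk -v f="$filename" 'BEGIN{RS="$"}{ gsub(/[$]/,""); print > (f "-" NR)}' "$filename"
  rm "$filename"
 done
```

After the run, sample.txt is gone and its sections sit in sample.txt-1 and sample.txt-2.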
 
LVL 85

Expert Comment

by:ozo
ID: 20056365
Are you saying the earlier suggestions did not work?
 

Author Comment

by:PMGreensted
ID: 20057283
No I'm not. All the solutions worked to a certain degree. With a few changes here and there they can give the same result. I was just showing you guys what I ended up with.
