Split a file in UNIX into separate files based on a delimiter character.

Hi Experts,
How would you split a file (input.txt) into separate files (output.txtn) based on the delimiter $ only?

input.txt -
{headertext}{moreheadertext}{id:somecontent;}{11{22}{33}}${headertext2}{moreheadertext}{id:somecontent2;}{11{22}{33}}${headertext3}{moreheadertext}{id:somecontent3;}{11{22}{33}}

output.txt1 -
{headertext}{moreheadertext}{id:somecontent;}{11{22}{33}}

output.txt2 -
{headertext2}{moreheadertext}{id:somecontent2;}{11{22}{33}}

output.txt3 -
{headertext3}{moreheadertext}{id:somecontent3;}{11{22}{33}}
PMGreensted Asked:
 
ghostdog74 Commented:
how about this:

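awk '
 {
     line = line $0            # join all input lines into one string (drops newlines)
 }
 END {
     m = split(line, a, "$")   # break the joined text at each $
     for (i = 1; i <= m; i++) {
         # parenthesize the file name so the whole concatenation is the target
         print a[i] > ("output-" i "-" (++c) ".txt")
     }
 }' "file"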
 
ghostdog74 Commented:
one way

awk 'BEGIN{FS="$"}
 {
   for(i=1;i<=NF;i++){
        # one file per field (i) per input line (NR)
        print $i > ("output-" i "-" NR ".txt")
   }
 }
' "file"
 
PMGreensted (Author) Commented:
Thanks, that works.

Just one variation - If the source file has line breaks there is an output file created for each line as well as each side of the $ delimiter.

To get round this I could remove all line breaks first like this:

tr -d \\n < input.txt > input2.txt

awk 'BEGIN{FS="$"}
 {
   for(i=1;i<=NF;i++){
        print $i > ("output-" i "-" NR ".txt")
   }
 }
' "input2.txt"

But I would rather have the line breaks if possible. I could replace the line breaks in the first place with:
   tr \\n '~~~~~' < input.txt > input2.txt
then put them all back again for each output file:
   tr '~~~~~' \\n < output-1-1.txt > output-1-1.txt2
but this seems a bit long-winded. Is there an easier way?
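
The per-line splitting described in this post can be reproduced with a small made-up input. This is only a sketch: the scratch directory and the sample text are assumptions for illustration, not data from the thread.

```shell
# Work in a scratch directory so no existing files are overwritten (assumption of this sketch).
cd "$(mktemp -d)" || exit 1

# Three input lines, with a single $ delimiter on the second line.
printf 'a\nb$c\nd\n' > input.txt

# FS="$" splits fields within each line; NR is the line number,
# so every line produces its own set of output files.
awk 'BEGIN{FS="$"}{ for(i=1;i<=NF;i++) print $i > ("output-" i "-" NR ".txt") }' input.txt

ls
```

Because FS only splits within a line and NR counts lines, this three-line input with one $ yields four output files: one per field per line.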
 
ozo Commented:
perl -044pe 'chomp;open STDOUT,">output.txt$."'
 
ozo Commented:
perl -044l12pe 'open STDOUT,">output.txt$."'
 
PMGreensted (Author) Commented:
Hi ozo,
Sorry, I'm not too familiar with Perl. What does perl -044l12pe 'open STDOUT,">output.txt$."' do?
 
ozo Commented:
It splits the input into "lines" ending in '$' (octal 044, set by -044) and writes each one to a separate output.txtn file, removing the trailing $ and replacing it with '\n' (octal 012, set by -l12).
 
ghostdog74 Commented:
Actually, I am not sure what you mean. Please post a sample of that input.txt again, including the line breaks, so I can see what's going on. Also state what output you expect to see.
 
PMGreensted (Author) Commented:
OK, well let's say:

input.txt -
{headertext}{moreheadertext}{id:
somecontent
;}{11{22}{33}}${headertext2}{moreheadertext}{id:
somecontent2
;}{11{22}{33}}${headertext3}{moreheadertext}{id:
somecontent3
;}{11{22}{33}}

I'm trying to get the output like:

output.txt1 -
{headertext}{moreheadertext}{id:somecontent;}{11{22}{33}}

output.txt2 -
{headertext2}{moreheadertext}{id:somecontent2;}{11{22}{33}}

output.txt3 -
{headertext3}{moreheadertext}{id:somecontent3;}{11{22}{33}}

Your awk example worked for my single-line input.txt, but it breaks input.txt up into separate output files for each line if input.txt is a multiline file.

Sorry, I know you got it right in the first place, but then I realised my input.txt may sometimes have multiple lines.
 
ozo Commented:
So you want to strip newlines?

perl -044pe 's/[$\n]//g;open STDOUT,">output.txt$."'  input.txt
 
ozo Commented:

perl -044pe 's/[\$\n]//g;open STDOUT,">output.txt$."'  input.txt
 
ozo Commented:
If you want an awk solution:
 awk 'BEGIN{RS="$"}{ gsub(/[$\n]/,""); print > ("output.txt" NR)}' input.txt
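
To see this on a multi-line input, here is a minimal sketch. The sample text only imitates the shape of the input posted above, and the scratch directory is an assumption of the example.

```shell
# Work in a scratch directory so no existing files are overwritten (assumption of this sketch).
cd "$(mktemp -d)" || exit 1

# Two $-separated records, each spread over several lines.
printf '{h1}{id:\nfoo\n;}${h2}{id:\nbar\n;}\n' > input.txt

# RS="$" : read $-terminated records instead of lines
# gsub   : delete the newlines inside each record
# NR     : record number, appended to the output file name
awk 'BEGIN{RS="$"}{ gsub(/[$\n]/,""); print > ("output.txt" NR)}' input.txt

cat output.txt1 output.txt2
```

Since the record separator is consumed while reading, NR counts records rather than lines, so the number of output files follows the number of $ delimiters and not the number of line breaks.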
 
PMGreensted (Author) Commented:
Thanks ghostdog74 and ozo,

I tested the following and it works fine:

for filename in *.txt
 do
  # pass the name in with -v rather than splicing it into the awk program text
  awk -v f="$filename" 'BEGIN{RS="$"}{ gsub(/[$]/,""); print > (f "-" NR)}' "$filename"
  rm -- "$filename"
 done
 
ozo Commented:
Are you saying the earlier suggestions did not work?
 
PMGreensted (Author) Commented:
No, I'm not. All the solutions worked to a certain degree. With a few changes here and there they can give the same result. I was just showing you guys what I ended up with.