Split retaining delimiter?

Hi Experts,

How can I split input records using a couple of delimiters like "<" and ">}", retaining the delimiters in the output, and doing all this as a one-liner?

So, assuming the input is:

The 1st record should end up in elements of an array like this:
    $F[0]: '<abc>}'
    $F[1]: '<def>}'
    $F[2]: 'g'
    $F[3]: '<h>}'
And the 2nd:
    $F[0]: '<1>}'
    $F[1]: '<2>}'

I would have expected an answer something like this:
    perl -F/\<|\>}/ -ane 'for (@F) {print $_."\n"}' infile
But that gives me errors in bash and ksh, like this:
    -bash: >}/: No such file or directory

Any ideas?

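To make your perl one-liner work with the shell, you have to quote all of the shell metacharacters. In your attempt the backslashes protect "<" and ">", but the "|" is left bare, so the shell treats it as a pipe and tries to run ">}/" as a command, which is where the "No such file or directory" error comes from. Single-quoting the whole pattern avoids that: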

perl -F'/<|>}/' -ane 'for (@F) {print $_."\n"}' infile

but I doubt that will give you exactly what you want.
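
To see why, here is a minimal sketch of what that pattern does to one record. The record itself is an assumption, reconstructed from the expected output in the question.

```perl
use strict;
use warnings;

# Hypothetical record, reconstructed from the expected output in the question.
my $record = '<abc>}<def>}g<h>}';

# Same pattern as -F'/<|>}/': every "<" and ">}" is consumed as a separator.
my @fields = split /<|>}/, $record;

# Show each field in quotes so the empty ones are visible.
print "'$_'\n" for @fields;
```

The delimiters are gone from the result, and empty fields appear at the start of the record and wherever a ">}" is immediately followed by a "<".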


What you might try is something like this:

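#! /usr/bin/env perl

use Data::Dumper;

# Put the input records between the here-document markers, one per line.
@Input = split /\n/, <<EOInput;
EOInput

foreach( @Input ) {
   # The capturing group makes split return the delimiters too;
   # grep drops the empty strings left between them.
   @F = grep { /\S/ } split /(<.*?>})/;
   print Dumper( \@F);
}
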

Using split with a pattern containing grouping parentheses causes it to return the delimiters as well as the piece in between the delimiters. I used 'grep' to throw away the empty members of the returned list.
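
A minimal sketch of those two steps, kept separate so the intermediate list is visible. The record is an assumption, reconstructed from the expected output in the question.

```perl
use strict;
use warnings;

# Hypothetical record, reconstructed from the expected output in the question.
my $record = '<abc>}<def>}g<h>}';

# The parentheses capture each "<...>}" delimiter, so split returns it
# alongside the text found between delimiters.
my @raw = split /(<.*?>})/, $record;

# Empty strings show up where a delimiter starts the record or directly
# follows another one; grep keeps only fields with a non-whitespace character.
my @fields = grep { /\S/ } @raw;

print "'$_'\n" for @fields;
```

Here @raw holds six elements, two of them empty, and @fields holds the four elements the question asks for.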

For your simple example, the pattern matches the beginning delimiter with an ending delimiter. If your actual data doesn't behave this way (nesting, for instance), then you'll probably need a more powerful parsing approach to get the result you want.
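
As an illustration of that limit, here is a sketch with a nested record. The input is made up for the example and does not come from the question.

```perl
use strict;
use warnings;

# Hypothetical nested record; not taken from the question.
my $nested = '<a<b>}c>}';

# The non-greedy .*? stops at the first ">}", so the outer pair is cut short.
my @fields = grep { /\S/ } split /(<.*?>})/, $nested;

print "'$_'\n" for @fields;
```

The pattern pairs the outer "<" with the inner ">}", giving '<a<b>}' followed by a leftover 'c>}', which is probably not what you would want from nested data.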


I'm not sure what your one-liner requirement is. The grep+split operation can certainly be put into a one-liner, but you'll need to do something with the @F array.

perl -F'(?=<)|(?<=>})' -ane 'for (@F) {print $_."\n"}'
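
The look-around pattern above can be tried in a small script. This is only a sketch: the record is an assumption, reconstructed from the expected output in the question, and the chomp is an addition that the one-liner does not have.

```perl
use strict;
use warnings;

# Hypothetical record, reconstructed from the expected output in the question.
my $line = "<abc>}<def>}g<h>}\n";

# Remove the newline first; otherwise it ends up as a field of its own
# after the final ">}".
chomp $line;

# (?=<) matches the empty position just before a "<";
# (?<=>}) matches the empty position just after a ">}".
# Neither consumes any characters, so nothing is lost from the fields.
my @fields = split /(?=<)|(?<=>})/, $line;

print "'$_'\n" for @fields;
```

Because the split points are zero-width, the delimiters stay attached to their fields and no grep is needed to weed out empty strings.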

tel2 (Author) commented:
Thanks jmcq,
I also notice that:
  perl -F'/<|>}/' -ane 'for (@F) {print $_."\n"}' infile
seems to be abbreviatable to:
  perl -F'<|>}' ...
As you know, neither retains the delimiters, but it's still good to know.
Your multi-line script seems to do the job, but my preference is a one-liner, simply because I like their brevity.  Actually I have no need for either right now, but if I can see it as a one-liner, I can probably expand it to a multi-liner more easily than the other way around.

Nice work, ozo,
Brief, but to the point, as usual.
Yes, I thought look-around assertions might be what's needed here.  Didn't know quite how to do it though.
