edskee
asked on
Regular Expression question - matching brackets
Is there an easy way to do this with a Perl regular expression?
I have an expression with matching brackets, but they can be nested, and I need to match the entire group.
For example, I have the string: MyTest{out{inner}er}blahblah
Is there an expression I can write that will match {out{inner}er}? If I search for the } I will get {out{inner} which is wrong. I could say find the second }, but then if I have {out{in{absoluteinner}ner}er} that breaks it.
Anyone have any idea? I'm sure it should be easy to do, but I can't figure it out.
ASKER
Well, I'm using the expression in a PHP script, so the Perl module won't help me. I did write an extra function that checks for balanced braces and then matches that number of braces before returning. For anyone searching for an answer to this: I accepted ultimatemike's answer, although jmcg's was closer to what I needed. Thanks.
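The asker's PHP function is not shown in the thread. As a rough sketch of the same brace-counting idea, written in Perl to match the rest of the thread, something like the following could work. The name extract_braced is hypothetical, and the sketch assumes that only { and } matter (no escaped or quoted braces).

```perl
use strict;
use warnings;

# Hypothetical helper: return the first balanced {...} group in $text.
# Returns nothing if there is no "{" or if the braces never balance.
sub extract_braced {
    my ($text) = @_;
    my $start = index($text, '{');
    return if $start < 0;

    my $depth = 0;
    for my $pos ($start .. length($text) - 1) {
        my $char = substr($text, $pos, 1);
        if    ($char eq '{') { $depth++ }
        elsif ($char eq '}') { $depth-- }
        # Depth drops back to zero exactly at the brace that closes the group.
        return substr($text, $start, $pos - $start + 1) if $depth == 0;
    }
    return;    # ran out of text with braces still open
}

print extract_braced('MyTest{out{inner}er}blahblah'), "\n";    # {out{inner}er}
```

Because it counts depth rather than looking for the "second" closing brace, the same call also handles {out{in{absoluteinner}ner}er}.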
Mike,
I looked at Text::Balanced and thought "wow!". I believe it can be used to answer this problem, but I'd certainly appreciate (and I bet Edskee would, too) a gentle introduction to this very complex module. Can you show a worked-out example for the question posed here by Edskee? Please?
I agree. I don't have access to the module right now - I'll have to hold off until tomorrow morning to give it a shot, but I'll try and get an example worked out to help out with future questions. :)
Turns out Text::Balanced is installed by default in ActivePerl :)
#!perl -w
use strict;
use Text::Balanced qw (
extract_bracketed
);
my $string = 'MyTest{out{inner}er}blahblah';
#The first parameter is the string to process
#The second is the brackets we're using. (You can have multiple kinds of nested brackets)
#The third is the regular expression for the prefix that we wish to ignore. In this case,
#it's checking if the bracket is after the start of the line and a string of alphanumeric
#characters.
my @result = extract_bracketed( $string, '{}', qr/^[A-Za-z0-9]*/);
print "The balanced text: $result[0]\n";
print "The dropped suffix: $result[1]\n";
print "The dropped prefix: $result[2]\n";
If that's still unclear, I'd be happy to try and clarify it further.
$_ = "MyTest{out{inner}er}blahblah";
($re=$_)=~s/((\{)|(\})|.)/${{'{'=>'('}}{$2}\Q$1\E${{'}'=>')'}}{$3}/gs;
$result = (/$re/)[0];
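The one-liner above is dense, so here is an unrolled sketch of what it appears to do: it turns the string itself into a pattern, quoting every character and wrapping each {...} in a capture group, so that the first capture is the outermost group. The name brace_pattern is hypothetical, and the sketch assumes the braces in the input are balanced (otherwise the generated pattern has unmatched parentheses and will not compile).

```perl
use strict;
use warnings;

# Hypothetical helper: build a pattern from the text itself.
# Each character is quoted literally; "(" is added before every "{"
# and ")" after every "}", so each brace pair becomes a capture group.
sub brace_pattern {
    my ($text) = @_;
    my $pattern = '';
    for my $char (split //, $text) {
        $pattern .= '(' if $char eq '{';
        $pattern .= quotemeta $char;
        $pattern .= ')' if $char eq '}';
    }
    return $pattern;
}

my $text    = 'MyTest{out{inner}er}blahblah';
my $pattern = brace_pattern($text);
print "$pattern\n";    # MyTest(\{out(\{inner\})er\})blahblah

# Group 1 opens at the first "{", so it holds the outermost group.
my ($outer) = $text =~ /$pattern/;
print "$outer\n";      # {out{inner}er}
```

Note that the pattern only matches the exact string it was built from, so this is a way of extracting the group from a known string rather than a reusable expression.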
To simply get the text enclosed by the outermost pair of braces, however, you should be able to do that with something like:
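($result) = $string =~ /\{(.*)\}/;
The greedy match consumes intermediate closing braces until it reaches the last } in the string, so this only works when the string contains a single outermost group (and, without the /s modifier, no newlines inside it).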
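To make the limits of the greedy approach concrete, here is a small comparison. The second function is not from this thread: it swaps in a recursive pattern, which assumes Perl 5.10 or later (PCRE, and therefore PHP's preg functions, has a similar recursion feature). The names greedy_outer and balanced_outer are hypothetical.

```perl
use 5.010;
use strict;
use warnings;

# Greedy: from the first "{" to the LAST "}" anywhere in the string.
sub greedy_outer {
    my ($text) = @_;
    my ($match) = $text =~ /(\{.*\})/s;
    return $match;
}

# Recursive (Perl 5.10+): (?1) re-enters group 1, so every "{"
# has to be closed by its own "}" before the group can end.
sub balanced_outer {
    my ($text) = @_;
    my ($match) = $text =~ /(\{(?:[^{}]++|(?1))*\})/;
    return $match;
}

say greedy_outer('MyTest{out{inner}er}blahblah');      # {out{inner}er}
say balanced_outer('MyTest{out{inner}er}blahblah');    # {out{inner}er}

# With two top-level groups the results differ:
say greedy_outer('a{one}b{two}c');      # {one}b{two}
say balanced_outer('a{one}b{two}c');    # {one}
```

Both versions include the outer braces in the result; drop them from the capture group if only the enclosed text is wanted.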
For the last word on regular expressions, you should take a look at Jeffrey Friedl's book Mastering Regular Expressions.
http://www.oreilly.com/catalog/regex/