Hello people,
I am trying to write a regular expression that will look in a string and determine which parts of the string are equations. For example:

Hello this is 2+3 an equation and 1-2+1-3 is one too as well as (1+2-4) but I'm afraid ( 1 / 2 *(1+2)) is an equation too.

This would return 2+3 1-2+1-3 (1+2-4) and ( 1 / 2 *(1+2)).

Numbers may or may not have spaces between them. The symbols are + - * /. Also mind the parentheses. Any ideas? Most of my attempts have failed up to now.

1) Do we have to do parentheses matching as well?
2) Is "1" an equation too?

I have devised a primitive technique which takes care of simple equations like 1+2 or 1+2+3 or 1+2+3-4.

This successively removes the last "symbol+digit" from the string. In the end, if only a single digit remains, it is a balanced equation (that's my assumption).

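use strict;
use warnings;

my $var = "1+2+3-4";   # sample input; try "1" as well
# Strip a trailing "operator + digit" pair for as long as one is present.
# Anchoring the substitution with $ removes the last pair, not the first
# occurrence of the same text elsewhere in the string.
while ($var =~ s/([-+\/*]\d)$//) {
    print "Matched : $1\n";
    print "$var\n";
}
# If only a single digit remains, the input was a simple equation.
if ($var =~ /^\d$/) { print "Found\n"; }
else                { print "Not found\n"; }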

theplasma: Think about some basic rules for an equation. These would be:

1. An equation must consist only of parentheses, digits and operators.
2. The number of closing parentheses must equal the number of opening parentheses.
3. Each binary operator must have either a digit or an opening parenthesis to its right.
4. Each binary operator must have either a digit or a closing parenthesis to its left.

This is not yet sufficient to express the validity of parenthesis nesting, but it should be a good pre-filter for faulty equations.

$_ = "Hello this is 2+3 an equation and 1-2+1-3 is one too as well as (1+2-4) but I'm afraid ( 1 / 2 *(1+2)) is an equation too.";
for( grep{ /\d/ && !m([(*/+-]\s*[*/)]) && !/\d\s+\d/ &&((my $r=$_) =~ tr/()//cd, eval{qr/$r/}, !$@)}/([-+*\/()\s\d]+)/g ){
print "$_\n";
}
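
For anyone who finds the one-liner above hard to read, here is a rough, more verbose sketch of the same filters. The sub names (parens_balanced, looks_like_equation, find_equations) are made up for illustration and do not come from the original post. It also differs in two small ways: it trims surrounding whitespace from each candidate, and it counts parentheses directly instead of using the eval{qr//} trick. Treat it as an approximation, not the exact code above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the parentheses in $s are balanced.
sub parens_balanced {
    my ($s) = @_;
    my $depth = 0;
    for my $c ($s =~ /[()]/g) {
        $depth += $c eq '(' ? 1 : -1;
        return 0 if $depth < 0;    # a ')' appeared before its '('
    }
    return $depth == 0 ? 1 : 0;
}

# Hypothetical helper: apply the one-liner's filters to a single candidate.
sub looks_like_equation {
    my ($s) = @_;
    return 0 unless $s =~ /\d/;              # needs at least one digit
    return 0 if $s =~ m{[(*/+-]\s*[*/)]};    # operator or '(' followed by '*', '/' or ')'
    return 0 if $s =~ /\d\s+\d/;             # two numbers with no operator between them
    return parens_balanced($s);
}

# Pull out every run of digits, operators, parentheses and spaces,
# trim it, and keep the runs that pass the filters.
sub find_equations {
    my ($text) = @_;
    return grep { looks_like_equation($_) }
           map  { my $t = $_; $t =~ s/^\s+|\s+$//g; $t }
           $text =~ /([-+*\/()\s\d]+)/g;
}

print "$_\n" for find_equations("this is 2+3 and ( 1 / 2 *(1+2)) too");
```

Like the one-liner, this is only a pre-filter: it checks that parentheses balance and that operators are not obviously misplaced, but it does not parse the expression.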

ozo, since you asked, the regular expression I used is:
[\s]*[(]*([(]*[-+]?\d+(\.\d+)?[)]*[\s]*([\+\-\*\/]{1}[\s]*[(]*[-+]?\d+(\.\d+)?[)]*)+[)]*[\+\-\*\/]?)+[\s]*
which pretty much does the trick, don't you think?

( 1 / 2 *(1+2)) is an equation too,
but your expression returns
1 / 2
and
(1+2))
instead of
( 1 / 2 *(1+2))
Also,
1 + 2 * 3 + 4 * 5
returns
1 + 2
3 + 4
missing the 5, and
1 + ( 2 * 3 )
misses the 1.
It also doesn't find
(9)
and
(((1+2)))
returns
1+2)))
Finally, 1)+(2 is not an equation, but you permit it.

I didn't realize you wanted to permit numbers like 1.2, so my modified pattern is
for( grep{ /\d/&&!/\.\D|\D\./&&!m([(*/+-]\s*[*/)])&&!/\d\s+\d/ &&(($r=$_)=~tr/()//cd, eval{qr/$r/}, !$@)}/([-+*\/().\s\d]+)/g ){
print "$_\n";
}
If you require at least one of [-+*/], add a /[-+\/*]/ test to the grep:
for( grep{ /\d/&&/[-+\/*]/&&!/\.\D|\D\./&&!m([(*/+-]\s*[*/)])&&!/\d\s+\d/ &&(($r=$_)=~tr/()//cd, eval{qr/$r/}, !$@)}/([-+*\/().\s\d]+)/g ){
print "$_\n";
}

I don't think you can do this with a regex alone: the problem is that the nesting level has to be stored somehow.

You need something like Yacc to check it properly.

Cheers!

Stefan
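
As a rough illustration of the parser approach suggested above, here is a minimal hand-written recursive-descent checker in Perl (no Yacc involved). The grammar and the sub names (parse_expr, parse_term, is_equation) are assumptions made for this sketch; the thread never pins down an exact grammar. In particular it accepts an optional sign before a number or parenthesis, accepts decimals like 1.2, and treats a lone number such as "1" as valid, which the question above left open.

```perl
use strict;
use warnings;

# Assumed grammar (not stated in the thread):
#   expr   := term ( [-+*/] term )*
#   term   := [-+]? ( number | '(' expr ')' )
#   number := digits, optionally followed by '.' and more digits
# Each sub takes a reference to the input string and advances its pos().
# The /c flag keeps pos() where it was when a match fails.

sub parse_expr {
    my ($s) = @_;
    parse_term($s) or return 0;
    while ($$s =~ /\G\s*[-+*\/]/gc) {    # consume an operator, then require a term
        parse_term($s) or return 0;
    }
    return 1;
}

sub parse_term {
    my ($s) = @_;
    return 1 if $$s =~ /\G\s*[-+]?\s*\d+(?:\.\d+)?/gc;   # optionally signed number
    if ($$s =~ /\G\s*[-+]?\s*\(/gc) {                    # optionally signed '('
        parse_expr($s) or return 0;                      # nesting is handled by recursion
        return $$s =~ /\G\s*\)/gc ? 1 : 0;               # matching ')'
    }
    return 0;
}

# True if the whole string is one well-formed expression.
sub is_equation {
    my ($text) = @_;
    pos($text) = 0;
    return parse_expr(\$text) && $text =~ /\G\s*\z/gc ? 1 : 0;
}

for my $candidate ("( 1 / 2 *(1+2))", "(((1+2)))", "1)+(2") {
    printf "%-18s %s\n", $candidate, is_equation($candidate) ? "ok" : "rejected";
}
```

This only validates a string that has already been cut out of the surrounding text, so it would still need a first pass (for example a character-class match over digits, operators, parentheses and spaces) to pick out the candidates.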