Solved

# A regular expression to find equations

Posted on 2004-11-11
Hello people,
I am trying to write a regular expression that will look in a string and determine which parts of the string are equations. For example:

Hello this is 2+3 an equation and 1-2+1-3 is one too as well as (1+2-4) but I'm afraid ( 1 / 2 *(1+2)) is an equation too.

This would return 2+3, 1-2+1-3, (1+2-4) and ( 1 / 2 *(1+2)).

Numbers may or may not have spaces between them. The symbols are + - * /. Also mind the parentheses. Any ideas? Most of my attempts have failed up to now.

Thank you very much.

Deleted by ee_ai_construct, 500 points refunded. - 11/15/2004 4:30:25 PM PST
Question by:theplasma

LVL 12

Expert Comment

Hi theplasma,
I don't think you can do this as a regex - the problem is to store the nesting level somehow.

You need something like Yacc to check it properly.
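
To illustrate what "storing the nesting level" means, here is a minimal sketch of a depth counter in plain Perl. The helper name parens_balanced is made up for this example, and it only checks parentheses, not the rest of the equation syntax.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 if every ')'
# closes an earlier '(' and nothing is left open, otherwise 0.
sub parens_balanced {
    my ($text) = @_;
    my $depth = 0;                          # current nesting level
    for my $ch (split //, $text) {
        if ($ch eq '(') {
            $depth++;
        }
        elsif ($ch eq ')') {
            return 0 if --$depth < 0;       # closed before it was opened
        }
    }
    return $depth == 0 ? 1 : 0;             # everything opened was closed
}

for my $sample ('( 1 / 2 *(1+2))', '1)+(2') {
    print "$sample: ", (parens_balanced($sample) ? "balanced" : "unbalanced"), "\n";
}
```

The counter is the piece of state that a classic regular expression cannot keep, which is why a parser generator is usually suggested for the full job.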

Cheers!

Stefan

LVL 16

Expert Comment

1) Do we have to do parentheses matching as well?
2) Is "1" an equation too?

I have devised a primitive technique which takes care of simple equations like 1+2 or 1+2+3 or 1+2+3-4.

This successively removes the last "symbol+digit" pair from a string. In the end, if only a single digit remains, it is a balanced equation. (That's my assumption.)

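use strict;
use warnings;
my $var="1" ;
# Repeatedly strip the trailing "operator + digit" pair.
while ($var =~ m/^.*([\+\-\/\*]\d)$/) {
    print "Matched : $1\n";
    my $temp=quotemeta $1;
    $var =~ s/$temp$//;   # anchored, so only the last pair is removed
    print "$var\n" ;
}
# A single digit left over means the string was a simple equation.
if ($var =~ /^\d$/) {print "Found" ;}
else { print "Nofound" ;}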

Manav

LVL 16

Expert Comment

Sorry,

this version takes care of multidigit numbers occurring in the string:

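use strict;
use warnings;
my $var="1" ;
# Repeatedly strip the trailing "operator + number" pair (\d+ allows several digits).
while ($var =~ m/^.*([\+\-\/\*]\d+)$/) {
    print "Matched : $1\n";
    my $temp=quotemeta $1;
    $var =~ s/$temp$//;   # anchored, so only the last pair is removed
    print "$var\n" ;
}
# A single number left over means the string was a simple equation.
if ($var =~ /^\d+$/) {print "Found" ;}
else { print "Nofound" ;}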

Manav


LVL 16

Expert Comment

Sorry,
this is only a way to test whether a whole string is an equation or not; it does not find the equations inside a longer string. :(

Manav

LVL 13

Expert Comment

Homework. :P

Author Comment

Never mind, I figured it out. Thanks anyway, Manav.

LVL 12

Expert Comment

gripe,
Yep.

theplasma: Think about some basic rules for an equation. These would be:

1. An equation must consist only of parentheses, digits and operators.
2. The number of closing parentheses must be equal to the number of opening parentheses.
3. Each binary operator must have either a digit or an opening parenthesis to its right.
4. Each binary operator must have either a digit or a closing parenthesis to its left.

This is not yet sufficient to express the validity of the parenthesis nesting, but it should be a good pre-filter for faulty equations.

And get rid of these spaces first.
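
The four rules can be sketched roughly as follows. This is only an illustration: the name passes_prefilter is made up here, every operator is treated as binary (so a leading minus sign is rejected), and stripping the spaces first means "2 3" is read as "23".

```perl
use strict;
use warnings;

# Hypothetical pre-filter for rules 1-4 above; returns 1 (pass) or 0 (fail).
sub passes_prefilter {
    my ($candidate) = @_;
    (my $s = $candidate) =~ s/\s+//g;        # get rid of the spaces first
    return 0 if $s eq '';
    return 0 if $s =~ m{[^-+*/()\d]};        # rule 1: only parentheses, digits, operators
    my $open  = ($s =~ tr/(//);              # rule 2: count '(' ...
    my $close = ($s =~ tr/)//);              #         ... and ')'
    return 0 if $open != $close;
    return 0 if $s =~ m{[-+*/](?![\d(])};    # rule 3: digit or '(' to the right
    return 0 if $s =~ m{(?<![\d)])[-+*/]};   # rule 4: digit or ')' to the left
    return 1;
}

for my $c ('( 1 / 2 *(1+2))', '1+*2', '1)+(2') {
    printf "%-18s %s\n", $c, passes_prefilter($c) ? 'pass' : 'fail';
}
```

Note that 1)+(2 passes all four rules even though its nesting is wrong, which is exactly the gap mentioned above: the counts match, but the order of the parentheses is never checked.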

Stefan

LVL 84

Expert Comment

$_ = "Hello this is 2+3 an equation and 1-2+1-3 is one too as well as (1+2-4) but I'm afraid ( 1 / 2 *(1+2)) is an equation too.";
for( grep{ /\d/ && !m([(*/+-]\s*[*/)]) && !/\d\s+\d/ &&((my $r=$_) =~ tr/()//cd, eval{qr/$r/}, !$@)}/([-+*\/()\s\d]+)/g ){
print "$_\n";
}

LVL 84

Expert Comment

What was the solution you figured out?

Accepted Solution

ozo, since you asked, the regular expression I used is:
[\s]*[(]*([(]*[-+]?\d+(\.\d+)?[)]*[\s]*([\+\-\*\/]{1}[\s]*[(]*[-+]?\d+(\.\d+)?[)]*)+[)]*[\+\-\*\/]?)+[\s]*
which pretty much does the trick, don't you think?

LVL 84

Expert Comment

With your pattern,
( 1 / 2 *(1+2)) is an equation too
returns
1 / 2
and
(1+2))
instead of
( 1 / 2 *(1+2))
Also,
1 + 2 * 3 + 4 * 5
returns
1 + 2
3 + 4
missing the 5, and
1 + ( 2 * 3 )
misses the 1.
It also doesn't find
(9)
and
(((1+2)))
returns
1+2)))
And 1)+(2 is not an equation, but you permit it.

I didn't realize you wanted to permit numbers like 1.2, so my modified pattern is
for( grep{ /\d/&&!/\.\D|\D\./&&!m([(*/+-]\s*[*/)])&&!/\d\s+\d/ &&((my $r=$_)=~tr/()//cd, eval{qr/$r/}, !$@)}/([-+*\/().\s\d]+)/g ){
print "$_\n";
}
If you require at least one of [-+*/], add a /[-+\/*]/ test:
for( grep{ /\d/&&/[-+\/*]/&&!/\.\D|\D\./&&!m([(*/+-]\s*[*/)])&&!/\d\s+\d/ &&((my $r=$_)=~tr/()//cd, eval{qr/$r/}, !$@)}/([-+*\/().\s\d]+)/g ){
print "$_\n";
}
