We help IT Professionals succeed at work.

We've partnered with Certified Experts, Carl Webster and Richard Faulkner, to bring you a podcast all about Citrix Workspace, moving to the cloud, and analytics & intelligence. Episode 2 coming soon!Listen Now

x

# parsing algebraic expression into binary tree?

on
Medium Priority
1,148 Views
If I want to parse an algebraic expression into a binary tree, how can I determine where the root is?

I have written a parser to solve an expression as it parses it (using algebraic rules) but this would be too slow if the expression must be used multiple times.  I decided to first parse the expression into a binary tree, then solve the expression using the tree.  Here's 2 examples...

(a+b)*c

my parser would work like this...
add a + b...get result...multiply result by c...
the tree could represent this by...

*
/   \
+      c
/    \
a     b

(sorry if it is difficult to read the tree :( )
but,

a/b+(c-d)

my parser would work like this...
divide a by b...get result1...subtract d from c...get result2...add result2 to result1
the tree couldn't be represented in this fashion...

divide
/        \
a            b

HEY, how do I continue with the tree, divide shouldn't be the root, it should be...
+
/      \
div        -
/    \     /   \
a      b  c     d
...but my parser didn't do this.  So how can I find the correct root using only the algebraic rules?  Is there another way?  An algorithm to implement this correctly or an example code with explanation will be immensely helpful.

Thanks,

Scott
Comment
Watch Question

## View Solution Only

Commented:
Edited text of question

Commented:
From the comp.lang.c FAQ at
http://www.eskimo.com/~scs/C-faq/q18.14.html

Two available packages are "defunc", available from
ftp://sunsite.unc.edu/pub/packages/development/libraries/
defunc-1.3.tar.Z
and "parse" at ftp://lamont.ldgo.columbia.edu . Other options include the S-Lang interpreter, available from
ftp://amy.tch.harvard.edu/pub/slang
and the shareware Cmm (``C-minus-minus'' or ``C minus the hard
stuff'').
There is also some parsing/evaluation code in Software Solutions
in C (chapter 12, pp. 235-55).

Commented:
A couple of the links were broken and I didn't have anything to open the tar.Z extension.  At any rate I would rather you explained a general algorithm and approach for this.  Some sample code would be even better.  One idea I had was to parse the entire expression into a stack.  Then the last item on the stack WILL be the root to the tree.  Is there a better way?

Commented:
> I didn't have anything to open the tar.Z
http://www.winzip.com .

> One idea I had was to parse the entire expression into a stack
For small expression, this is probably OK.
For larger ones, you can stop to calculate an already parsed
sub-expression whenever
1. Parenthesis are closed; or
2. You reach a lower priority operator. (for example '+' after
a '/' sub-expression, or '*' after a '^' (power), and so on.)

You can find yet another example source code in the book "The C++
Programming Language" by Bjarne Stroustrup, chapter 3. This
example is, however, much simpler and less general from the
others.

Hope this helps.

Commented:
Maybe my question wasn't clear.  I need someone to EXPLAIN an algorithm to me to parse an expression into a binary tree.  Code with explanation accompanying the algorithm will be greatly appreciated.  I don't need links to someone's unexplained and outdated code.  I don't have the resources to go out and get all these books either.

Thanks

Commented:
There is no *one* algorithm. Below is a simple one. It parses
recoursivly (using xparse() ), until it reaches a number
(getoperand() ). Then it applies the arithmetic operation to it
(apply() ).

int getoperand(char **p);           // reads a number
int apply(int x, char op, int y);   // applies an operation
int xparse(char **p);               // parses an expression

int getoperand(char **p)
{
int val;
char ch;

if (**p == '(') {          // sub-expression
(*p)++;
return xparse (p);
}

val = 0;

while (1) {                // number
ch = **p;
if (ch < '0' || ch > '9')
return val;
val = val * 10 + ch - '0';
(*p)++;
}
}

int apply(int x, char op, int y)
{
switch (op) {
case '+': return x + y;
case '-': return x - y;
case '*': return x * y;
case '/': return x / y;
}
}

int xparse(char **p)
{
int x, y;
char op, lastop;

x = 0;
lastop = '+';
while (1) {
y = getoperand (p);
op = **p;
(*p)++;
x = apply (x, lastop, y);
if (op == '\0' || op == ')')
return x;
lastop = op;
}
}

int arithmeticparse(char *line)
{
char **p;
p = &line;
return xparse (p);
}

#ifdef TEST
/*==========================================================================*/
#include <stdio.h>

int main()
{
int n;
n = arithmeticparse ("5+6");
printf ("5+6 = %d\n", n);
n = arithmeticparse ("5*(6+2)");
printf ("5*(6+2) = %d\n", n);
n = arithmeticparse ("5*6+2");
printf ("5*6+2 = %d\n", n);
n = arithmeticparse ("2+(5*6)");
printf ("2+(5*6) = %d\n", n);

return 0;
}
#endif

Commented:
I have already written a very similar routine.  From my original question, what I don't have an algorithm for is getting it into a BINARY TREE.  In the method above, you will have to reparse the expression each time to evaluate the answer.  I need it to be faster.  The idea I had was to implement something like the above routine and , instead of evaluating it while parsing, set it up as a binary tree so that anytime I need to evaluated it, it doesn't need to be parsed, just step through the tree.  Hopefully much quicker.

:)

Commented:
Here is some (untested) code that builds an expression tree (which you can later traverse to calculate the result)...

If you like this, please reject other answer and I can post this as an answer (and get some more points :-)

/* this is the token type for the token stream */
enum OpType {
ENDEXPR = 0,
SUB = '-',
MUL = '*',
DIV = '/',
LPAREN = '(',
RPAREN = ')',
CONST = '#'
};

/* here is a token */
struct Token {
OpType op;      /* one of the above symbols */
double number;      /* if op==CONST */
};

/* here is a node for the expression tree */
struct Node {
OpType op;      /* type of node (operation or constant) */
double number;      /* if op==CONST */
struct Node* pLhs;      /* lhs sub-expression tree */
struct Node* pRhs;      /* rhs sub-expression tree */
}

/* create a new node with specified op type and args */
Node* NewNode (OpType op, Node* pLhs, Node* pRhs) {
Node* pNode = malloc(sizeof(Node));
pNode->op = op;
pNode->number = 0.0;
pNode->pLhs = pLhs;
pNode->pRhs = pRhs;
return pNode;
}

/* create a new node for a constant number */
Node* NewNodeNumber (double number) {
Node* pNode = malloc(sizeof(Node));
pNode->op = CONST;
pNode->number = number;
pNode->pLhs = NULL;
pNode->pRhs = NULL;
return pNode;
}

Node* IsConst(Node* pNode) {
return pNode && pNode->op == CONST;
}

Node* IsOp(Node* pNode) {
return pNode && pNode->op != CONST;
}

/* return Lhs pointer for node (if any) */
Node* Lhs(Node* pNode) {
return IsOp(pNode) ? pNode->pLhs : NULL;
}

/* return Rhs pointer for node (if any) */
Node* Rhs(Node* pNode) {
return IsOp(pNode) ? pNode->pRhs : NULL;
}

/* return number value for node (if any) */
double Number(Node* pNode) {
return IsConst(pNode) ? pNode->number : 0.0;
}

/* use this to delete a node and everything under it (for cleanup) */
void DeleteNode (Node* pNode) {
if (pNode) {
if (IsOp(pNode)) {
DeleteNode(Lhs(pNode));
DeleteNode(Rhs(pNode));
}
free(pNode);
}
}

/* assume an array of tokens has been built elswhere */
/* NOTE: token stream MUST finish with an ENDEXPR token */
/* then build an expression tree from the tokens */
/* and return a poitner to the root of the tree */
Node* BuildTree (Token tokens[]) {
int p = 0;
}

/* build a tree for an add or subtract operation */
Node* BuildAddSub (Token tokens[], int& p) {
Node* pNode = BuildMulDiv(tokens,p);
while (pNode) {
OpType op = tokens[*p].op;
if (op == ADD || op == SUB) {
Node* pOther;
(*p)++;
pOther = BuildMulDiv(tokens,p);
pNode = NewNode(op,pNode,pOther);
} else {
break;
}
}
return pNode;
}

/* build a tree for a multiply or divide operation */
Node* BuildMulDiv (Token tokens[], int& p) {
Node* pNode = BuildSimple(tokens,p);
while (pNode) {
OpType op = tokens[*p].op;
if (op == MUL || op == DIV) {
Node* pOther;
(*p)++;
pOther = BuildSimple(tokens,p);
pNode = NewNode(op,pNode,pOther);
} else {
break;
}
}
return pNode;
}

/* build a tree for a () expression or a number */
Node* BuildSimple (Token tokens[], int& p) {
Node* pNode = NULL;
OpType op = tokens[*p].op;
switch (op) {
case LPAREN:
(*p)++;
pNode = BuildAddSub(tokens,p); /* recursion here */
op = tokens[*p].op;
if (op != RPAREN) return NULL;
(*p)++;
break;
case CONST:
pNode = NewNodeNubmer(tokens[*p].number);
(*p)++;
break;
}
return pNode;
}

Commented:
RONSLOW:  That is exactly what I needed.  I haven't used binary trees for my data structures much.  If you changed the C stuff to C++ (malloc to new, etc.) I would be ESPECIALLY grateful.  :)

Commented:
Any more wishes? Perhaps to develop a gui for the program?
(ESPECIALLY sarcastic)
Commented:
I assume then that my above comment is an answer..

here 'tis again so you can grade me -- Thanks :-)

/* this is the token type for the token stream */
enum OpType {
ENDEXPR = 0,
SUB = '-',
MUL = '*',
DIV = '/',
LPAREN = '(',
RPAREN = ')',
CONST = '#'
};

/* here is a token */
struct Token {
OpType op; /* one of the above symbols */
double number; /* if op==CONST */
};

/* here is a node for the expression tree */
struct Node {
OpType op; /* type of node (operation or constant) */
double number; /* if op==CONST */
struct Node* pLhs; /* lhs sub-expression tree */
struct Node* pRhs; /* rhs sub-expression tree */
}

/* create a new node with specified op type and args */
Node* NewNode (OpType op, Node* pLhs, Node* pRhs) {
Node* pNode = malloc(sizeof(Node));
pNode->op = op;
pNode->number = 0.0;
pNode->pLhs = pLhs;
pNode->pRhs = pRhs;
return pNode;
}

/* create a new node for a constant number */
Node* NewNodeNumber (double number) {
Node* pNode = malloc(sizeof(Node));
pNode->op = CONST;
pNode->number = number;
pNode->pLhs = NULL;
pNode->pRhs = NULL;
return pNode;
}

Node* IsConst(Node* pNode) {
return pNode && pNode->op == CONST;
}

Node* IsOp(Node* pNode) {
return pNode && pNode->op != CONST;
}

/* return Lhs pointer for node (if any) */
Node* Lhs(Node* pNode) {
return IsOp(pNode) ? pNode->pLhs : NULL;
}

/* return Rhs pointer for node (if any) */
Node* Rhs(Node* pNode) {
return IsOp(pNode) ? pNode->pRhs : NULL;
}

/* return number value for node (if any) */
double Number(Node* pNode) {
return IsConst(pNode) ? pNode->number : 0.0;
}

/* use this to delete a node and everything under it (for cleanup) */
void DeleteNode (Node* pNode) {
if (pNode) {
if (IsOp(pNode)) {
DeleteNode(Lhs(pNode));
DeleteNode(Rhs(pNode));
}
free(pNode);
}
}

/* assume an array of tokens has been built elswhere */
/* NOTE: token stream MUST finish with an ENDEXPR token */
/* then build an expression tree from the tokens */
/* and return a poitner to the root of the tree */
Node* BuildTree (Token tokens[]) {
int p = 0;
}

/* build a tree for an add or subtract operation */
Node* BuildAddSub (Token tokens[], int& p) {
Node* pNode = BuildMulDiv(tokens,p);
while (pNode) {
OpType op = tokens[*p].op;
if (op == ADD || op == SUB) {
Node* pOther;
(*p)++;
pOther = BuildMulDiv(tokens,p);
pNode = NewNode(op,pNode,pOther);
} else {
break;
}
}
return pNode;
}

/* build a tree for a multiply or divide operation */
Node* BuildMulDiv (Token tokens[], int& p) {
Node* pNode = BuildSimple(tokens,p);
while (pNode) {
OpType op = tokens[*p].op;
if (op == MUL || op == DIV) {
Node* pOther;
(*p)++;
pOther = BuildSimple(tokens,p);
pNode = NewNode(op,pNode,pOther);
} else {
break;
}
}
return pNode;
}

/* build a tree for a () expression or a number */
Node* BuildSimple (Token tokens[], int& p) {
Node* pNode = NULL;
OpType op = tokens[*p].op;
switch (op) {
case LPAREN:
(*p)++;
pNode = BuildAddSub(tokens,p); /* recursion here */
op = tokens[*p].op;
if (op != RPAREN) return NULL;
(*p)++;
break;
case CONST:
pNode = NewNodeNubmer(tokens[*p].number);
(*p)++;
break;
}
return pNode;
}

Not the solution you were looking for? Getting a personalized solution is easy.

Commented:
##### Thanks for using Experts Exchange.

• View three pieces of content (articles, solutions, posts, and videos)
• Ask the experts questions (counted toward content limit)
• Customize your dashboard and profile