  • Status: Solved
  • Priority: Medium
  • Security: Public

ANTLR Grammar Problem

Hi,

I am currently trying to create a formula evaluator using an ANTLR-generated Java parser. In order to get a kick-start, I used a grammar from this site: http://arcanecoder.blogspot.com/2008/04/using-antlr-to-create-excel-like.html
At first it seemed to work like a charm, until one of my test users entered an invalid expression. In the grammar file I am using, the Or operation is represented by "||", but a user entered a formula using "OR" instead. Unfortunately my parser doesn't throw an exception but simply quits after parsing the first part of the formula. As an example:

TRUE OR FALSE evaluates to TRUE
FALSE OR TRUE evaluates to FALSE

How can I change the grammar so that the parser does not quit silently but identifies the formula as invalid and throws an exception?


grammar SecurityRules;

options {
        backtrack = true;
        output=AST;
	ASTLabelType=CommonTree;
        k=2;
}

tokens {
	POS;
	NEG;
	CALL;
}

@header {package de.cware.cweb.security.service.parser;}
@lexer::header{package de.cware.cweb.security.service.parser;}

formula
	: (EQ!)? expression
	;

//The highest precedence expression is the most deeply nested
//Precedence ties are parsed left to right
//Expression starts with the lowest precedence rule
expression		
	: boolExpr
	;
boolExpr
	: concatExpr ((AND | OR | LT | LTEQ | GT | GTEQ | EQ | NOTEQ)^ concatExpr)*
	;
concatExpr
	: sumExpr (CONCAT^ sumExpr)*
	;
sumExpr
	: productExpr ((SUB | ADD)^ productExpr)*
	;
productExpr
	: expExpr ((DIV | MULT)^ expExpr)*
	;
expExpr
	: unaryOperation (EXP^ unaryOperation)*
	;
unaryOperation
	: NOT^ operand
	| ADD o=operand -> ^(POS $o)
	| SUB o=operand -> ^(NEG $o)
	| operand
	;
// the highest precedence rule uses operand
operand
	: literal 
	| functionExpr -> ^(CALL functionExpr)
	| percent
	| VARIABLE
	| LPAREN expression RPAREN -> ^(expression)
	;
functionExpr
	: FUNCNAME LPAREN! (expression (COMMA! expression)*)? RPAREN!
	;
literal
	: NUMBER 
	| STRING 
	| TRUE
	| FALSE
	;
percent
	: NUMBER PERCENT^
	;

STRING
	:
	'\"'
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\"'
	|
	'\''
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\''
	;
WHITESPACE
	: (' ' | '\n' | '\t' | '\r')+ {skip();};
TRUE
	: ('t'|'T')('r'|'R')('u'|'U')('e'|'E')
	;
FALSE
	: ('f'|'F')('a'|'A')('l'|'L')('s'|'S')('e'|'E')
	;
	
NOTEQ           : '!=';
LTEQ            : '<=';
GTEQ            : '>=';
AND		: '&&';
OR		: '||';
NOT		: '!';
EQ              : '=';
LT              : '<';
GT              : '>';

EXP             : '^';
MULT            : '*';
DIV             : '/';
ADD             : '+';
SUB             : '-';

CONCAT          : '&';

LPAREN          : '(';
RPAREN          : ')';
COMMA           : ',';
PERCENT         : '%';

VARIABLE
	: '[' ~('[' | ']')+ ']'
	;
FUNCNAME
	: (LETTER)+
	;
NUMBER
	: (DIGIT)+ ('.' (DIGIT)+)?
	;

fragment
LETTER 
	: ('a'..'z') | ('A'..'Z')
	;
fragment
DIGIT
	: ('0'..'9')
	;
fragment
ESCAPE_SEQUENCE
	: '\\' 't'
	| '\\' 'n'
	| '\\' '\"'
	| '\\' '\''
	| '\\' '\\'
	;

Asked by: ChristoferDutz

arch-itect commented:
Have you noticed that both the TRUE and FALSE constructs contain ('e'|'E')

If that is not the only problem, could you post their test formula?
 
arch-itect commented:
ROFL, never mind
 
ChristoferDutz (author) commented:
;-) ... well, both have an "e" in them, don't they? ;-)

In the meantime I solved the problem myself. The formula rule didn't state that nothing is allowed to follow the expression, so the parser stopped after the first complete expression and ignored the rest of the input. I changed:

formula
      : (EQ!)? expression
      ;

To:

formula
      : (EQ!)? expression EOF
      ;

This made the parser treat the input as invalid. After overriding the recovery code with exception-throwing code, I got the results I needed.
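
To illustrate why the missing EOF mattered, here is a minimal, self-contained Java sketch. It is not the ANTLR-generated parser: the class name PrefixMatchDemo, the whitespace-based tokenizer, and the requireEof flag are all assumptions made for this example. It only mimics the shape of the boolExpr loop, which ends quietly at the first token that is not an operator.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy evaluator for formulas such as "TRUE || FALSE" (hypothetical, for illustration only). */
final class PrefixMatchDemo {

    /** Splits the input on whitespace; a real lexer would be more careful. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    /**
     * Evaluates literal (operator literal)*, left to right.
     * With requireEof == false the rule may match just a prefix of the input,
     * which corresponds to the formula rule without EOF.
     */
    static boolean evaluate(String text, boolean requireEof) {
        List<String> tokens = tokenize(text);
        int pos = 0;
        boolean value = literal(tokens, pos++);
        // Like the (...)* loop in boolExpr: stop at the first token that is not an operator.
        while (pos < tokens.size() && isOperator(tokens.get(pos))) {
            String op = tokens.get(pos++);
            boolean right = literal(tokens, pos++);
            value = op.equals("||") ? (value || right) : (value && right);
        }
        // Corresponds to appending EOF to the start rule: leftover input is an error.
        if (requireEof && pos < tokens.size()) {
            throw new IllegalArgumentException(
                    "Unexpected token '" + tokens.get(pos) + "' at position " + pos);
        }
        return value;
    }

    private static boolean isOperator(String token) {
        return token.equals("||") || token.equals("&&");
    }

    private static boolean literal(List<String> tokens, int pos) {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Missing operand");
        }
        String token = tokens.get(pos);
        if (token.equalsIgnoreCase("TRUE")) {
            return true;
        }
        if (token.equalsIgnoreCase("FALSE")) {
            return false;
        }
        throw new IllegalArgumentException("Expected TRUE or FALSE but found '" + token + "'");
    }
}
```

With requireEof set to false, "TRUE OR FALSE" and "FALSE OR TRUE" are accepted and only the first literal decides the result, which matches the behaviour from my question. With requireEof set to true, the leftover "OR" is reported as an error.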

grammar SecurityRules;

options {
        backtrack = true;
        output=AST;
	ASTLabelType=CommonTree;
        k=2;
}

tokens {
	POS;
	NEG;
	CALL;
}

@header {package de.cware.cweb.services.evaluator.parser;}
@lexer::header{package de.cware.cweb.services.evaluator.parser;}

@parser::members {

  @Override
  protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow) throws RecognitionException {
    throw new MismatchedTokenException(ttype, input);
  }

  @Override
  public Object recoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow) throws RecognitionException {
    throw e;
  }

}

@rulecatch {
    catch (RecognitionException e) {
        throw e;
    }
}

@lexer::members {
    @Override
    public void reportError(RecognitionException e) {
        throw new RuntimeException(e);
    }

}    

formula
	: (EQ!)? expression EOF
	;

//The highest precedence expression is the most deeply nested
//Precedence ties are parsed left to right
//Expression starts with the lowest precedence rule
expression		
	: boolExpr
	;
boolExpr
	: concatExpr ((AND | OR | LT | LTEQ | GT | GTEQ | EQ | NOTEQ)^ concatExpr)*
	;
concatExpr
	: sumExpr (CONCAT^ sumExpr)*
	;
sumExpr
	: productExpr ((SUB | ADD)^ productExpr)*
	;
productExpr
	: expExpr ((DIV | MULT)^ expExpr)*
	;
expExpr
	: unaryOperation (EXP^ unaryOperation)*
	;
unaryOperation
	: NOT^ operand
	| ADD o=operand -> ^(POS $o)
	| SUB o=operand -> ^(NEG $o)
	| operand
	;
// the highest precedence rule uses operand
operand
	: literal 
	| functionExpr -> ^(CALL functionExpr)
	| percent
	| VARIABLE
	| LPAREN expression RPAREN -> ^(expression)
	;
functionExpr
	: FUNCNAME LPAREN! (expression (COMMA! expression)*)? RPAREN!
	;
literal
	: NUMBER 
	| STRING 
	| TRUE
	| FALSE
	;
percent
	: NUMBER PERCENT^
	;

STRING
	:
	'\"'
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\"'
	|
	'\''
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\''
	;
WHITESPACE
	: (' ' | '\n' | '\t' | '\r')+ {skip();};
TRUE
	: ('t'|'T')('r'|'R')('u'|'U')('e'|'E')
	;
FALSE
	: ('f'|'F')('a'|'A')('l'|'L')('s'|'S')('e'|'E')
	;
	
NOTEQ           : '!=';
LTEQ            : '<=';
GTEQ            : '>=';
AND		: '&&';
OR		: '||';
NOT		: '!';
EQ              : '=';
LT              : '<';
GT              : '>';

EXP             : '^';
MULT            : '*';
DIV             : '/';
ADD             : '+';
SUB             : '-';

CONCAT          : '&';

LPAREN          : '(';
RPAREN          : ')';
COMMA           : ',';
PERCENT         : '%';

VARIABLE
	: '[' ~('[' | ']')+ ']'
	;
FUNCNAME
	: (LETTER)+
	;
NUMBER
	: (DIGIT)+ ('.' (DIGIT)+)?
	;

fragment
LETTER 
	: ('a'..'z') | ('A'..'Z')
	;
fragment
DIGIT
	: ('0'..'9')
	;
fragment
ESCAPE_SEQUENCE
	: '\\' 't'
	| '\\' 'n'
	| '\\' '\"'
	| '\\' '\''
	| '\\' '\\'
	;
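
With the overrides in this grammar, errors reach the caller on two paths: parser rules rethrow the checked RecognitionException, while the lexer wraps it in a RuntimeException. The calling code probably wants to treat both as "invalid formula". Below is a rough sketch of that handling. To keep it self-contained it does not use the ANTLR runtime: SyntaxProblem is a stand-in for RecognitionException, and FormulaParser is a hypothetical stand-in for "create lexer and parser, then call formula()".

```java
/** Sketch of caller-side error handling; all names here are hypothetical. */
final class FormulaValidator {

    /** Stand-in for org.antlr.runtime.RecognitionException. */
    static class SyntaxProblem extends Exception {
        SyntaxProblem(String message) {
            super(message);
        }
    }

    /** Stand-in for building the generated lexer and parser and invoking the start rule. */
    interface FormulaParser {
        void parse(String text) throws SyntaxProblem;
    }

    /** Returns null for a valid formula, otherwise a short description of the syntax error. */
    static String check(FormulaParser parser, String text) {
        try {
            parser.parse(text);
            return null;
        } catch (SyntaxProblem e) {
            // Path 1: thrown directly by a parser rule.
            return "parser: " + e.getMessage();
        } catch (RuntimeException e) {
            // Path 2: the lexer wrapped the checked exception in a RuntimeException.
            if (e.getCause() instanceof SyntaxProblem) {
                return "lexer: " + e.getCause().getMessage();
            }
            // Anything else is a genuine bug and should not be swallowed.
            throw e;
        }
    }
}
```

Unrelated runtime exceptions are rethrown on purpose, so that programming errors are not mistaken for invalid formulas.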
	
