Solved

ANTLR Grammar Problem

Posted on 2010-09-03
Last Modified: 2012-05-10
Hi,

I am currently trying to create a formula evaluator using an ANTLR-generated Java parser. In order to get a kick-start, I used a grammar from this site: http://arcanecoder.blogspot.com/2008/04/using-antlr-to-create-excel-like.html
At first it seemed to work like a charm, until one of my test users entered an invalid expression. In the grammar file I am using, the OR operation is represented by "||", but the user entered a formula using "OR" instead. Unfortunately, my parser doesn't throw an exception; it simply stops after parsing the first part of the formula. For example:

TRUE OR FALSE evaluates to TRUE
FALSE OR TRUE evaluates to FALSE

How can I change the grammar so that the parser doesn't quit silently but identifies the formula as invalid and throws an exception?


grammar SecurityRules;

options {
	backtrack=true;
	output=AST;
	ASTLabelType=CommonTree;
	k=2;
}

tokens {
	POS;
	NEG;
	CALL;
}

@header {package de.cware.cweb.security.service.parser;}
@lexer::header{package de.cware.cweb.security.service.parser;}

formula
	: (EQ!)? expression
	;

// The highest precedence expression is the most deeply nested
// Precedence ties are parsed left to right
// Expression starts with the lowest precedence rule
expression		
	: boolExpr
	;
boolExpr
	: concatExpr ((AND | OR | LT | LTEQ | GT | GTEQ | EQ | NOTEQ)^ concatExpr)*
	;
concatExpr
	: sumExpr (CONCAT^ sumExpr)*
	;
sumExpr
	: productExpr ((SUB | ADD)^ productExpr)*
	;
productExpr
	: expExpr ((DIV | MULT)^ expExpr)*
	;
expExpr
	: unaryOperation (EXP^ unaryOperation)*
	;
unaryOperation
	: NOT^ operand
	| ADD o=operand -> ^(POS $o)
	| SUB o=operand -> ^(NEG $o)
	| operand
	;
// the highest precedence rule uses operand
operand
	: literal 
	| functionExpr -> ^(CALL functionExpr)
	| percent
	| VARIABLE
	| LPAREN expression RPAREN -> ^(expression)
	;
functionExpr
	: FUNCNAME LPAREN! (expression (COMMA! expression)*)? RPAREN!
	;
literal
	: NUMBER 
	| STRING 
	| TRUE
	| FALSE
	;
percent
	: NUMBER PERCENT^
	;

STRING
	:
	'\"'
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\"'
	|
	'\''
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\''
	;
WHITESPACE
	: (' ' | '\n' | '\t' | '\r')+ {skip();};
TRUE
	: ('t'|'T')('r'|'R')('u'|'U')('e'|'E')
	;
FALSE
	: ('f'|'F')('a'|'A')('l'|'L')('s'|'S')('e'|'E')
	;
	
NOTEQ           : '!=';
LTEQ            : '<=';
GTEQ            : '>=';
AND		: '&&';
OR		: '||';
NOT		: '!';
EQ              : '=';
LT              : '<';
GT              : '>';

EXP             : '^';
MULT            : '*';
DIV             : '/';
ADD             : '+';
SUB             : '-';

CONCAT          : '&';

LPAREN          : '(';
RPAREN          : ')';
COMMA           : ',';
PERCENT         : '%';

VARIABLE
	: '[' ~('[' | ']')+ ']'
	;
FUNCNAME
	: (LETTER)+
	;
NUMBER
	: (DIGIT)+ ('.' (DIGIT)+)?
	;

fragment
LETTER 
	: ('a'..'z') | ('A'..'Z')
	;
fragment
DIGIT
	: ('0'..'9')
	;
fragment
ESCAPE_SEQUENCE
	: '\\' 't'
	| '\\' 'n'
	| '\\' '\"'
	| '\\' '\''
	| '\\' '\\'
	;

Question by:ChristoferDutz
3 Comments

Expert Comment

by:arch-itect
ID: 33603298
Have you noticed that both the TRUE and FALSE constructs contain ('e'|'E')?

If that is not the only problem, could you post their test formula?

Expert Comment

by:arch-itect
ID: 33603314
ROFL, never mind

Accepted Solution

by:ChristoferDutz
ID: 33603927
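;-) ... well, both have an "e" in them, don't they? ;-)

In the meantime I solved the problem myself. The problem was that the formula rule didn't state that nothing is allowed to follow the expression, so the parser simply stopped at the first token it couldn't use and returned what it had parsed so far. I fixed it by changing: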

formula
      : (EQ!)? expression
      ;

To:

formula
      : (EQ!)? expression EOF
      ;
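To illustrate why the missing EOF matters, here is a minimal hand-written sketch in plain Java. It is not ANTLR-generated code; the class name TrailingInputDemo, the whitespace tokenizer and the tiny grammar (literal ('||' literal)*) are assumptions made up for the illustration. My guess is that in the real grammar "OR" lexes as a FUNCNAME token, which boolExpr cannot continue with, so the rule just returns; the sketch mimics that by stopping at any token that is not "||".

```java
import java.util.Arrays;
import java.util.List;

// Hand-written analogue of "formula : expression" vs. "formula : expression EOF".
// Hypothetical mini-language: literal ('||' literal)*, literals are TRUE and FALSE.
final class TrailingInputDemo {
    private final List<String> tokens;
    private int pos = 0;

    private TrailingInputDemo(String input) {
        // Simplification: tokens are separated by whitespace.
        this.tokens = Arrays.asList(input.trim().split("\\s+"));
    }

    // requireEof = false mimics the original rule, true mimics the rule with EOF.
    static boolean evaluate(String input, boolean requireEof) {
        TrailingInputDemo parser = new TrailingInputDemo(input);
        boolean value = parser.expression();
        if (requireEof && parser.pos < parser.tokens.size()) {
            throw new IllegalArgumentException(
                "Unexpected token '" + parser.tokens.get(parser.pos) + "' at index " + parser.pos);
        }
        return value;
    }

    private boolean expression() {
        boolean value = literal();
        // The loop ends quietly at the first token that is not "||".
        while (pos < tokens.size() && tokens.get(pos).equals("||")) {
            pos++;
            boolean right = literal(); // parse the operand before combining, no short-circuit
            value = value || right;
        }
        return value;
    }

    private boolean literal() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Expected TRUE or FALSE but reached end of input");
        }
        String token = tokens.get(pos);
        if (token.equalsIgnoreCase("TRUE")) { pos++; return true; }
        if (token.equalsIgnoreCase("FALSE")) { pos++; return false; }
        throw new IllegalArgumentException("Expected TRUE or FALSE but found '" + token + "'");
    }

    public static void main(String[] args) {
        System.out.println(evaluate("FALSE OR TRUE", false)); // trailing "OR TRUE" is ignored
        System.out.println(evaluate("FALSE || TRUE", true));
        try {
            evaluate("FALSE OR TRUE", true);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the end-of-input check, "FALSE OR TRUE" yields the value of the first literal only, which matches the symptom from the question. With the check, the leftover "OR" is reported instead of being dropped.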

Requiring EOF made the parser treat the input as invalid, and after replacing the default recovery code with evil exception-throwing code, I got the results I needed.

grammar SecurityRules;

options {
	backtrack=true;
	output=AST;
	ASTLabelType=CommonTree;
	k=2;
}

tokens {
	POS;
	NEG;
	CALL;
}

@header {package de.cware.cweb.services.evaluator.parser;}
@lexer::header{package de.cware.cweb.services.evaluator.parser;}

@parser::members {

  @Override
  protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow) throws RecognitionException {
    throw new MismatchedTokenException(ttype, input);
  }

  @Override
  public Object recoverFromMismatchedSet(IntStream input, RecognitionException e, BitSet follow) throws RecognitionException {
    throw e;
  }

}

@rulecatch {
    catch (RecognitionException e) {
        throw e;
    }
}

@lexer::members {
    @Override
    public void reportError(RecognitionException e) {
        throw new RuntimeException(e);
    }

}    

formula
	: (EQ!)? expression EOF
	;

// The highest precedence expression is the most deeply nested
// Precedence ties are parsed left to right
// Expression starts with the lowest precedence rule
expression		
	: boolExpr
	;
boolExpr
	: concatExpr ((AND | OR | LT | LTEQ | GT | GTEQ | EQ | NOTEQ)^ concatExpr)*
	;
concatExpr
	: sumExpr (CONCAT^ sumExpr)*
	;
sumExpr
	: productExpr ((SUB | ADD)^ productExpr)*
	;
productExpr
	: expExpr ((DIV | MULT)^ expExpr)*
	;
expExpr
	: unaryOperation (EXP^ unaryOperation)*
	;
unaryOperation
	: NOT^ operand
	| ADD o=operand -> ^(POS $o)
	| SUB o=operand -> ^(NEG $o)
	| operand
	;
// the highest precedence rule uses operand
operand
	: literal 
	| functionExpr -> ^(CALL functionExpr)
	| percent
	| VARIABLE
	| LPAREN expression RPAREN -> ^(expression)
	;
functionExpr
	: FUNCNAME LPAREN! (expression (COMMA! expression)*)? RPAREN!
	;
literal
	: NUMBER 
	| STRING 
	| TRUE
	| FALSE
	;
percent
	: NUMBER PERCENT^
	;

STRING
	:
	'\"'
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\"'
	|
	'\''
		( options {greedy=false;}
		: ESCAPE_SEQUENCE
		| ~'\\'
		)*
	'\''
	;
WHITESPACE
	: (' ' | '\n' | '\t' | '\r')+ {skip();};
TRUE
	: ('t'|'T')('r'|'R')('u'|'U')('e'|'E')
	;
FALSE
	: ('f'|'F')('a'|'A')('l'|'L')('s'|'S')('e'|'E')
	;
	
NOTEQ           : '!=';
LTEQ            : '<=';
GTEQ            : '>=';
AND		: '&&';
OR		: '||';
NOT		: '!';
EQ              : '=';
LT              : '<';
GT              : '>';

EXP             : '^';
MULT            : '*';
DIV             : '/';
ADD             : '+';
SUB             : '-';

CONCAT          : '&';

LPAREN          : '(';
RPAREN          : ')';
COMMA           : ',';
PERCENT         : '%';

VARIABLE
	: '[' ~('[' | ']')+ ']'
	;
FUNCNAME
	: (LETTER)+
	;
NUMBER
	: (DIGIT)+ ('.' (DIGIT)+)?
	;

fragment
LETTER 
	: ('a'..'z') | ('A'..'Z')
	;
fragment
DIGIT
	: ('0'..'9')
	;
fragment
ESCAPE_SEQUENCE
	: '\\' 't'
	| '\\' 'n'
	| '\\' '\"'
	| '\\' '\''
	| '\\' '\\'
	;
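One thing to keep in mind with the grammar above: parser errors surface as the checked RecognitionException, while lexer errors arrive wrapped in a RuntimeException. A caller might want to fold both into one exception type. The following is only a sketch in plain Java under that assumption; FormulaErrors, InvalidFormulaException and parseOrThrow are hypothetical names, not part of ANTLR, and the Callable stands in for whatever code creates the lexer and parser and invokes the formula rule.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of ANTLR: gives callers a single exception type
// regardless of whether the lexer or the parser rejected the formula.
final class FormulaErrors {

    /** Hypothetical exception type for invalid formulas. */
    static final class InvalidFormulaException extends Exception {
        private static final long serialVersionUID = 1L;

        InvalidFormulaException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs the given parse step. In real code the step would wrap the call to the
     * generated parser; here it is just a Callable so the sketch stays self-contained.
     */
    static <T> T parseOrThrow(String formula, Callable<T> parseStep)
            throws InvalidFormulaException {
        try {
            return parseStep.call();
        } catch (RuntimeException e) {
            // Lexer path in the grammar above: reportError wraps the original error.
            Throwable cause = (e.getCause() != null) ? e.getCause() : e;
            throw new InvalidFormulaException("Invalid formula: " + formula, cause);
        } catch (Exception e) {
            // Parser path: checked exceptions are rethrown by the rule catch block.
            throw new InvalidFormulaException("Invalid formula: " + formula, e);
        }
    }
}
```

The calling code then only needs one catch block for InvalidFormulaException, and getCause() still gives access to the original error for logging.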
	
