
A doubt in JavaCC

The code for my Test.jj File :-
------------------------------

PARSER_BEGIN(Test)

public class Test
{
   public static void main(String args[])
   throws ParseException
   {
      Test t = new Test(System.in);
      try
      {
         t.Input();
         System.out.println("Grammar is OK...........");
      }
      catch(ParseException ex)
      {
         System.out.println("Grammatical Error........");
         System.out.println(ex.getMessage());
      }
   }
}
PARSER_END(Test)

SKIP :
{
 " "
| "\t"
| "\n"
| "\r"
}

TOKEN :
{
  <LEFT_SQR_BRACKET: "[">
| <RIGHT_SQR_BRACKET: "]">
| <LEFT_BRACE: "{">
| <RIGHT_BRACE: "}">
| <SEMICOLON: ";">
| <SCOPE_TOKEN: "APPLET">
| <USERS_SECTION_TOKEN: "USERS">
| <NAME_TOKEN: "NAME">
| <VALUE_TOKEN: ["a"-"z","A"-"Z","_","0"-"9","*"] (["a"-"z","A"-"Z","_","-","*","0"-"9","\\","~","`","!","@","#","$","%","^","&","+","=",";","(",")","{","}",",","[","]"," "])* >

//| <VALUE_TOKEN: ["a"-"z","A"-"Z","_","*"] (~["\\","/","|","\"",":","<",">","[","]","?"])* >

}

void Input() :
{
Token tok;
}
{
   <SCOPE_TOKEN>
   <LEFT_BRACE>
      <USERS_SECTION_TOKEN>
      <LEFT_BRACE>
         <NAME_TOKEN>
         <LEFT_SQR_BRACKET>
            tok = <VALUE_TOKEN>
         <RIGHT_SQR_BRACKET>
         <SEMICOLON>
      <RIGHT_BRACE>
   <RIGHT_BRACE>
   <EOF>
   {
      System.out.println("value = "+tok.image);
   }
}

When my input file 1 (input.txt) looks like this:
---------------------------------------

APPLET{
   USERS{
      NAME[SampleText{}1234,567890~`   !@#$%^&*()-_+=\;];
   }
}


When I run it at the command prompt, I get the following error:

cmd_Prompt> java Test < input.txt

Output:-
Grammatical Error..............
Encountered "APPLET{" at line 1, column 1.
Was expecting:
    "APPLET" ...

I know the cause of the error: I have defined a token <LEFT_BRACE: "{">,
but the brace is also allowed inside <VALUE_TOKEN>, so "APPLET{" is read as a single token.

But the parser works fine if the brace "{" on the first line of the input file (input.txt) is moved to the next line, that is, if the input file looks like this:

Input File-2 :-
------------

APPLET
{
   USERS
   {
      NAME
      [
         SampleText{}1234,567890~`   !@#$%^&*()-_+=\;
      ]
      ;
   }
}


If I remove the "{" character from the <VALUE_TOKEN> in the .jj file, then the parser will work OK for my first input file also.
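
The behaviour described above is consistent with the longest-match rule that JavaCC's token manager applies: of all token definitions that match at the current position, the one that consumes the most characters wins, and on a tie the definition that appears first in the .jj file wins. The following is a minimal Java sketch of that rule, not JavaCC's actual implementation. The class LongestMatchDemo, the method firstToken, and the shortened VALUE_TOKEN character class are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LongestMatchDemo {
    // Token kinds in declaration order (hypothetical, shortened from Test.jj).
    private static final Map<String, Pattern> TOKENS = new LinkedHashMap<>();
    static {
        TOKENS.put("LEFT_BRACE", Pattern.compile("\\{"));
        TOKENS.put("SCOPE_TOKEN", Pattern.compile("APPLET"));
        // Simplified stand-in for VALUE_TOKEN: it also accepts '{' after the first character.
        TOKENS.put("VALUE_TOKEN",
                Pattern.compile("[A-Za-z0-9_*][A-Za-z0-9_*{}\\[\\];, -]*"));
    }

    /** Returns "KIND:image" for the token that wins at the start of the input. */
    public static String firstToken(String input) {
        String bestKind = null;
        String bestImage = "";
        for (Map.Entry<String, Pattern> e : TOKENS.entrySet()) {
            Matcher m = e.getValue().matcher(input);
            // A strictly longer match replaces the current best; on a tie
            // the earlier declaration is kept.
            if (m.lookingAt() && m.end() > bestImage.length()) {
                bestKind = e.getKey();
                bestImage = m.group();
            }
        }
        return bestKind + ":" + bestImage;
    }

    public static void main(String[] args) {
        System.out.println(firstToken("APPLET{"));   // prints VALUE_TOKEN:APPLET{
        System.out.println(firstToken("APPLET\n{")); // prints SCOPE_TOKEN:APPLET
    }
}
```

With "APPLET{" the sketch picks VALUE_TOKEN because it covers seven characters against six for SCOPE_TOKEN. With a line break after APPLET both cover six characters, so the earlier SCOPE_TOKEN wins, which matches what happens with the second input file.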
 
My requirement is that <VALUE_TOKEN> should still accept the brace "{" as input, and that the parser should work with the first input file as well.

Cheers,
Poorna.

Commented:
Here I am again. I hope you don't mind me trying to help once more.

Did you get warnings when you ran javacc on this source code? You're right, the problem could be with the character {, because it appears twice. Maybe your grammar is ambiguous; JavaCC usually warns about that.

If so, a lookahead may help. A lookahead is a technique the parser uses to decide which way to go when the grammar allows more than one way to continue. The parser simply looks ahead a few tokens and uses that information to decide which rule to follow.

Please tell us about any warnings with your code. Thanks.

Author

Commented:
No, I am not getting any warning messages.

Commented:
Okay, I will try to compile your example code and see what happens. Could you please post your parser options? Thanks.

Author

Commented:
No. I haven't set any options.

Commented:
Ok, have some patience, please.

Author

Commented:
Okay man........ I will wait... No worries....

Commented:
I am sorry, I couldn't find the cause of this error. (It is taking me too much time and I have to work on a project at the office.)

But I can give you some pointers on where to look.

1. The Java grammar has a similar problem. The dot "." is used both in floating-point literals and in chaining (as in System.out.println();), yet there is no need to separate the dot with white space in chaining (as in System . out . println();). So your problem must be solvable; I just don't have the time to work it out.

2. You should group your grammar in pieces. Like this:

void Input() : {} {
  "APPLET" "{" Users() "}" <EOF>
}

void Users() : {} {
  ... etc.
}

This gives the parser very important information and might solve your problem, because the "{" and the "[" (by the way, "[" has the same problem) are then detected in different states and can be recognized more reliably.

3. If the name has a special structure (and is not just a name), then you should parse it, too. I find it very unusual that you have special characters []{}.- etc. in names.
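
One way to get the effect described in point 2 at the tokenizer level is to recognize value text only between "[" and "]". JavaCC supports this through lexical states: a token block can be prefixed with a state name such as <IN_VALUE>, and a token definition can end with ": IN_VALUE" or ": DEFAULT" to switch state. The Java sketch below imitates that idea by hand; it is not code generated by JavaCC. The class StatefulLexerDemo, the flag inValue, the token kind names KEYWORD and PUNCT, and the assumption that a value never contains "]" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class StatefulLexerDemo {
    /** Splits the input into "KIND:image" strings using two lexer states. */
    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        boolean inValue = false; // false = default state, true = inside [ ... ]
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (inValue) {
                if (c == ']') {
                    // Closing bracket switches back to the default state.
                    tokens.add("RIGHT_SQR_BRACKET:]");
                    inValue = false;
                    i++;
                } else {
                    // Everything up to the next ']' is value text (assumption).
                    int start = i;
                    while (i < input.length() && input.charAt(i) != ']') {
                        i++;
                    }
                    String image = input.substring(start, i).trim();
                    if (!image.isEmpty()) {
                        tokens.add("VALUE_TOKEN:" + image);
                    }
                }
            } else if (Character.isWhitespace(c)) {
                i++; // skipped, like the SKIP section of the grammar
            } else if (c == '[') {
                // Opening bracket switches to the value state.
                tokens.add("LEFT_SQR_BRACKET:[");
                inValue = true;
                i++;
            } else if ("{};".indexOf(c) >= 0) {
                tokens.add("PUNCT:" + c);
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetter(input.charAt(i))) {
                    i++;
                }
                tokens.add("KEYWORD:" + input.substring(start, i));
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at offset " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("APPLET{USERS{NAME[a{b}c];}}")) {
            System.out.println(token);
        }
    }
}
```

Because braces are only treated as value characters while the lexer is between "[" and "]", "APPLET{" splits into a keyword and a brace even without white space, and the value itself may still contain "{" and "}".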

If you have further (small) questions, just ask again and I will see whether I can help.

Commented:
No comment has been added lately, so it's time to clean up this TA.
I will leave a recommendation in the Cleanup topic area that this question is:

- Points for dnoelpp  
 
Please leave any comments here within the next seven days.
 
PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!
 
Venabili
EE Cleanup Volunteer
