Solved

Is there any tool to generate a parser for different syntaxes?

Posted on 2004-04-20
Last Modified: 2010-04-15
I want to know whether there is a tool that lets a user specify what kind of syntax he needs (his own choice of keywords) and then generates the grammar for that syntax.

For example, if the user specifies that indentation is required, the tool should incorporate that restriction; if the user answers "no" to indentation, the tool should remove it.
Question by:tvkalyan
21 Comments
 
LVL 10

Expert Comment

by:Mercantilum
Yes, there are the famous Lex and Yacc, nowadays mostly used in the form of their free reimplementations, Flex and Bison.

A bit of theory on bison/flex: http://cs.wwc.edu/~aabyan/PLBook/HTML/Translation.html

Download the tools: http://www.monmouth.com/~wstreett/lex-yacc/lex-yacc.html
 
LVL 46

Expert Comment

by:Sjef Bosman
It depends on what you want. If you want to parse a language like C, it is indeed best to use Yacc and Lex. If it is Pascal-like, using Yacc and Lex is very difficult (the language is left-recursive). It will take someone unfamiliar with Lex/Yacc about 3 months to understand their full complexity, and about 1 month to complete a reasonable example of a parser/translator.

What is it that you want? Is it a complex grammar?
 
LVL 10

Expert Comment

by:Mercantilum
Even if you don't plan to actually use flex and bison to build your language, the two links are worth a look anyway.

By the way, an easy way to work out the syntax of your language is BNF notation:

   http://cui.unige.ch/db-research/Enseignement/analyseinfo/AboutBNF.html

It is a way to formally define your own language syntax; after this step, it is easier to think about the grammar...
 
LVL 46

Expert Comment

by:Sjef Bosman
It might be an idea to set up a separate TA for parser/compiler development. Maybe you can dump your syntax here?
 
LVL 11

Expert Comment

by:avizit
While there are lex and yacc and their clones flex, bison and byacc, they are not the only tools available for generating parsers/scanners.


There are also:

1. ANTLR (formerly PCCTS): http://antlr.org/

2. cppcc: http://cppcc.sourceforge.net/ (if anyone has used this, I would be interested to get feedback on it)

3. The GENTLE Compiler Construction System: http://www.first.gmd.de/gentle/ (the website seems to be down at the moment)

There are more...

/abhijit/

 

Author Comment

by:tvkalyan
This was my question: are there any tools that write the grammar for yacc and the regular expressions for lex, so that a compiler designer doesn't need to write the grammar and the regular expressions himself? The tool should itself create the yacc file and the lex file by taking some simple input from the user.
 
LVL 11

Accepted Solution

by:
avizit earned 125 total points
I might be wrong here, but I don't think such tools exist.

If such a tool existed, its input would have to be just another form of the grammar or of the lexical specification.

So it would merely convert a lex/yacc file from one format to another.

But as I said, I might be wrong here.

/abhijit/
 
LVL 46

Expert Comment

by:Sjef Bosman
What you're asking for is like: I have the assembly language translator, is there another program that will create assembler for me? In this case: yes there is, use C (or lots of other languages). I'd say that Lex/Yacc are not very user-friendly, but any other tool with the same capabilities would be equally user-unfriendly. How are you going to tell this tool what the operators, the keywords, the variables and the syntax are, and how the lot is to be compiled into machine-runnable code? I also think that easier tools than Lex/Yacc are still to be invented.
 

Author Comment

by:tvkalyan
Forget assembly language; let's say I am converting some high-level pseudo-code language into C. The input to the tool (which in turn creates a yacc file and a lex file) can just be in plain English.

For example:
What are the keywords that you require?
How would you want the IF structure to be?

Once the answers to these questions are obtained, the tool should create a yacc file and a lex file. An important observation here is that the pseudo-code (the source language) is simple, and its IF, FOR, DO-WHILE and WHILE structures are similar to those of C (if you think about it, these structures are the same in languages like C, C++, Java, etc.), which means that the grammar for these structures is more or less the same (static).

If such a tool exists, then every user can create a compiler to his own liking and use it. Obviously there are limitations with this tool (the grammar for IF, etc. should be static), but at least one can customize his compiler by just answering some simple questions. (And the limitations that I talk about aren't serious ones, because the structures are the same in many languages, as I said in the previous paragraph.)

How difficult is it to design a tool like that?
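
As a rough illustration of the questionnaire idea, here is a minimal sketch in Python that turns a few answers into a yacc-style rule for the IF statement. The function name `if_rule`, its parameters and the nonterminal names (`if_statement`, `expression`, `statement`) are assumptions made for this example; they do not come from any existing tool.

```python
def if_rule(if_kw="IF", then_kw=None, else_kw="ELSE", end_kw=None):
    """Build a yacc-style rule for an IF statement from the user's answers.

    then_kw and end_kw are optional: pass None if the language has no
    THEN keyword or no closing keyword such as ENDIF.
    """
    head = [if_kw, "'('", "expression", "')'"]
    if then_kw:
        head.append(then_kw)
    tail = [end_kw] if end_kw else []

    # Two alternatives: IF without ELSE, and IF with ELSE.
    alternatives = [
        head + ["statement"] + tail,
        head + ["statement", else_kw, "statement"] + tail,
    ]

    lines = ["if_statement"]
    for i, alt in enumerate(alternatives):
        prefix = "    : " if i == 0 else "    | "
        lines.append(prefix + " ".join(alt))
    lines.append("    ;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Answers: keyword spelled IF, no THEN, block closed by ENDIF.
    print(if_rule(end_kw="ENDIF"))
```

The structure of the rule stays fixed (static); only the token names filled into it change with the answers.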
 
LVL 46

Expert Comment

by:Sjef Bosman
So you want your user to fill in some kind of form or so? I can imagine this would be very handy, but I've never seen such a tool (yet).

I don't know what you know of compiler design (I have to go delve in my memory to find something), but there might be some misunderstanding about what a compiler is and what a parser is. The average compiler consists of several parts:
- the lexical analyzer (the easy part)
- the parser (that's what YACC is for, it accepts a certain grammar)
- error correction mechanism (??)
- code generation (???)
- additional libraries (????)

You aren't there yet with only Lex and Yacc! If you need a complete compiler, then writing the piece for code generation is usually the most time-consuming part. In time you will need lots of libraries, or do you plan to use the standard C libraries? What code are you intending to generate?

If you're serious about this, read some literature, like Principles of Compiler Design (ahem, I have the 1977 version :$) and there must be better ones. Just to sum it all up: a parser is no more than a function that either accepts or refuses a "program" written in the language the parser is meant to accept.
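
To make that last point concrete, here is a minimal sketch in Python of a parser in this narrow sense: a function that only accepts or refuses its input. The toy grammar (`statement : 'x' | 'begin' statement* 'end'`) and the name `accepts` are assumptions chosen for the example.

```python
def accepts(tokens):
    """Return True if tokens form exactly one statement of the toy grammar

        statement : 'x'
                  | 'begin' statement* 'end'
    """
    def statement(pos):
        # Returns the position just after one statement, or None on failure.
        if pos < len(tokens) and tokens[pos] == "x":
            return pos + 1
        if pos < len(tokens) and tokens[pos] == "begin":
            pos += 1
            while pos < len(tokens) and tokens[pos] != "end":
                pos = statement(pos)
                if pos is None:
                    return None
            if pos < len(tokens):  # the current token is 'end'
                return pos + 1
        return None

    return statement(0) == len(tokens)


if __name__ == "__main__":
    print(accepts("begin x begin x end end".split()))
    print(accepts("begin x".split()))
```

Everything else a compiler does (building a tree, reporting errors, generating code) has to be added on top of this yes/no answer.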
 
LVL 11

Expert Comment

by:avizit
>>The input to the tool which (inturn creates a yacc and a lex file) can just be in plain english.

English is not precise enough for such a description (or we can wait for artificial intelligence / natural language processing to catch up).

>>For e.g
>>what are the keywords that you require?

Don't you think "regular expressions" are good enough for this purpose, i.e. to define what keywords we need? That's what you input to lex. I really doubt you can express keywords any more simply than by just writing them down, which is exactly what is done in a lex input file.


>>How would you want the IF structure to be?

From a yacc file for the ANSI C grammar:

 IF '(' expression ')' statement ELSE statement

I guess that's a simple enough description of an IF statement.

---

It should be possible to build, say, a GUI tool where you enter the regular expressions, keywords, etc. and it generates the lex files for you.
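
The back end of such a tool could be very small. A minimal sketch in Python, assuming the GUI hands over a mapping from keyword spelling to token name; the function name `lex_rules`, the `IDENTIFIER` token and the whitespace rule are assumptions for illustration, and the output is only the rules section of a lex file, not a complete scanner.

```python
def lex_rules(keywords):
    """Return the rules section of a lex file for the given keywords.

    keywords maps the spelling the user wants (e.g. "beginif") to the
    token name the grammar uses (e.g. "BEGINIF"). Spellings are assumed
    to be plain words without quotes or backslashes.
    """
    lines = ["%%"]
    for spelling, token in keywords.items():
        # A quoted string in lex matches that text literally.
        lines.append('"%s"\t{ return %s; }' % (spelling, token))
    # Keyword rules come first: when two patterns match the same text,
    # lex prefers the rule listed earlier.
    lines.append("[A-Za-z_][A-Za-z0-9_]*\t{ return IDENTIFIER; }")
    lines.append("[ \\t\\n]+\t;  /* skip whitespace */")
    return "\n".join(lines)


if __name__ == "__main__":
    print(lex_rules({"if": "IF", "else": "ELSE", "endif": "ENDIF"}))
```

Note that the input to this function is still the list of keywords written down, which is the point made above.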


/abhijit/


 
LVL 11

Expert Comment

by:avizit
>> The tool should itself create the yacc file and the lex file by taking some simple input from the user. <<

What input from the user are you talking about?
Like I said, for a keyword it can't get any simpler than just writing it down, which is exactly what its regular expression is, and that is what lex accepts.



/abhijit/
 

Author Comment

by:tvkalyan
To sjef_bosman,

I know that a compiler has a lot more stages than lex and yacc cover. But what I meant was that once the pseudo-code is compiled into C code, the C compiler will take it from there. So I am not bothered about error handling or code generation. As far as the additional libraries are concerned, I am talking about pseudo-code that is simple (say, a language for beginners). If it were as complex as C, then one would be better off learning C.

To avizit,

That's about lex; what about yacc? What about someone who wants a simple compiler but doesn't know how to write the grammar (and there are a lot of other things: once the grammar is written, he has to face the shift/reduce conflicts, etc. of yacc)? I am talking about a tool which writes the grammar (static) without any conflicts and, most importantly, can also accommodate some simple syntax changes (dynamic) so that the user need not rewrite the entire grammar.

Simple syntax changes include :

1. One may choose a separate begin and end for each compound statement and the other may choose the same begin and end for every compound statement, etc.
 
LVL 11

Expert Comment

by:avizit
>>That's about lex , what about yacc. What about someone who wants a simple compiler but who doesn't know how to write the grammar( and there are a lot of other things, once the grammar is written , he has to face the shift/reduce conflicts ,etc of yacc).<<


First of all, the grammar is something which you, the compiler writer, will provide. The tool will only produce a parser according to the grammar that the CW (compiler writer) provides. So the CW _has_ to provide the grammar, and the tool that you are envisioning would only convert the grammar from one format to another. When you say "someone who wants a simple compiler but who doesn't know how to write the grammar", I hope you mean he/she doesn't know the syntax of the yacc files well. If that's what you meant, I guess a simple GUI tool can be made which asks the user general questions and then creates the yacc files (note the CW still has to provide the grammar; no tool/machine can guess the grammar for you, unless you bring AI into the picture).

>>I am talking about a tool which writes the grammar(static) without any conflicts <<

Conflicts are cases where the tool is not able to decide which action to take and the CW really has to help out. Even then, the default action is the correct one most of the time; e.g. for the 'dangling if-then-else' conflict, the default action is the one intended in most languages.
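
For the dangling-else case, yacc's default on a shift/reduce conflict is to shift, which attaches an ELSE to the nearest unmatched IF. A minimal sketch in Python that mimics this choice with a hand-written parser; the toy grammar (conditions omitted, `s` standing for any simple statement) and the name `parse` are assumptions for the example, not yacc output.

```python
def parse(tokens, pos=0):
    """Parse one statement of the toy grammar

        statement : 's'
                  | 'if' statement
                  | 'if' statement 'else' statement

    and return (tree, next_position).
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "s":
        return "s", pos + 1
    if tokens[pos] != "if":
        raise SyntaxError("unexpected token %r" % tokens[pos])

    then_branch, pos = parse(tokens, pos + 1)
    # Like yacc's default "shift": if an 'else' follows, take it now,
    # so it belongs to the innermost 'if' still being parsed.
    if pos < len(tokens) and tokens[pos] == "else":
        else_branch, pos = parse(tokens, pos + 1)
        return ("if", then_branch, else_branch), pos
    return ("if", then_branch, None), pos


if __name__ == "__main__":
    tree, _ = parse("if if s else s".split())
    print(tree)
```

In the printed tree the `else` branch sits inside the inner `if`, which is the reading most languages intend.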


>>most importantly can also accomodate some simple syntax changes(dynamic) so that the user need not rewrite the entire grammar.<<

For simple changes you don't need to modify the whole grammar; you just have to edit the relevant portions of the yacc input file.

(Maybe we are on different wavelengths here :) but do tell me if I am way off from what you intended.)

/abhijit/

 
LVL 11

Expert Comment

by:avizit
>>Simple syntax changes include :

1. One may choose a separate begin and end for each compound statement and the other may choose the same begin and end for every compound statement, etc.
<<
...
compound_statement
      : '{' '}'
      | '{' statement_list '}'
      | '{' declaration_list '}'
      | '{' declaration_list statement_list '}'
...
The above is part of the ANSI C grammar I got from a website.
For example, to change the begin and end from '{' and '}' to "BEGIN" and "END", you change

the grammar to
compound_statement
      : BEGIN END
      | BEGIN statement_list END
      | BEGIN declaration_list END
      | BEGIN declaration_list statement_list END

You don't need to rewrite the whole grammar to do that.
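
If the edit has to be repeated for different choices, it can be scripted. A minimal sketch in Python that prints the `compound_statement` rule for any pair of delimiters; the function name `compound_statement_rule` is an assumption for this example, and the four alternatives are the ones from the grammar fragment above.

```python
def compound_statement_rule(begin="'{'", end="'}'"):
    """Return the compound_statement rule with the given delimiters.

    begin and end are written exactly as they should appear in the yacc
    file: a quoted character such as "'{'" or a token name such as "BEGIN".
    """
    bodies = [
        "",
        "statement_list",
        "declaration_list",
        "declaration_list statement_list",
    ]
    lines = ["compound_statement"]
    for i, body in enumerate(bodies):
        prefix = "      : " if i == 0 else "      | "
        # Skip the empty body so no double space is produced.
        symbols = " ".join(s for s in (begin, body, end) if s)
        lines.append(prefix + symbols)
    return "\n".join(lines)


if __name__ == "__main__":
    print(compound_statement_rule("BEGIN", "END"))
```

When token names such as BEGIN and END are used, they also have to be declared with `%token` in the yacc file and returned by the scanner.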
/abhijit/


 
LVL 46

Expert Comment

by:Sjef Bosman
What do you want to hear? That the people who wrote Lex and Yacc were fools? Because they wrote programs that are too difficult to handle? Isn't that maybe because writing a compiler (whatever the output of the compiler is) for a proper language is a complicated job, even when you have tools like Yex and Lacc?

Read Principles of Compiler Design, by Aho and Ullman, and stop pushing! Come back when you read the book. I'll be happy to answer your questions then, but for now, I've had it. Over and Out.
 

Author Comment

by:tvkalyan
abhijit,

>>When you say " someone who wants a simple compiler but who doesn't know how to write the grammar"  I hope you mean he/she doesn't know the syntax of the yacc files well. If that's what you meant , I guess a simple GUI tool can be made which will ask user general questions and hen create the yacc files ( note the CW still has to provide the grammar , no tools/machine can guess the grammar for you , unless you bring AI into the picture)<<

I did mean "someone who doesn't know the syntax of the yacc files". I think we are on the same wavelength here, with one addition though: the different begins and ends were just one example. If the user chooses one set of options first and a different set of options next, there might be more changes to the grammar, and I guess one might then appreciate the tool that I am talking about. What do you think? Also, what I meant by different begins and ends was that the user chooses to have

beginif;endif for "if"
beginfor;endfor for "for"
beginwhile;endwhile for "while", etc.

or he might say
just endif for "if", since the if_statement signals the beginning of the if_statement anyway.

I know that the above changes are simple too, but the grammar that you wrote is going to change. I was talking about many such changes. You are right when you say that the relevant portions of the yacc file can be edited, and I guess that is what I am talking about: a tool that does this, so that the user doesn't need to make the changes by hand every time different options are chosen.
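
A minimal sketch in Python of what that could look like for the options just listed. The names `HEADERS` and `block_rules`, the shape of the rule headers and the nonterminal names are assumptions for illustration only; a real grammar would need more alternatives (ELSE parts, empty bodies and so on).

```python
# Fixed (static) part of each construct, written in yacc notation.
HEADERS = {
    "if": "IF '(' expression ')'",
    "for": "FOR '(' expression ';' expression ';' expression ')'",
    "while": "WHILE '(' expression ')'",
}


def block_rules(delimiters):
    """Build one yacc-style rule per construct.

    delimiters maps a construct name to a (begin, end) pair of token
    names. begin may be None when the user wants only a closing keyword,
    e.g. {"if": (None, "ENDIF")}.
    """
    rules = []
    for construct, (begin, end) in delimiters.items():
        parts = [HEADERS[construct]]
        if begin:
            parts.append(begin)
        parts += ["statement_list", end]
        rules.append(
            "%s_statement\n    : %s\n    ;" % (construct, " ".join(parts))
        )
    return "\n\n".join(rules)


if __name__ == "__main__":
    print(block_rules({
        "if": (None, "ENDIF"),
        "while": ("BEGINWHILE", "ENDWHILE"),
    }))
```

Each new set of options is then a new dictionary rather than a new round of hand edits.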

To sjef bosman

I don't know what your problem is. The tool that I am talking about uses Lex and Yacc. How can I say the people who wrote Lex and Yacc are fools? I have been noticing you always say: read "Compiler Design". For your information, I read the book, and I am pretty sure I "don't" have to go delve in my memory to find something. I guess you are one of those geeks who can only discuss what was mentioned in a particular book. If the book written by Aho and Ullman said "A tool like that is possible", then I guess you would be the first person to agree with them. Good for you. And don't get me wrong here (this is to all the other guys, not "sjef bosman", because I know he'll misinterpret this too): I am not against the book. In fact, I think it is one of the best books on compiler design.

 
LVL 11

Expert Comment

by:avizit
Okay, a person who doesn't know the syntax of yacc files still has to provide the grammar, and that would just mean converting the grammar from one format to another. And I believe the format of yacc is simple enough to make it pointless to go for _yet another format_.

Also, as I said, a very simple GUI-based tool can be written to point out (even suggest) locations where changes might be required, similar to the ones you mention, like different 'begins' and 'ends'. This could be based on the grep command too.
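
A minimal sketch in Python of the grep-like part, assuming the grammar is available as a string and the tokens of interest are known; the function name `find_token_lines` is an assumption for this example.

```python
import re


def find_token_lines(grammar_text, tokens):
    """Return (line_number, line) pairs for lines mentioning any token.

    tokens are matched literally, e.g. ["'{'", "'}'"] or ["BEGIN", "END"],
    and the list is assumed to be non-empty.
    """
    pattern = re.compile("|".join(re.escape(t) for t in tokens))
    return [
        (number, line)
        for number, line in enumerate(grammar_text.splitlines(), 1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    grammar = (
        "compound_statement\n"
        "      : '{' '}'\n"
        "      | '{' statement_list '}'\n"
    )
    for number, line in find_token_lines(grammar, ["'{'", "'}'"]):
        print(number, line)
```

A GUI could show these lines to the user as the places to edit when the begin and end markers change.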


/abhijit/

 
LVL 46

Expert Comment

by:Sjef Bosman
I don't have a problem, I just couldn't grasp the extent of your project. I think that it is impossible to use Yacc. Changing the language rules will lead to different semantics and a different compiler. I can imagine however that you have a GUI tool for lexical changes, like Pascal-addicts used to do, but then with the help of the C-preprocessor:

#define IF if(
#define THEN ) {
#define ELSE } else {
#define ENDIF }

etc.

In this example, the ENDIF doesn't exist in Pascal but cannot be left out, so a user will never be completely free when making changes in the lexical tokens.

What it comes down to is to have a GUI Lex-tool that can generate a "tokenizer", and generates C as some sort of C-preprocessor. This (I think) is doable, but implementing a more or less complete language where production rules can be altered is too difficult, at least for me to understand. Mind you, 20 years ago, a compiler-compiler was a revolution, maybe you're on the verge of another one.
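
The purely lexical variant could be as simple as generating the #define lines from the user's choices. A minimal sketch in Python, assuming the tool receives a mapping from the user's keyword to the C text it stands for; the function name `define_lines` is an assumption, and the macro bodies are the ones from the example above.

```python
def define_lines(mapping):
    """Return C preprocessor definitions, one per user keyword.

    mapping maps the keyword the user wants to write (e.g. "THEN") to
    the C text it should expand to (e.g. ") {").
    """
    return "\n".join(
        "#define %s %s" % (keyword, c_text)
        for keyword, c_text in mapping.items()
    )


if __name__ == "__main__":
    print(define_lines({
        "IF": "if(",
        "THEN": ") {",
        "ELSE": "} else {",
        "ENDIF": "}",
    }))
```

This only renames tokens; it cannot change the production rules, which is the limitation described above.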

I hope that I, despite my rather blunt words, have added some value to the discussion.