Solved

Java Lexical Analyzer

Posted on 2013-11-04
Last Modified: 2013-11-05
I need help wrapping my head around the development of a lexical analyzer in Java. It has been a while since my last Java course, but I am fairly familiar with the language. In the class I am taking, we are working with BNF grammars and using recursive descent parsing to traverse these grammars and verify that a user input string is valid according to the rules of the grammar. I have a decent pseudocode design for the parser and have pieced most of it together, but the design of the lexical analyzer that reads the user input string is eluding me.
There are only three letters I am concerned with: a, b, and c. I am trying to stay with the book here and declare the tokens separately as an enumerated type (i.e. {NONE, LETTER, ERROR, END_OF_FILE}).
My initial question:
Is this basic idea correct for the lexer?
•      Start with character NONE
•      Look at next character
    o          If character = a, b, or c
               ▪ Add character to lexeme
               ▪ Look at next character
               ▪ Continue until end of user input string
    o          If there is no character
               ▪ Return lexeme
    o          If character != a, b, or c
               ▪ Return ERROR token
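The steps above could be sketched in Java roughly as follows. This is only a minimal sketch under a few assumptions: the enum values come from the question, but the class name `AbcLexer`, its fields, and the `scan` method are hypothetical names chosen for illustration, and an empty input is assumed to map to END_OF_FILE.

```java
// Token kinds as listed in the question.
enum TokenKind { NONE, LETTER, ERROR, END_OF_FILE }

// Hypothetical lexer for strings over the alphabet {a, b, c}.
final class AbcLexer {
    private final String input;
    private int pos = 0;

    // Result of the most recent scan.
    TokenKind kind = TokenKind.NONE;
    String lexeme = "";

    AbcLexer(String input) {
        this.input = input;
    }

    // Accumulates a, b, and c into the lexeme until the input ends.
    // Stops early with ERROR at the first character outside the alphabet.
    TokenKind scan() {
        StringBuilder sb = new StringBuilder();
        kind = TokenKind.NONE;
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == 'a' || c == 'b' || c == 'c') {
                sb.append(c);
                pos++;
                kind = TokenKind.LETTER;
            } else {
                // Assumption: the offending character is reported as the lexeme.
                lexeme = String.valueOf(c);
                kind = TokenKind.ERROR;
                return kind;
            }
        }
        lexeme = sb.toString();
        if (sb.length() == 0) {
            // Assumption: nothing left to read means END_OF_FILE.
            kind = TokenKind.END_OF_FILE;
        }
        return kind;
    }
}
```

A recursive descent parser would typically call `scan()` and then inspect `kind` and `lexeme` to decide which grammar rule applies.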
Question by:Autkast
1 Comment
 

Accepted Solution

by:
dpearson earned 500 total points
ID: 39623422
Yes, that approach basically sounds right, although you don't actually say what the valid lexemes are.  It sounds like you're accepting strings of the form "[abc]+"?

The only part that looks a little odd is here:

>> If there is no character
>>             Return lexeme

Usually this would be "if the next character is white space, then return lexeme".  But there appears to be no white space in your character set, so maybe this is equivalent to what you have?

For example, if you were parsing Java and the lexical analyzer was reading an identifier, it would normally terminate at the first character not in the valid set for the identifier:

int abc = 10 ;

While parsing 'abc', you want the lexical analyzer to stop at the ' ' (space) after 'abc', because it's the first character not allowed within an identifier, and return the lexeme at that point.
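To make that concrete, here is a small sketch of reading an identifier and stopping at the first character that cannot be part of it. The class name `IdentifierScanner` and the method `readIdentifier` are hypothetical; the sketch leans on the standard `Character.isJavaIdentifierStart` and `Character.isJavaIdentifierPart` checks rather than a hand-written character set.

```java
// Hypothetical helper: reads one identifier starting at a given offset.
final class IdentifierScanner {

    // Returns the identifier that begins at 'start', or "" if none begins there.
    static String readIdentifier(String source, int start) {
        if (start >= source.length()
                || !Character.isJavaIdentifierStart(source.charAt(start))) {
            return "";
        }
        int end = start + 1;
        // Advance while the characters are still valid inside an identifier.
        while (end < source.length()
                && Character.isJavaIdentifierPart(source.charAt(end))) {
            end++;
        }
        // 'end' now points at the first character outside the identifier
        // (for example a space), or at the end of the input.
        return source.substring(start, end);
    }
}
```

For the line `int abc = 10 ;`, calling `IdentifierScanner.readIdentifier("int abc = 10 ;", 4)` stops at the space after `abc` and returns `abc`.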

Anyway, hope that helps,

Doug