Solved

Java Lexical Analyzer

Posted on 2013-11-04
1
714 Views
Last Modified: 2013-11-05
I need assistance in wrapping my head around the development of a lexical analyzer in Java.  It has been a little bit since I have taken my last java course, however am fairly familiar with the language.  In the class I am taking we are working with BNF grammars and using recursive descent parsing to traverse these grammars and verify that a user input string is valid according to the rules of the grammar.  I have a decent pseudo code design for the grammar and have pieced most of it together, however the design of the lexical analyzer is eluding me to read the user input string.  
There are only 3 letters I am concerned with a, b and c.  I am trying to stay with the book here and declare the tokens separate as an enumerated type (i.e. {NONE, LETTER, ERROR, END_OF_FILE}).
My initial question:
Is this basic idea correct for the lexer?
•      Start with character NONE
•      Look at next character
    o          if character = a, b, or c
¿              Add character to lexeme
¿              Look at Next character
              •       Continue until end of user input string
    o          If there is no character
¿              Return lexeme
    o          If character != a, b, or c
¿              Return ERROR token
0
Comment
Question by:Autkast
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
1 Comment
 
LVL 27

Accepted Solution

by:
dpearson earned 500 total points
ID: 39623422
Yes that approach basically sounds right, although you don't actually say what the valid lexemes are.  It sounds like you're accepting strings of the form "[abc]+"?

The only part that looks a little odd is here:

>> If there is no character
>>             Return lexeme

Usually this would be "if the next character is white space, then return lexeme".  But there appears to be no white space in your character set, so maybe this is equivalent to what you have?

E.g. If you were parsing java, and the lexical analyzer was reading an identifier, it would normally terminate at the first character not in the valid set for the identifier:

int abc = 10 ;

while parsing 'abc' you want the lexical analyzer to stop at the ' ' (space) after 'abc' - because it's the first character not allowed within an identifier - and return the lexeme at that point.

Anyway, hope that helps,

Doug
0

Featured Post

Secure Your Active Directory - April 20, 2017

Active Directory plays a critical role in your company’s IT infrastructure and keeping it secure in today’s hacker-infested world is a must.
Microsoft published 300+ pages of guidance, but who has the time, money, and resources to implement? Register now to find an easier way.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Introduction Many of the most common information processing tasks require sorting data sets.  For example, you may want to find the largest or smallest value in a collection.  Or you may want to order the data set in numeric or alphabetical order. …
Introduction This article is the first of three articles that explain why and how the Experts Exchange QA Team does test automation for our web site. This article explains our test automation goals. Then rationale is given for the tools we use to a…
Viewers learn about the “while” loop and how to utilize it correctly in Java. Additionally, viewers begin exploring how to include conditional statements within a while loop and avoid an endless loop. Define While Loop: Basic Example: Explanatio…
The viewer will learn how to implement Singleton Design Pattern in Java.

726 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question