Solved

Java Lexical Analyzer

Posted on 2013-11-04
Last Modified: 2013-11-05
I need help wrapping my head around the development of a lexical analyzer in Java. It has been a little while since I took my last Java course, but I am fairly familiar with the language. In the class I am taking, we are working with BNF grammars and using recursive descent parsing to traverse these grammars and verify that a user input string is valid according to the rules of the grammar. I have a decent pseudocode design for the grammar and have pieced most of it together, but the design of the lexical analyzer that reads the user input string is eluding me.
There are only 3 letters I am concerned with: a, b, and c. I am trying to stay with the book here and declare the tokens separately as an enumerated type (i.e. {NONE, LETTER, ERROR, END_OF_FILE}).
My initial question:
Is this basic idea correct for the lexer?
• Start with character NONE
• Look at next character
    o If character = a, b, or c
        - Add character to lexeme
        - Look at next character
        - Continue until end of user input string
    o If there is no character
        - Return lexeme
    o If character != a, b, or c
        - Return ERROR token
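
The steps above can be sketched in Java roughly as follows. This is only one possible reading of the outline, not the book's implementation: the class name AbcLexer and the methods nextToken() and lexeme() are made up for illustration, and it assumes a lexeme is a maximal run of a, b, and c, with any other character reported as a one-character ERROR token.

```java
/** Illustrative lexer for the alphabet {a, b, c}; all names here are hypothetical. */
class AbcLexer {
    // Token kinds taken from the enumerated type in the question.
    enum Token { NONE, LETTER, ERROR, END_OF_FILE }

    private final String input;
    private int pos = 0;        // index of the next unread character
    private String lexeme = ""; // text of the most recently returned token

    AbcLexer(String input) {
        this.input = input;
    }

    /** Text matched by the last call to nextToken(). */
    String lexeme() {
        return lexeme;
    }

    /** Reads and returns the next token, updating the current lexeme. */
    Token nextToken() {
        lexeme = "";
        if (pos >= input.length()) {
            return Token.END_OF_FILE;          // there is no character left
        }
        char c = input.charAt(pos);
        if (!isLetter(c)) {
            lexeme = String.valueOf(c);        // keep the offending character
            pos++;
            return Token.ERROR;
        }
        int start = pos;
        // Assumption: keep adding characters while they are a, b, or c.
        while (pos < input.length() && isLetter(input.charAt(pos))) {
            pos++;
        }
        lexeme = input.substring(start, pos);
        return Token.LETTER;
    }

    private static boolean isLetter(char c) {
        return c == 'a' || c == 'b' || c == 'c';
    }

    public static void main(String[] args) {
        AbcLexer lexer = new AbcLexer("abxc");
        Token t;
        while ((t = lexer.nextToken()) != Token.END_OF_FILE) {
            System.out.println(t + " \"" + lexer.lexeme() + "\"");
        }
    }
}
```

A recursive descent parser would call nextToken() whenever it needs the next symbol and stop when it sees END_OF_FILE or ERROR. If the grammar treats each letter as its own terminal, the inner loop could be dropped so that every call returns a single character.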
Question by:Autkast

Accepted Solution

by:
dpearson earned 500 total points
ID: 39623422
Yes, that approach basically sounds right, although you don't actually say what the valid lexemes are.  It sounds like you're accepting strings of the form "[abc]+"?

The only part that looks a little odd is here:

>> If there is no character
>>             Return lexeme

Usually this would be "if the next character is whitespace, then return lexeme".  But there appears to be no whitespace in your character set, so maybe this is equivalent to what you have?

For example, if you were parsing Java and the lexical analyzer was reading an identifier, it would normally terminate at the first character not in the valid set for the identifier:

int abc = 10 ;

While parsing 'abc', you want the lexical analyzer to stop at the ' ' (space) after 'abc', because it's the first character not allowed within an identifier, and return the lexeme at that point.
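
As a rough illustration of that stopping rule, here is a small sketch that scans one identifier and ends at the first character that cannot be part of it. The class IdentifierScanner and the method scanIdentifier are hypothetical names for this example; Character.isJavaIdentifierStart and Character.isJavaIdentifierPart are standard JDK methods.

```java
/** Illustrative helper; the class and method names are made up for this example. */
class IdentifierScanner {

    /**
     * Returns the identifier that begins at index start in src,
     * or an empty string if no identifier starts there.
     */
    static String scanIdentifier(String src, int start) {
        if (start >= src.length() || !Character.isJavaIdentifierStart(src.charAt(start))) {
            return "";
        }
        int end = start + 1;
        // Stop at the first character not allowed within an identifier.
        while (end < src.length() && Character.isJavaIdentifierPart(src.charAt(end))) {
            end++;
        }
        return src.substring(start, end);
    }

    public static void main(String[] args) {
        String line = "int abc = 10 ;";
        // Index 4 is the 'a' of "abc"; scanning ends at the space that follows.
        System.out.println(scanIdentifier(line, 4));
    }
}
```

The scanner never needs to know that the terminating character is a space. Any character outside the valid set ends the lexeme, which is why the same loop also works for an alphabet with no whitespace at all.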

Anyway, hope that helps,

Doug