Solved

StreamTokenizer etc

Posted on 1998-06-23
Last Modified: 2010-04-16
I have a couple of questions about a small piece of code (one method, actually).
I am reading an ASCII file and putting it into a 2D Vector (a Vector of Vectors).
A couple of problems:
1) I can't get StreamTokenizer to return TT_EOL at the end of a line. The EOL is where I switch which Vector I'm adding elements to. For now I'm just using the DOS edit editor to create the test file; a program will write the ASCII file in the future. Since the end-of-line sequence is platform dependent, what exactly do I need to pass for Java to recognize TT_EOL? Or is something in my code messed up?
2) I'm not sure I understand how StreamTokenizer.wordChars() works. I know it sets which characters are included in tokens, but how can I allow any number of spaces between the delimiters? If I have a line like
blahblah,blah    blah,blah
I would like it to recognize 3 tokens (not 4, as it currently does).
3) One last thing, just for conversation's sake really: StreamTokenizer.nextToken() does not return an empty token (for example, in a comma-delimited line, nothing is returned for the empty field between two consecutive commas), so I've had to use a "null" string as a placeholder so that Java recognizes the token (that is what I'm handling in the TT_WORD case). I've been told that this mimics strtok() in C. I'm curious why the creators chose to do it that way.

I have this method in a public class called FileData, with a main method for testing (which creates an instance of FileData and calls the method).


the code:

public void getSelData(String filename) throws FileNotFoundException {
    try {
        File f = new File(filename);
        FileReader file = new FileReader(f);
        StreamTokenizer st = new StreamTokenizer(file);
        st.wordChars(' ', ' ');
        st.whitespaceChars(',', ',');
        choices = new Vector();
        Vector currentVect = new Vector();
        choices.addElement(currentVect);
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            switch (st.ttype) {
                case StreamTokenizer.TT_EOL:
                    System.out.println("gets here");
                    currentVect = new Vector();
                    choices.addElement(currentVect);
                    break;
                case StreamTokenizer.TT_NUMBER:
                    currentVect.addElement(Double.toString(st.nval));
                    break;
                case StreamTokenizer.TT_WORD:
                    if (st.sval.equals("null"))
                        currentVect.addElement("");
                    else
                        currentVect.addElement(st.sval);
                    break;
                default:  // it's a char
                    currentVect.addElement(String.valueOf((char) st.ttype));
            }
        }
    }
    catch (FileNotFoundException e) {
        System.out.println("ERROR: Can't Find File");
        System.exit(0);
    }
    catch (IOException e) {
        System.out.println("ERROR in StreamTokenizer");
        System.exit(0);
    }
    for (int i = 0; i < choices.size(); i++) {
        Vector current = (Vector) choices.elementAt(i);
        for (int j = 0; j < current.size(); j++) {
            System.out.println(i + "[" + j + "] : " + current.elementAt(j));
        }
    }
}

Question 1 is really the major problem so if someone can at least answer that one it would be great.
Question by:mbunkows

Expert Comment

by:aziz061097
ID: 1223570
TT_EOL is not returned by nextToken() unless eolIsSignificant(true) has been called.
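
To illustrate the point, here is a minimal sketch (not the asker's code) that tokenizes the same text with and without eolIsSignificant(true). The class name EolDemo, the method rows, and the sample text are made up for this example. The sample uses DOS-style "\r\n" line endings, which StreamTokenizer reports as a single TT_EOL, so nothing platform specific needs to be passed in.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class EolDemo {
    // Collects word tokens into rows, starting a new row whenever TT_EOL is returned.
    // (Hypothetical helper for illustration; default tokenizer syntax otherwise.)
    static List<List<String>> rows(String text, boolean eolSignificant) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.eolIsSignificant(eolSignificant);
        List<List<String>> rows = new ArrayList<>();
        List<String> current = new ArrayList<>();
        rows.add(current);
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_EOL) {
                    current = new ArrayList<>();
                    rows.add(current);
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    current.add(st.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "alpha beta\r\ngamma\r\n";
        // Without the flag, line ends are plain whitespace: everything lands in one row.
        System.out.println(rows(text, false));
        // With the flag, each line end yields TT_EOL and a new row is started.
        System.out.println(rows(text, true));
    }
}
```

Note that, as in the code from the question, a file ending in a line terminator leaves one empty row at the end, because the last TT_EOL still starts a new row.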

Accepted Solution

by:
aziz061097 earned 80 total points
ID: 1223571

wordChars(int low, int hi)

This method causes this StreamTokenizer to treat characters in the specified range as characters that are part of a word token, or, in other words, consider the characters to be alphabetic. A word token consists of a sequence of characters that begins with an alphabetic character and is followed by zero or more numeric or alphabetic characters.
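
The definition above can be checked with a small sketch that uses the same setup as the question (space added as a word character, comma as whitespace). The class name WordCharsDemo, the method tokens, and the sample line are invented for the example. It suggests that a field starting with a letter keeps its embedded spaces, while a field starting with a digit is split into a number token followed by a word token.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class WordCharsDemo {
    // Returns a readable description of every token in the text.
    // (Hypothetical helper; the syntax calls mirror the ones in the question.)
    static List<String> tokens(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.wordChars(' ', ' ');        // a space may now continue a word
        st.whitespaceChars(',', ',');  // a comma separates tokens
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER " + st.nval);
                        break;
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD " + st.sval);
                        break;
                    default:
                        out.add("CHAR " + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // The first field starts with a digit, the second with a letter.
        for (String t : tokens("10001  First,blah    blah")) {
            System.out.println(t);
        }
    }
}
```

The digit-led field comes back as a TT_NUMBER followed by a TT_WORD because number parsing is on by default and a word token has to begin with an alphabetic character.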

Author Comment

by:mbunkows
ID: 1223572
ARRGHH
Here it is, right in my documentation: eolIsSignificant(). Thanks for the pointer!

About wordChars(): it makes sense that it takes a range, and I suppose multiple calls to wordChars() could set multiple ranges (without overwriting?).
But what if someone wanted to include one or more spaces in a word (maybe even a tab)?
In my testing it works great if I include one space, but if there is more than one space it splits the word into separate tokens.
For example (the two _ are actually two <space>s):
blah_ _ blah,blah,blah_ _blah,
The method sees 5 tokens instead of the 3 I would expect.
Any thoughts on number 3? Am I the only one who thinks this is bizarre functionality?
Or am I thinking about it incorrectly?

aziz, you've already earned the points; #1 was really my main question. The other two I can work around. I'll wait a day and then grade an A. You've already been a HUGE help!


Author Comment

by:mbunkows
ID: 1223573
Ack! I see my problem for #2, and you hit it right on the head!
I had a file with tokens starting with digits (not alphabetic), and those just so happened to be the tokens that had multiple spaces in them.
How you figured that out from the information I gave you I'll never know, but thanks again!

I'll grade now because #3 was really just a curiosity.
You've been tons of help, thanks.

Expert Comment

by:aziz061097
ID: 1223574

What I understand from your question is that blah_ _ blah,blah,blah_ _blah should give 3 tokens instead of 5. This will occur only if _ (space) is not a token character.

Use the method "ordinaryChar(int ch)"

to set space as an ordinary character; then the tokens are divided using the comma and not the space, and you will get 3 tokens. Does this help? Or is your question more complex, e.g. a single space should be a token character but a double space should not?
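
One caveat worth checking before relying on this: the StreamTokenizer documentation says an ordinary character is returned as a single-character token of its own, so marking space as ordinary may add tokens rather than merge them. The sketch below compares the two settings on a sample line; the class name OrdinaryCharDemo, the method tokens, and the flag spaceAsWord are made up for the example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class OrdinaryCharDemo {
    // Tokenizes comma-separated text, treating space either as a word
    // character or as an ordinary character. (Hypothetical helper.)
    static List<String> tokens(String text, boolean spaceAsWord) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.whitespaceChars(',', ',');
        if (spaceAsWord) {
            st.wordChars(' ', ' ');   // the setting used in the question
        } else {
            st.ordinaryChar(' ');     // the setting suggested in this comment
        }
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(Double.toString(st.nval));
                } else {
                    // Ordinary characters arrive one at a time, with ttype set to the character.
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "blah  blah,blah";
        System.out.println(tokens(line, true).size() + " tokens with wordChars(' ', ' ')");
        System.out.println(tokens(line, false).size() + " tokens with ordinaryChar(' ')");
    }
}
```

In this sketch the wordChars setting keeps "blah  blah" together as one token, while the ordinaryChar setting returns each space as a separate token.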

Expert Comment

by:aziz061097
ID: 1223575
Please don't take offence, but you had promised 87 points. I remembered it because it was a very odd number. But you gave only 8 points. Was it a mistake? Anyway, I don't mind.

mbunkows
                                                                              Date: Tuesday, June 23 1998 - 03:26PM PDT
    Status: Answered.This question is locked until mbunkows evaluates the answer.
    Points: 87 Points

Author Comment

by:mbunkows
ID: 1223576
It actually works the way I was expecting it to.
My only problem was that I started the "word" with a digit,
and because of that it was separating the word into two tokens (one number and one word).
I had a file like:
10001  First,10002  Second,

So when you told me how StreamTokenizer defines a "TT_WORD", that solved the problem
(or at least told me the source of it).
Now I just need to combine the two tokens into one string (while leaving other numerical data alone), which I'm not sure how to do. I may use StringTokenizer and change the numeric tokens to Doubles, catching NumberFormatException for the Strings.
(Make sense?)
But at least I know how StreamTokenizer works now.
If you know of a better way (preferably using StreamTokenizer), I would really appreciate a comment if you have a moment.
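
One possible way to do this while staying with StreamTokenizer, offered as a sketch rather than a definitive answer: turn off the built-in number parsing for digits, '.' and '-' by making them ordinary, then register them as word characters, so a field such as "10001  First" arrives as a single TT_WORD. Purely numeric fields can then be converted with the NumberFormatException approach described above. The class name DigitWordsDemo and the method fields are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class DigitWordsDemo {
    // Returns each comma-separated field as a Double when it parses as a
    // number, otherwise as a String. (Hypothetical helper for illustration.)
    static List<Object> fields(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        // Clear the numeric attribute set up by the constructor...
        st.ordinaryChars('0', '9');
        st.ordinaryChar('.');
        st.ordinaryChar('-');
        // ...and let the same characters start or continue a word instead.
        st.wordChars('0', '9');
        st.wordChars('.', '.');
        st.wordChars('-', '-');
        st.wordChars(' ', ' ');
        st.whitespaceChars(',', ',');
        List<Object> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype != StreamTokenizer.TT_WORD) {
                    continue;  // other token kinds are ignored in this sketch
                }
                String field = st.sval.trim();
                try {
                    out.add(Double.valueOf(field));
                } catch (NumberFormatException e) {
                    out.add(field);  // not a number, keep the text as-is
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("10001  First,42.5,10002  Second,"));
    }
}
```

One thing to keep in mind with this approach: Double.valueOf also accepts text such as "NaN" or "Infinity", so fields with those exact values would be converted to numbers as well.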


Author Comment

by:mbunkows
ID: 1223577
ACK!
87 got taken off my account!
Hang on a second, let me see if I can figure this out.


Author Comment

by:mbunkows
ID: 1223578
I asked the question for a weird number because I had 187 points left and wanted to get
back to an even number (so I guess I decided to give someone else an odd number, hehehe).

Well, you earned at LEAST 87, so I'll post another question for you.
I think I had a points error go in my favor before, so I guess I'm gonna be even.
Sorry about the mix-up,
and thanks for telling me. I would have never known.
