StreamTokenizer etc

mbunkows asked
Last Modified: 2010-04-16
I have a couple of questions regarding a small piece of code (one method, actually).
I am reading an ASCII file and putting it into a 2D Vector (a Vector of Vectors).
A couple of problems:
1) I can't get StreamTokenizer to return TT_EOL at the end of a line. The EOL is where I switch which Vector I'm adding elements to. I'm just using the DOS edit editor to create the test file, and a program will write the ASCII file in the future. Since the end of line is platform dependent, what exactly do I need to pass for Java to recognize TT_EOL? Or is something in my code messed up?
2) I'm not sure I understand how StreamTokenizer.wordChars() works. I know it sets which characters are included in tokens, but how can I allow any number of spaces between the delimiters? If I have a line like
blahblah,blah    blah,blah
I would like it to recognize 3 tokens (not 4, as it currently does).
3) One last thing, just for conversation's sake really: StreamTokenizer.nextToken() does not recognize an empty token as a token. For example, in a comma-delimited line, ,, (comma-comma) does not produce a token, so I've had to write a "null" string in the file so that Java sees a token (which is what I'm handling in the TT_WORD case). I've been told that this mimics strtok() in C. I'm curious why the creators chose to do it that way.

I have this method in a public class called FileData, with a main method for testing (which creates an instance of FileData and calls the method).


the code:

public void getSelData(String filename) throws FileNotFoundException  {
    try  {
        File f = new File(filename);
        FileReader file = new FileReader(f);
        StreamTokenizer st = new StreamTokenizer(file);
        st.wordChars(' ', ' ');
        st.whitespaceChars(',', ',');
        choices = new Vector();
        Vector currentVect = new Vector();
        choices.addElement(currentVect);
        while (st.nextToken() != StreamTokenizer.TT_EOF)  {
            switch (st.ttype)  {
                case StreamTokenizer.TT_EOL:
                    System.out.println("gets here");
                    currentVect = new Vector();
                    choices.addElement(currentVect);
                    break;
                case StreamTokenizer.TT_NUMBER:
                    currentVect.addElement(Double.toString(st.nval));
                    break;
                case StreamTokenizer.TT_WORD:
                    if (st.sval.equals("null"))
                        currentVect.addElement("");
                    else currentVect.addElement(st.sval);
                    break;
                default:  // it's a char
                    currentVect.addElement(String.valueOf((char) st.ttype));
            }
        }
    }
    catch (FileNotFoundException e)  {
        System.out.println("ERROR: Can't Find File");
        System.exit(0);
    }
    catch (IOException e)  {
        System.out.println("ERROR in StreamTokenizer");
        System.exit(0);
    }
    for (int i = 0; i < choices.size(); i++)  {
        Vector current = (Vector) choices.elementAt(i);
        for (int j = 0; j < current.size(); j++)  {
            System.out.println(i + "[" + j + "] : " + current.elementAt(j));
        }
    }
}

Question 1 is really the major problem so if someone can at least answer that one it would be great.

TT_EOL is not returned by nextToken() unless eolIsSignificant(true) has been called.
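To illustrate the point about eolIsSignificant(true), here is a minimal sketch, not the original poster's code. The class name EolDemo and the method parse are made up for the example, and it reads from a String instead of a file so it can be run on its own. By default the tokenizer treats line terminators as plain whitespace, which is why the TT_EOL case is never reached without the call.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original FileData code.
class EolDemo {
    // Splits comma-delimited text into rows, closing a row at each TT_EOL.
    static List<List<String>> parse(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.whitespaceChars(',', ',');  // comma acts as a delimiter
        st.eolIsSignificant(true);     // without this, line ends are skipped as whitespace

        List<List<String>> rows = new ArrayList<>();
        List<String> current = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_EOL:
                        rows.add(current);
                        current = new ArrayList<>();
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        current.add(Double.toString(st.nval));
                        break;
                    case StreamTokenizer.TT_WORD:
                        current.add(st.sval);
                        break;
                    default:  // any other single character
                        current.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!current.isEmpty()) rows.add(current);  // last line had no terminator
        return rows;
    }

    public static void main(String[] args) {
        // DOS-style (\r\n) and Unix-style (\n) line ends each give one TT_EOL.
        System.out.println(parse("a,b\r\nc,d\n"));  // prints [[a, b], [c, d]]
    }
}
```

Nothing platform specific has to be passed in: the tokenizer reports a carriage return, a line feed, or the pair of them as a single TT_EOL once the flag is set.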

Author

Commented:
ARRGHH
Here it is, right in my documentation: eolIsSignificant(). Thanks for the pointer!

About wordChars(): it makes sense that it takes a range, and I suppose multiple calls to wordChars() could add multiple ranges (without overwriting?).
But what if someone wanted to include one or more spaces in a word (maybe even a tab)?
In my testing it works great if I include one space, but if there is more than one space it splits the word into separate tokens.
For example (the two _ are actually two <space>s):
blah_ _ blah,blah,blah_ _blah,
The method sees 5 tokens instead of the 3 I would expect.
Any thoughts on number 3? Am I the only one who thinks this is bizarre functionality?
Or am I thinking about it incorrectly?
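On number 3, one possible workaround that stays within StreamTokenizer is sketched below. It is an assumption about what would suit this file format, not something proposed in the thread: instead of making the comma whitespace (where runs of delimiters are skipped together), the comma is made an ordinary character so each one comes back as its own token, and an empty field is recorded whenever two commas arrive back to back. EmptyFieldDemo and fields are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class illustrating empty-field detection.
class EmptyFieldDemo {
    // Returns every comma-separated field of one line, including empty ones.
    static List<String> fields(String line) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(line));
        st.ordinaryChar(',');  // each comma is returned as a token with ttype == ','

        List<String> out = new ArrayList<>();
        String pending = "";   // field content seen since the last comma
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == ',') {
                    out.add(pending);  // a comma closes the current field
                    pending = "";
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    pending = st.sval;
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    pending = Double.toString(st.nval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        out.add(pending);  // the field after the final comma
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("a,,b"));  // prints [a, , b]
    }
}
```

With this approach the "null" placeholder string in the data file would no longer be needed, at the cost of handling the delimiter by hand.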

aziz, you've already earned the points; #1 was really my main question. The other two I can work around. I'll wait a day and then grade an A. You've already been a HUGE help!

Author

Commented:
Ack! I see my problem for #2,
and you hit it right on the head!
I had a file with tokens starting with digits (not letters), and those just so happened to be the tokens that had multiple spaces in them.
How you figured that out from the information I gave you I'll never know, but thanks again!

I'll grade now because #3 was really just a curiosity.
You've been tons of help, thanks.

What I understand from your question is that blah_ _ blah,blah,blah_ _blah should give 3 tokens instead of 5. This will happen only if _ (space) is not a token character.

 Use the method "ordinaryChar(int ch)"

 to set space as an ordinary character; then the tokens are divided by the comma and not the space, and you will get 3 tokens. Does this help? Or is your question more complex, e.g. a single space should be a token character but a double space should not?
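For comparison, here is a small sketch that runs the example line from the question through three settings for the space character: the default (whitespace), wordChars(' ', ' ') as in the posted code, and ordinaryChar(' '). SpaceDemo, tokens and the mode letters are made up for this example. As far as the documented behavior goes, an ordinary character is returned as a single-character token of its own, so the three-token result comes from the wordChars setting.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class comparing space-handling modes.
class SpaceDemo {
    // mode 'w': space is a word character; mode 'o': space is an ordinary
    // character; any other mode: default syntax (space is whitespace).
    static List<String> tokens(String line, char mode) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(line));
        st.whitespaceChars(',', ',');
        if (mode == 'w') st.wordChars(' ', ' ');
        if (mode == 'o') st.ordinaryChar(' ');

        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(Double.toString(st.nval));
                } else {
                    out.add(String.valueOf((char) st.ttype));  // single-character token
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "blahblah,blah    blah,blah";
        System.out.println(tokens(line, 'd').size());  // 4: spaces split the middle field
        System.out.println(tokens(line, 'w'));         // 3 tokens, spaces kept inside the word
        System.out.println(tokens(line, 'o').size());  // 8: each space is its own token
    }
}
```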
Please don't take offence, but you had promised 87 points. I remembered it because it was a very odd number. But you gave only 8 points. Was it a mistake? Anyway, I don't mind.

mbunkows
    Date: Tuesday, June 23 1998 - 03:26PM PDT
    Status: Answered. This question is locked until mbunkows evaluates the answer.
    Points: 87 Points

Author

Commented:
It actually works the way I was expecting it to.
My only problem was that I started the "word" with a digit,
and because of that it was split into two tokens (one number and one word).
I had a file like:
10001  First,10002  Second,

So when you told me how StreamTokenizer defines a TT_WORD, that solved the problem
(or at least told me the source of it).
Now I just need to combine the two tokens into one string (while leaving other numeric data alone), which I'm not sure how to do. I may use StringTokenizer and convert the numeric tokens to Doubles, catching NumberFormatException for the Strings
(make sense?).
But at least I know how StreamTokenizer works now.
If you know of a better way (preferably using StreamTokenizer), I would really appreciate a comment if you have a moment.
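One way this could be done while staying with StreamTokenizer is sketched below, under the assumption that every field is either entirely numeric or should be kept as text. The idea is to call resetSyntax() so the built-in number parsing is switched off, declare everything from space upward as word characters, and then convert fields afterwards by catching NumberFormatException, as suggested above. MixedFieldDemo and fields are hypothetical names, and the sample data extends the line from the post with two made-up numeric fields.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: fields starting with digits stay in one piece.
class MixedFieldDemo {
    // Returns each field as a Double when it parses as a number, else as a String.
    static List<Object> fields(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();                // clears the defaults, including number parsing
        st.wordChars(' ', 255);          // digits, letters and spaces are all word text
        st.whitespaceChars(',', ',');    // comma separates fields
        st.whitespaceChars('\n', '\n');  // line terminators separate fields too
        st.whitespaceChars('\r', '\r');

        List<Object> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype != StreamTokenizer.TT_WORD) continue;  // skip stray control chars
                String field = st.sval.trim();
                try {
                    out.add(Double.valueOf(field));  // purely numeric field
                } catch (NumberFormatException e) {
                    out.add(field);                  // anything else stays a String
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("10001  First,10002  Second,\n42,3.5\n"));
        // prints [10001  First, 10002  Second, 42.0, 3.5]
    }
}
```

To keep the row structure as well, eolIsSignificant(true) can be added; the line terminators must stay declared as whitespace for TT_EOL to be reported after resetSyntax().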

Author

Commented:
ACK!
87 got taken off my account!
Hang on a second, let me see if I can figure this out.

Author

Commented:
I asked the question for a weird number because I had 187 points left and wanted to get
back to an even number (so I guess I decided to give someone else an odd number, hehehe).

Well, you earned at LEAST 87, so I'll post another question for you.
I think I had a points error go in my favor before, so I guess I'm going to be even.
Sorry about the mix-up,
and thanks for telling me. I would have never known.
