Koka
asked on
Missing field and StringTokenizer
Well, I need to split a comma-delimited line like:
1,2,,4
and get '1' '2' '' '4', i.e. I need an empty string returned when there are two consecutive delimiters, but the nextToken method just skips consecutive delimiters, producing '1' '2' '4'.
Any ideas? Or should I forget about the Tokenizer and write my own (which will be slower than the Tokenizer, I guess, so I'd like to stay with the built-in Tokenizer methods)?
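For reference, a minimal sketch that reproduces the behaviour described above. The class and method names (SkippedFieldDemo, defaultTokens) are made up for illustration and are not from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class SkippedFieldDemo {
    // Collects every token the default StringTokenizer returns
    // for a comma-delimited line (delimiters are not returned).
    static List<String> defaultTokens(String line) {
        StringTokenizer tokenizer = new StringTokenizer(line, ",");
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The empty field between the two commas is silently dropped.
        System.out.println(defaultTokens("1,2,,4")); // prints [1, 2, 4]
    }
}
```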
What about doing a substring search for the pattern ',,', breaking the line into two strings there, running the tokenizer on each, and then concatenating the two results?
ASKER CERTIFIED SOLUTION
Exactly what I was thinking imladris. Although I was late to the party because I was writing a code fragment to demonstrate:
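import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Splits str on any character in delim, returning "" for each empty
// field between two consecutive delimiters.
// Note: a delimiter at the very start or end of str does not produce an empty token.
private String[] tokenize(String str, String delim) {
    // returnDelims = true: the delimiters themselves come back as tokens
    StringTokenizer tokenizer = new StringTokenizer(str, delim, true);
    List<String> list = new ArrayList<String>();
    boolean lastWasDelim = false;
    while (tokenizer.hasMoreTokens()) {
        String token = tokenizer.nextToken();
        if (delim.indexOf(token) >= 0) {
            // found a delimiter
            if (lastWasDelim) {
                // two or more consecutive delimiters means an empty token
                list.add("");
            }
            lastWasDelim = true;
        } else {
            list.add(token);
            lastWasDelim = false;
        }
    }
    return list.toArray(new String[list.size()]);
}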
Jim
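The tokenize method above does not emit an empty token for a delimiter at the very start or end of the line. If those should count as empty fields too (an assumption, the thread does not say), a possible variant is sketched below. The class name EmptyFieldTokenizer is hypothetical, and the method is made static only so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

final class EmptyFieldTokenizer {
    // Variant of the tokenize method above. Assumption (not from the thread):
    // a delimiter at the very start or end of the line also marks an empty field.
    static String[] tokenize(String str, String delim) {
        StringTokenizer tokenizer = new StringTokenizer(str, delim, true);
        List<String> fields = new ArrayList<>();
        // Starting at true makes a leading delimiter yield an empty first field.
        boolean lastWasDelim = true;
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            if (token.length() == 1 && delim.indexOf(token.charAt(0)) >= 0) {
                if (lastWasDelim) {
                    fields.add("");
                }
                lastWasDelim = true;
            } else {
                fields.add(token);
                lastWasDelim = false;
            }
        }
        // A trailing delimiter (or an empty line) leaves one more empty field.
        if (lastWasDelim) {
            fields.add("");
        }
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("1,2,,4", ","))); // prints [1, 2, , 4]
        System.out.println(Arrays.toString(tokenize(",1,", ",")));
    }
}
```

For a single-character delimiter such as the comma, "1,2,,4".split(",", -1) on Java 1.4 or later should give the same four fields, which makes a handy cross-check.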
ASKER
Well, yes, returnDelims will do the trick with minimal effort. So, I give points to imladris, as he was the first to suggest it.
I suspect it will be slower than the 'pure' Tokenizer by only a factor of 2 (returning delimiters roughly doubles the number of tokens to process). Anyway, my files are not large enough for that to matter, and there seems to be no better approach.
Thanks to Jim and Kanthonym too.
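The "factor of 2" estimate refers to the token count, and that part can be checked with countTokens. This is only a rough sketch: it counts tokens and does not measure running time. The class and method names (TokenCountDemo, count) are invented for the example.

```java
import java.util.StringTokenizer;

final class TokenCountDemo {
    // Number of tokens StringTokenizer will hand back for the line,
    // with or without the delimiters returned as tokens.
    static int count(String line, boolean returnDelims) {
        return new StringTokenizer(line, ",", returnDelims).countTokens();
    }

    public static void main(String[] args) {
        String line = "1,2,3,4";
        System.out.println(count(line, false)); // prints 4
        System.out.println(count(line, true));  // prints 7
    }
}
```

With n non-empty fields there are n - 1 delimiters, so returning them gives 2n - 1 tokens, which is a little under double.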