Koka asked:

Missing field and StringTokenizer

Well, I need to split a comma-delimited line like:
1,2,,4
and get '1' '2' '' '4', i.e. I need an empty string returned where two delimiters are adjacent, but the nextToken method just skips over adjacent delimiters, producing '1' '2' '4'.

Any ideas? Or should I forget about StringTokenizer and write my own (which I guess will be slower than the built-in tokenizer, so I'd rather stick with the built-in methods)?
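For reference, a minimal sketch of the behavior being described, using plain java.util.StringTokenizer with the usual two-argument constructor (the class name is only for the demo):

    import java.util.StringTokenizer;

    public class SkipDemo {
        public static void main(String[] args) {
            // Adjacent delimiters are collapsed by default, so the empty
            // field between the two commas is silently dropped.
            StringTokenizer t = new StringTokenizer("1,2,,4", ",");
            while (t.hasMoreTokens()) {
                System.out.print("[" + t.nextToken() + "] ");
            }
            // prints: [1] [2] [4]
        }
    }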

kanthonym

What about doing a substring search for the pattern ',,', breaking the line up into two strings, running the tokenizer on each, and then concatenating your two results?
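One literal reading of that suggestion, sketched as a hedged example (the helper name and the iteration are assumptions, not from this thread): scan for ',,', tokenize everything before it, record an empty field, and continue with the remainder. Like the other approaches discussed here, it does not account for leading or trailing delimiters.

    private static java.util.List splitKeepingEmpties(String str) {
        java.util.List result = new java.util.ArrayList();
        int pos;
        while ((pos = str.indexOf(",,")) >= 0) {
            // tokenize the part before the double comma
            java.util.StringTokenizer left =
                new java.util.StringTokenizer(str.substring(0, pos), ",");
            while (left.hasMoreTokens()) {
                result.add(left.nextToken());
            }
            result.add("");                   // the empty field between the commas
            str = str.substring(pos + 1);     // keep the second comma for the next pass
        }
        // tokenize whatever is left after the last double comma
        java.util.StringTokenizer rest = new java.util.StringTokenizer(str, ",");
        while (rest.hasMoreTokens()) {
            result.add(rest.nextToken());
        }
        return result;
    }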
ASKER CERTIFIED SOLUTION
imladris
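Judging from the follow-up comments below, the accepted suggestion was to use StringTokenizer's three-argument constructor, whose third argument (returnDelims) makes the tokenizer hand back the delimiters themselves as tokens. A minimal sketch of that idea (the class name is only for the demo):

    import java.util.StringTokenizer;

    public class ReturnDelimsDemo {
        public static void main(String[] args) {
            // With returnDelims set to true the commas come back as tokens,
            // so two adjacent commas appear as two "," tokens with nothing
            // in between -- the cue that a field is empty.
            StringTokenizer t = new StringTokenizer("1,2,,4", ",", true);
            while (t.hasMoreTokens()) {
                System.out.print("[" + t.nextToken() + "] ");
            }
            // prints: [1] [,] [2] [,] [,] [4]
        }
    }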
Jim Cakalic
Exactly what I was thinking, imladris. I was late to the party because I was writing a code fragment to demonstrate:

    // requires java.util.StringTokenizer and java.util.ArrayList
    private String[] tokenize(String str, String delim) {
        StringTokenizer tokenizer = new StringTokenizer(str, delim, true);
        ArrayList list = new ArrayList();
        boolean lastWasDelim = false;
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            if (delim.indexOf(token) >= 0) {
                // found a delimiter
                if (lastWasDelim) {
                    // two or more consecutive delimiters means an empty token
                    list.add("");
                }
                lastWasDelim = true;
            } else {
                list.add(token);
                lastWasDelim = false;
            }
        }
        return (String[])list.toArray(new String[list.size()]);
    }
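
For the asker's example, a call like tokenize("1,2,,4", ",") should return {"1", "2", "", "4"}.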

Jim
Koka (asker)

Well, yes, returnDelims will do the trick with minimal effort. So I'm giving the points to imladris, as he was the first to suggest it.
I suspect it will only be slower than the 'pure' tokenizer by a factor of about two (returning the delimiters roughly doubles the number of tokens to process); in any case my files are not so large that it matters, and there seems to be no better approach.
Thanks to Jim and kanthonym too.