jazzIIIlove asked:

Java regex need for csv

Hi;

I have a CSV file that I read in Java. I split each line on the , delimiter using the split function, and that worked fine.

Now the data I am working with has changed:

"ID", "FOO", "BAZ"
"001", "AAA", "aaaaa"
"002", "CC, DD", "ccccc"

In my implementation, I need to treat CC, DD as a single field and not split on the comma inside the quotes.

Since the data set is large, I have a feeling a regex would be necessary to fix this. Can you help me with this regex?

Br.
CEHJ

"I have a feeling a regex would be necessary to fix this."

No, that's not the tool for the job. Use a proper CSV library, e.g. OpenCSV or Ostermiller CSV
jazzIIIlove (ASKER)

Hi CEHJ,

Thanks for the remark, but I am not allowed to use a third-party library. I still think a regex is the tool for the job.

Br.
OK. Be my guest in writing the regex for it then. I'd like to see it.

Plus - you're already using a 3rd party library (see my comments about log4j)
Hi CEHJ;

Thanks for your remark. To clarify: log4j is accepted by my professor, but other third-party libraries are not.

I am struggling with the regex. So far it is as follows, but it has some issues. I would appreciate your remarks on it.

        // Inner quotes must be escaped so the sample line is a valid Java string literal.
        String line = "\"002\", \"CC, DD\", \"ccccc\"";
        String[] tokens = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);
        for (String t : tokens) {
            System.out.println("> " + t);
        }

http://stackoverflow.com/questions/1757065/java-splitting-a-comma-separated-string-but-ignoring-commas-in-quotes

Any help on it?

Br.
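
For readers following along, here is a minimal sketch of how the regex above can be used end to end. The lookahead only lets a comma match when an even number of double quotes follows it on the line, which means the comma sits outside any quoted field. The class and method names (QuotedSplitDemo, splitLine) are hypothetical, and trimming plus stripping the surrounding quotes is an assumption about what the asker wants, based on the sample data. It assumes no escaped quotes inside fields and no fields spanning several lines.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedSplitDemo {
    // A comma counts as a delimiter only if the rest of the line contains
    // an even number of double quotes, i.e. the comma is outside quotes.
    private static final String OUTSIDE_QUOTES =
            ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        // Limit -1 keeps trailing empty fields instead of dropping them.
        for (String raw : line.split(OUTSIDE_QUOTES, -1)) {
            String field = raw.trim(); // assumption: spaces around fields are not data
            if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
                field = field.substring(1, field.length() - 1); // drop surrounding quotes
            }
            fields.add(field);
        }
        return fields;
    }

    public static void main(String[] args) {
        for (String f : splitLine("\"002\", \"CC, DD\", \"ccccc\"")) {
            System.out.println("> " + f);
        }
    }
}
```

Note that the lookahead rescans the remainder of the line for every comma, so the cost grows roughly quadratically with line length. That is usually acceptable for short rows but worth measuring on large files.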
Well, I've already given my remarks on regex. You will also, btw, have to allow for CSV escaped characters such as \". That's the job of a CSV parser.
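
To illustrate what "the job of a CSV parser" involves without a third-party library, here is a rough sketch of a hand-written, single-line parser that tracks whether it is inside quotes. It is not the method of any library mentioned in this thread. The name SimpleCsvLine is hypothetical, and treating both a doubled quote ("") and a backslash-quote (\") as an escaped quote is an assumption, since real files usually use only one of those conventions. Fields that span several lines are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleCsvLine {
    static List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;   // currently inside a quoted field
        boolean wasQuoted = false;  // the current field had an opening quote

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                boolean quoteFollows = i + 1 < line.length() && line.charAt(i + 1) == '"';
                if ((c == '"' || c == '\\') && quoteFollows) {
                    current.append('"'); // assumption: "" or \" is an escaped quote
                    i++;                 // skip the second character of the escape
                } else if (c == '"') {
                    inQuotes = false;    // closing quote
                } else {
                    current.append(c);   // commas inside quotes are plain data
                }
            } else if (c == '"') {
                inQuotes = true;
                wasQuoted = true;
                current.setLength(0);    // drop whitespace seen before the opening quote
            } else if (c == ',') {
                fields.add(wasQuoted ? current.toString() : current.toString().trim());
                current.setLength(0);
                wasQuoted = false;
            } else if (!wasQuoted) {
                current.append(c);       // unquoted data; text after a closing quote is ignored
            }
        }
        // The last field has no trailing comma, so add it explicitly.
        fields.add(wasQuoted ? current.toString() : current.toString().trim());
        return fields;
    }
}
```

A call such as SimpleCsvLine.parse(line) could replace the split call in the loop that reads the file. Because it makes a single pass over each line, it avoids the repeated rescanning that the lookahead regex does.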
ASKER CERTIFIED SOLUTION
mccarl
This solution is only available to members.
Well, it wouldn't handle escaped quotes, of course. A more likely scenario (an empty final field) will also break it.
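
On the empty final field: the accepted solution is not visible here, so the following is only a general sketch of how String.split treats trailing empty strings, not a statement about that solution. With the default limit (0), trailing empty strings are removed from the result. With a negative limit they are kept. The class name TrailingFieldDemo and the sample line are hypothetical.

```java
class TrailingFieldDemo {
    // Number of fields produced by splitting on a plain comma with the given limit.
    static int countFields(String line, int limit) {
        return line.split(",", limit).length;
    }

    public static void main(String[] args) {
        String line = "001,AAA,"; // third field is empty
        System.out.println(countFields(line, 0));  // 2: trailing empty field dropped
        System.out.println(countFields(line, -1)); // 3: trailing empty field kept
    }
}
```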
SOLUTION
This solution is only available to members.
Also, are there really spaces after the commas?
SOLUTION
This solution is only available to members.
Thanks for the inputs. I will update you today.

Best regards.