Solved

How to split columns with exceptions

Posted on 2008-10-13
Last Modified: 2012-05-05
Hi,
   Can anyone help me with this problem:

How can I split a CSV file in Java into separate columns using the built-in regex functions (for efficiency), while ignoring any commas inside double quotes? For example:

1,2,3,"hello, world!"

should be treated as 4 columns, not 5.
Question by:Envoy2064

Expert Comment

by:ddrudik
ID: 22705134
If there are no empty columns, you could do something like:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Module1 {
  public static void main(String[] asd) {
    // Replace with one line of the CSV file.
    String sourcestring = "source string to match with pattern";
    // A column is either a double-quoted run or a run of non-comma characters.
    Pattern re = Pattern.compile("\"[^\"]*\"|[^,]+");
    Matcher m = re.matcher(sourcestring);
    int mIdx = 0;
    // The pattern has no capturing groups, so each find() yields one column via group().
    while (m.find()) {
      System.out.println("[" + mIdx + "] = " + m.group());
      mIdx++;
    }
  }
}
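As a rough illustration of how this pattern behaves on the line from the question, here is a minimal sketch. The class name QuotedSplitDemo and the method columns are made up for the example and are not part of the original comment; the pattern is the one from this comment, written with Java string escapes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original thread.
class QuotedSplitDemo {
  // Compiled once and reused: a double-quoted run, or a run of non-comma characters.
  private static final Pattern COLUMN = Pattern.compile("\"[^\"]*\"|[^,]+");

  // Returns every non-empty column of one CSV line; surrounding quotes are left in place.
  static List<String> columns(String line) {
    List<String> out = new ArrayList<>();
    Matcher m = COLUMN.matcher(line);
    while (m.find()) {
      out.add(m.group());
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(columns("1,2,3,\"hello, world!\"").size()); // 4
    System.out.println(columns("1,,3").size()); // 2: the empty column is skipped
  }
}
```

The second call shows why this version only suits data without empty columns: nothing in the pattern can match the empty position between two commas.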


Author Comment

by:Envoy2064
ID: 22705155
How about with empty columns?

Author Comment

by:Envoy2064
ID: 22705173
Please note the emphasis on efficient algorithms that use as much system-optimized code as possible.

Accepted Solution

by:
ddrudik earned 250 total points
ID: 22734320
For that requirement (remember that you will need to split the file by line and pass each line to the regex function to get the column values):
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Module1 {
  public static void main(String[] asd) {
    // Replace with one line of the CSV file.
    String sourcestring = "source string to match with pattern";
    // Quoted column, unquoted column, or an empty column:
    // between two commas, at the start of the line, or at the end of the line.
    Pattern re = Pattern.compile("\"[^\"]*\"|[^,]+|(?<=,)(?=,)|^(?=,)|(?<=,)$");
    Matcher m = re.matcher(sourcestring);
    int mIdx = 0;
    // The pattern has no capturing groups, so each find() yields one column via group().
    while (m.find()) {
      System.out.println("[" + mIdx + "] = " + m.group());
      mIdx++;
    }
  }
}
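The line-by-line step mentioned above could look roughly like the sketch below. The names CsvLineSplitter, splitLine and splitAll are hypothetical, and stripping the surrounding quotes is an extra step added for the example, not something the original answer does. The pattern is compiled once as a static constant so it is not rebuilt for every line. The sketch assumes that quoted columns never span lines and never contain escaped quotes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not from the original thread.
class CsvLineSplitter {
  // Same alternatives as the answer's pattern, compiled once and reused for every line.
  private static final Pattern COLUMN = Pattern.compile(
      "\"[^\"]*\"|[^,]+|(?<=,)(?=,)|^(?=,)|(?<=,)$");

  // Splits one line into columns; empty columns come back as empty strings.
  static List<String> splitLine(String line) {
    List<String> columns = new ArrayList<>();
    Matcher m = COLUMN.matcher(line);
    while (m.find()) {
      String col = m.group();
      // Assumption for this sketch: drop the quotes around a quoted column.
      if (col.length() >= 2 && col.startsWith("\"") && col.endsWith("\"")) {
        col = col.substring(1, col.length() - 1);
      }
      columns.add(col);
    }
    return columns;
  }

  // Reads the input line by line and splits each line separately.
  static List<List<String>> splitAll(BufferedReader reader) throws IOException {
    List<List<String>> rows = new ArrayList<>();
    String line;
    while ((line = reader.readLine()) != null) {
      rows.add(splitLine(line));
    }
    return rows;
  }

  public static void main(String[] args) throws IOException {
    // In-memory sample data; a FileReader could be wrapped the same way.
    String data = "1,2,3,\"hello, world!\"\n1,,3,\"x\",\n";
    BufferedReader reader = new BufferedReader(new StringReader(data));
    for (List<String> row : splitAll(reader)) {
      System.out.println(row.size() + " columns: " + row);
    }
  }
}
```

Note that a completely empty line produces zero columns with this pattern, so it may need separate handling if blank lines are meaningful in the data.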


Expert Comment

by:ddrudik
ID: 22779434
Thanks for the question and the points.