Solved

How to split columns with exceptions

Posted on 2008-10-13
Last Modified: 2012-05-05
Hi,

Can anyone help me with this problem?

How can I split a CSV file in Java into separate columns using the built-in regex functions (for efficiency), while treating everything inside double quotes as a single value? For example:

1,2,3,"hello, world!"

should be treated as 4 columns, not 5.
Question by:Envoy2064
LVL 27

Expert Comment

by:ddrudik
ID: 22705134
If there are no empty columns, you could do something like:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Module1 {
  public static void main(String[] args) {
    String sourcestring = "1,2,3,\"hello, world!\"";
    // A column is either a double-quoted run or a run of non-comma characters.
    Pattern re = Pattern.compile("\"[^\"]*\"|[^,]+");
    Matcher m = re.matcher(sourcestring);
    int idx = 0;
    // The pattern has no capturing groups, so each find() yields one column via group().
    while (m.find()) {
      System.out.println("[" + idx + "] = " + m.group());
      idx++;
    }
  }
}

 

Author Comment

by:Envoy2064
ID: 22705155
How about with empty columns?
 

Author Comment

by:Envoy2064
ID: 22705173
Please note the emphasis on an efficient algorithm that uses as much system-optimized code as possible.
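
On the efficiency point, one step that usually helps is compiling the pattern once and reusing it for every line, since Pattern objects are immutable and can be shared. The following is only a minimal sketch under that assumption: it uses the pattern from the earlier comment (so empty columns are still skipped), and the class name CsvColumns and the method columns are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class CsvColumns {
  // Compiled once and reused; Pattern instances are immutable and safe to share.
  private static final Pattern COLUMN = Pattern.compile("\"[^\"]*\"|[^,]+");

  // Returns the non-empty columns of one line; quoted columns keep their quotes.
  static List<String> columns(String line) {
    List<String> out = new ArrayList<>();
    Matcher m = COLUMN.matcher(line);
    while (m.find()) {
      out.add(m.group());
    }
    return out;
  }

  public static void main(String[] args) {
    // Prints: [1, 2, 3, "hello, world!"]
    System.out.println(columns("1,2,3,\"hello, world!\""));
  }
}
```

Only a new Matcher is created per line; the compiled pattern itself is shared across all calls.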
 
LVL 27

Accepted Solution

by:ddrudik (earned 250 total points)
ID: 22734320
For that requirement (remember that you will still need to split the file into lines and pass each line to the regex to get the column values):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Module1 {
  public static void main(String[] args) {
    String sourcestring = "1,,3,\"hello, world!\",";
    // Alternatives, in order: quoted column | unquoted column |
    // empty column between two commas | empty first column | empty last column.
    Pattern re = Pattern.compile("\"[^\"]*\"|[^,]+|(?<=,)(?=,)|^(?=,)|(?<=,)$");
    Matcher m = re.matcher(sourcestring);
    int mIdx = 0;
    // The pattern has no capturing groups, so group() is the whole column.
    while (m.find()) {
      System.out.println("[" + mIdx + "] = " + m.group());
      mIdx++;
    }
  }
}
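
The per-line step mentioned above might look something like the sketch below. It assumes the CSV text is already in memory as a single string, reuses the regex from this answer, and additionally strips the surrounding quotes from quoted columns. The class name CsvLines and the methods parseLine and parse are hypothetical names for illustration. It does not handle quoted columns that contain line breaks or doubled quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class CsvLines {
  // Same alternatives as the regex above, compiled once for all lines.
  private static final Pattern COLUMN =
      Pattern.compile("\"[^\"]*\"|[^,]+|(?<=,)(?=,)|^(?=,)|(?<=,)$");

  // Splits one line into columns, keeping empty columns as empty strings.
  static List<String> parseLine(String line) {
    List<String> cols = new ArrayList<>();
    Matcher m = COLUMN.matcher(line);
    while (m.find()) {
      String col = m.group();
      // Drop the enclosing quotes of a quoted column.
      if (col.length() >= 2 && col.startsWith("\"") && col.endsWith("\"")) {
        col = col.substring(1, col.length() - 1);
      }
      cols.add(col);
    }
    return cols;
  }

  // Splits the text on line breaks and parses each line separately.
  static List<List<String>> parse(String text) {
    List<List<String>> rows = new ArrayList<>();
    for (String line : text.split("\r?\n")) {
      rows.add(parseLine(line));
    }
    return rows;
  }

  public static void main(String[] args) {
    for (List<String> row : parse("1,2,3,\"hello, world!\"\na,,c,\n")) {
      System.out.println(row.size() + " columns: " + row);
    }
  }
}
```

Both sample lines come out as four columns: the quoted value stays in one piece, and the empty middle and trailing columns of the second line are kept as empty strings.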

 
LVL 27

Expert Comment

by:ddrudik
ID: 22779434
Thanks for the question and the points.

Question has a verified solution.
