Solved

Question for a Regex in split

Posted on 1998-09-23
Last Modified: 2012-05-04
Hi,

can anyone tell me a regex I can use with split to break
a line into fields separated by a ; except where the ; is
inside a pair of double quotes?

Example:

Field 1;Field 2;"Field 3a;Field 3b";"Field 4a;Field 4b"

I want to get:

Field 1
Field 2
Field 3a;Field 3b
Field 4a;Field 4b

I have to use split because I don't know in advance how
many fields there are or which of them contain a semicolon
inside double quotes.

Thanks for your help,
Kai.
Question by:kaijen
Expert Comment

by:ozo
ID: 1204955
# In list context, /g returns every captured field from $_ (the quotes are kept).
@fields=/((?:"[^"]*"|[^;])+)/g;
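
As a rough illustration, the one-liner above can be applied to the sample line from the question, followed by one possible way to strip the enclosing quotes. The variable name `$line` and the quote-stripping substitution are assumptions added here, not part of the original comment; the sketch assumes quotes only ever wrap a whole field.

```perl
use strict;
use warnings;

# Sample line from the question.
my $line = 'Field 1;Field 2;"Field 3a;Field 3b";"Field 4a;Field 4b"';

# Each match is a maximal run of quoted chunks and non-semicolon characters.
my @fields = $line =~ /((?:"[^"]*"|[^;])+)/g;

# Strip one pair of enclosing double quotes from each field
# (assumption: quotes only wrap whole fields).
s/^"(.*)"$/$1/ for @fields;

print "$_\n" for @fields;
```

Note that the pattern needs at least one character per match, so empty fields (two semicolons in a row) are silently skipped.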
Expert Comment

by:ozo
ID: 1204956
perldoc -q split
Found in perlfaq4.pod

How can I split a [character] delimited string except when inside [character]? (Comma-separated files)

Take the example case of trying to split a string that is comma-separated
into its different fields.  (We'll pretend you said comma-separated, not
comma-delimited, which is different and almost never what you mean.) You
can't use C<split(/,/)> because you shouldn't split if the comma is inside
quotes.  For example, take a data line like this:

    SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"

Due to the restriction of the quotes, this is a fairly complex
problem.  Thankfully, we have Jeffrey Friedl, author of a highly
recommended book on regular expressions, to handle these for us.  He
suggests (assuming your string is contained in $text):

     @new = ();
     push(@new, $+) while $text =~ m{
         "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
       | ([^,]+),?
       | ,
     }gx;
     push(@new, undef) if substr($text,-1,1) eq ',';

If you want to represent quotation marks inside a
quotation-mark-delimited field, escape them with backslashes (eg,
C<"like \"this\"">).  Unescaping them is a task addressed earlier in
this section.

Alternatively, the Text::ParseWords module (part of the standard perl
distribution) lets you say:

    use Text::ParseWords;
    @new = quotewords(",", 0, $text);
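
For the semicolon-separated data in the question, the same module call might look like the sketch below. The variable name `$line` is an assumption; `quotewords` takes the delimiter pattern, a keep flag, and the lines to parse.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Sample line from the question.
my $line = 'Field 1;Field 2;"Field 3a;Field 3b";"Field 4a;Field 4b"';

# A false second argument ($keep) drops the quotes from the returned fields.
my @new = quotewords(';', 0, $line);

print "$_\n" for @new;
```

One caveat: `quotewords` also treats single quotes as quoting characters, so data containing unpaired apostrophes may need a different approach.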


Author Comment

by:kaijen
ID: 1204957
Thanks a lot!

This works. I only have to work out how to strip the
double quotes. But that sounds like a good homework exercise.

Please state this as an answer and I'll give you the
points!

Best regards,
Kai.
Accepted Solution

by:ozo (earned 200 total points)
ID: 1204958
# If the quotes are not part of the field, does that mean that you'll never see anything like
# Field 1;;"Field" "3a;" "Field 3b";Field 4a";"Field 4b
# If you will never see input like that, then something like this may work for you
# (both versions operate on $_ and drop empty fields):
@fields = grep length,split /;|"([^"]*)"/;
# or
push @fields,$+  while /(?:"([^"]*)|([^;]+))[^;]*;?/g;
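
A minimal sketch of both variants run against the sample line from the question. The names `$line`, `@by_split` and `@by_match` are assumptions, and the `defined` check is an addition so the filter stays quiet under `use warnings` on older perls, where `length(undef)` warns.

```perl
use strict;
use warnings;

# Sample line from the question.
my $line = 'Field 1;Field 2;"Field 3a;Field 3b";"Field 4a;Field 4b"';

# split also returns the capture group for every separator; where the
# separator was a plain ";" the capture is undef, so filter those out
# together with the empty strings between adjacent separators.
my @by_split = grep { defined && length } split /;|"([^"]*)"/, $line;

# $+ holds whichever capture group took part in the match:
# the quoted text without its quotes, or the unquoted field.
my @by_match;
push @by_match, $+ while $line =~ /(?:"([^"]*)|([^;]+))[^;]*;?/g;

print "$_\n" for @by_split;
print "$_\n" for @by_match;
```

On this input both lists contain the four fields with the quotes already removed. Neither variant preserves empty fields, so a line containing `;;` would yield fewer fields than it has columns.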