Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people, just like you, are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
Solved

Need Regular expression to pre process CSV file

Posted on 2008-10-23
6
456 Views
Last Modified: 2010-04-21
( PHP 5.2 / Apache / WinXP )
I need to clean up a CSV file line by line before I process it.
The file has been mangled by Excel.

I'm new to regular expressions so I need some help.
The CSV file is actually a .txt file which is tab separated.
What I need to do is remove any tabs, commas and single quotes that are contained within a double quote.
And also remove the double quote as well.
So I end up with a clean CSV line separated by tabs.

Example this;
C05110200      "Trish, Ruducheerry"      Cantonon      TH18312      1973/0726
Should become
C05110200      Trish Ruducheerry      Cantonon      TH18312      1973/0726

Please provide a code example
0
Comment
Question by:Matthew_Way
  • 4
  • 2
6 Comments
 
LVL 27

Expert Comment

by:ddrudik
ID: 22793417
You mention single quotes ' but not double quotes " but your result example above shows the double quotes gone.
If you want single-quotes removed:
<?php
$str="C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH18312\t1973/0726";
echo "<pre>$str";
$str=preg_replace_callback('/"[^"]*"/','repfunc',$str);
function repfunc($match){
  return preg_replace("/[\t,']/",'',$match[0]);
}
echo "<br>$str";
?>

If you want the double-quotes removed:
<?php
$str="C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH18312\t1973/0726";
echo "<pre>$str";
$str=preg_replace_callback('/"[^"]*"/','repfunc',$str);
function repfunc($match){
  return preg_replace('/[\t,"]/','',$match[0]);
}
echo "<br>$str";
?>
0
 

Author Comment

by:Matthew_Way
ID: 22793445
Okay let me reword

Remove all single and double quotes.
Remove tab character only if it appears within a double quote.
0
 
LVL 27

Expert Comment

by:ddrudik
ID: 22793477
Please confirm that you want to remove all single and double quotes, regardless of where they appear in the text, such as:
C05110200      "Trish Ruducheerry's Name"      Cantonon's Test      TH18312      1973/0726

Also confirm that the single quotes or double quotes to be removed are not escaped in any way in the text, such as this:
C05110200      "the following is a \"quote\" that someone said"      Cantonon's Test      TH18312      1973/0726
0
Announcing the Most Valuable Experts of 2016

MVEs are more concerned with the satisfaction of those they help than with the considerable points they can earn. They are the types of people you feel privileged to call colleagues. Join us in honoring this amazing group of Experts.

 
LVL 27

Accepted Solution

by:
ddrudik earned 500 total points
ID: 22793522
Consider this example:
<?php
$str="C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH1'8312\t1973/07\"26";
echo "<pre>$str";
$str=preg_replace('/["\']/','',preg_replace_callback('/"[^"]+"/','repfunc',$str));
function repfunc($match){
  return preg_replace("/[\t,'\"]/",'',$match[0]);
}
echo "<br>$str";
// it may appear in your output that the last tab was deleted, but it's there:
echo '<br>'.preg_replace('/\t/',',',$str);
?>

Open in new window

0
 

Author Closing Comment

by:Matthew_Way
ID: 31509504
Thank you v.much
0
 
LVL 27

Expert Comment

by:ddrudik
ID: 22802979
Thanks for the question and the points.
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

This article discusses how to create an extensible mechanism for linked drop downs.
Since pre-biblical times, humans have sought ways to keep secrets, and share the secrets selectively.  This article explores the ways PHP can be used to hide and encrypt information.
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
The viewer will learn how to count occurrences of each item in an array.

789 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question