Solved

Need regular expression to preprocess CSV file

Posted on 2008-10-23
Last Modified: 2010-04-21
( PHP 5.2 / Apache / WinXP )
I need to clean up a CSV file line by line before I process it.
The file has been mangled by Excel.

I'm new to regular expressions, so I need some help.
The CSV file is actually a .txt file that is tab-separated.
What I need to do is remove any tabs, commas and single quotes that appear inside a double-quoted field.
The double quotes themselves should be removed as well.
That way I end up with a clean line separated only by tabs.

For example, this:
C05110200      "Trish, Ruducheerry"      Cantonon      TH18312      1973/0726
should become:
C05110200      Trish Ruducheerry      Cantonon      TH18312      1973/0726

Please provide a code example.
Question by:Matthew_Way

Expert Comment

by:ddrudik
You mention single quotes ' but not double quotes " but your result example above shows the double quotes gone.
If you want tabs, commas and single quotes removed inside the double-quoted section (the double quotes themselves are kept):
<?php
$str="C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH18312\t1973/0726";
echo "<pre>$str";
$str=preg_replace_callback('/"[^"]*"/','repfunc',$str);
function repfunc($match){
  return preg_replace("/[\t,']/",'',$match[0]);
}
echo "<br>$str";
?>

If you want tabs, commas and the double quotes themselves removed (single quotes are kept):
<?php
$str="C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH18312\t1973/0726";
echo "<pre>$str";
$str=preg_replace_callback('/"[^"]*"/','repfunc',$str);
function repfunc($match){
  return preg_replace('/[\t,"]/','',$match[0]);
}
echo "<br>$str";
?>
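
Since the question asks for line-by-line cleanup, here is a minimal sketch of wrapping the second variant in a reusable function and applying it to several lines. The helper name clean_quoted_sections and the sample lines are made up for illustration, and the anonymous callback needs PHP 5.3 or later; on the PHP 5.2 setup from the question, keep a named callback as in the snippets above.

```php
<?php
// Hypothetical helper name; wraps the quoted-section cleanup in one function.
function clean_quoted_sections($line) {
    return preg_replace_callback(
        '/"[^"]*"/',
        function ($match) {
            // Inside the quoted section, drop tabs, commas and the double quotes.
            return preg_replace('/[\t,"]/', '', $match[0]);
        },
        $line
    );
}

// Stand-in for lines read from the file, e.g. via file($path, FILE_IGNORE_NEW_LINES).
$lines = array(
    "C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH18312\t1973/0726",
    "C05110201\tNo quotes here\tCantonon\tTH18313\t1974/0101",
);

foreach ($lines as $line) {
    echo clean_quoted_sections($line), "\n";
}
```

Lines without any double-quoted section pass through unchanged, so the function can be run over the whole file.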

Author Comment

by:Matthew_Way
Okay, let me reword:

Remove all single and double quotes.
Remove a tab character only if it appears inside a double-quoted section.

Expert Comment

by:ddrudik
Please confirm that you want to remove all single and double quotes, regardless of where they appear in the text, such as:
C05110200      "Trish Ruducheerry's Name"      Cantonon's Test      TH18312      1973/0726

Also confirm that the single quotes or double quotes to be removed are not escaped in any way in the text, such as this:
C05110200      "the following is a \"quote\" that someone said"      Cantonon's Test      TH18312      1973/0726
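
The reason the escaping question matters: a pattern like /"[^"]*"/ stops at the first double quote it meets, so a backslash-escaped quote would end the match early and a tab after it would be left in place. If the data did use backslash escapes (an assumption, not something confirmed in this thread), a sketch along these lines could cope with it. The function names are hypothetical.

```php
<?php
// Hypothetical helper: like the snippets above, but the quoted-section pattern
// also steps over backslash-escaped characters such as \" (an assumption about
// how the file escapes quotes).
function clean_escaped_sections($line) {
    return preg_replace_callback(
        // A double quote, then any run of "not a quote or backslash" or
        // "backslash plus one character", then the closing double quote.
        '/"(?:[^"\\\\]|\\\\.)*"/s',
        'strip_quoted_section',
        $line
    );
}

function strip_quoted_section($match) {
    // Drop tabs, commas, single and double quotes, and the escaping backslashes.
    return preg_replace('/[\t,\'"\\\\]/', '', $match[0]);
}

$line = "A\t\"say \\\"hi, \tthere\\\" now\"\tB";
echo clean_escaped_sections($line), "\n";
```

Named callbacks are used here so the sketch also fits PHP 5.2.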

Accepted Solution

by:ddrudik (earned 500 total points)
Consider this example:
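<?php
// Sample line: a quoted field containing a comma and a tab, plus a stray ' and " outside it.
$str="C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH1'8312\t1973/07\"26";
echo "<pre>$str";
// Inner call: strip tabs, commas and quotes inside each "..." section.
// Outer call: strip any single or double quotes left anywhere in the line.
$str=preg_replace('/["\']/','',preg_replace_callback('/"[^"]+"/','repfunc',$str));
function repfunc($match){
  return preg_replace("/[\t,'\"]/",'',$match[0]);
}
echo "<br>$str";
// it may appear in your output that the last tab was deleted, but it's there:
echo '<br>'.preg_replace('/\t/',',',$str);
?>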
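
As an alternative to the regex approach, a sketch using PHP's built-in CSV parser is shown below. It relies on str_getcsv, which needs PHP 5.3 or later (newer than the PHP 5.2 in the question), and the helper name clean_tsv_line is made up for illustration. It assumes quoted fields are well-formed, that is, each opening double quote has a matching closing one.

```php
<?php
// Hypothetical helper: parse one tab-separated line with PHP's CSV parser,
// then scrub each field and re-join the fields with tabs.
function clean_tsv_line($line) {
    // Delimiter is a tab, enclosure is a double quote, escape is a backslash.
    $fields = str_getcsv($line, "\t", '"', "\\");
    foreach ($fields as $i => $field) {
        // The parser already dropped the enclosing double quotes; remove any
        // tabs, commas and quotes still left inside the field.
        $fields[$i] = str_replace(array("\t", ',', "'", '"'), '', (string) $field);
    }
    return implode("\t", $fields);
}

$line = "C05110200\t\"Trish, \tRuducheerry\"\tCantonon\tTH18312\t1973/0726";
echo clean_tsv_line($line), "\n";
```

The parser decides where quoted fields begin and end, so the cleanup step only has to deal with one field at a time.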


Author Closing Comment

by:Matthew_Way
Thank you very much.

Expert Comment

by:ddrudik
Thanks for the question and the points.