Solved

awk, sed or regex to process csv file

Posted on 1998-01-02
Last Modified: 2013-12-26
Perl is not available to me, and I have an Excel-generated CSV file that I need to process as follows:
  1.  remove all commas within double-quotes
      (but retain comma delimiters)
  2.  remove all double-quotes
  3.  replace all spaces in each field with @!%

sample record:
1,22222222,333,"company, Inc.",,444,USPS,company name,,123 elm st,Ste 800,Miami,FL,33131,


Can anyone help with this?
Question by:luke_airig

Author Comment

by:luke_airig
ID: 1295888
Edited text of question
 
LVL 84

Expert Comment

by:ozo
ID: 1295889
If your sed is POSIX conforming,
 s/\(.*\)/"\1"/; s/\("\([^"]*\)"\)*,*/\2/g; s/ /@!%/g
should produce
1,22222222,333,company@!%Inc.,,444,USPS,company@!%name,,123@!%elm@!%st,Ste@!%800,Miami,FL,33131,
from your sample record.
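
Since the question also allows awk, here is a minimal sketch of the same three steps in awk. It assumes a POSIX awk (one that has gsub, e.g. nawk or gawk) and that fields never contain escaped or unbalanced quotes. The function name clean_csv_awk is made up for illustration. The idea is to split each record on the double-quote character, so the even-numbered pieces are exactly the quoted text.

```shell
# Hypothetical helper: reads CSV on stdin, writes cleaned records to stdout.
clean_csv_awk() {
  awk -F'"' 'BEGIN { OFS = "" }
  {
    # With the quote as separator, even-numbered fields are the quoted text.
    for (i = 2; i <= NF; i += 2) gsub(/,/, "", $i)
    $1 = $1              # rebuild the record with an empty OFS, dropping the quotes
    gsub(/ /, "@!%")     # then mark every remaining space
    print
  }'
}

# Usage with the sample record from the question:
printf '%s\n' '1,22222222,333,"company, Inc.",,444,USPS,company name,,123 elm st,Ste 800,Miami,FL,33131,' | clean_csv_awk
# prints: 1,22222222,333,company@!%Inc.,,444,USPS,company@!%name,,123@!%elm@!%st,Ste@!%800,Miami,FL,33131,
```

Because every even-numbered piece is handled in the loop, this should also cope with several commas inside one quoted field.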
 

Accepted Solution

by:
ashishkh earned 50 total points
ID: 1295890
Try this script:
sed -e "s/\("\([^",]*\),\([^",]*\)"\)/\2\3/g; s/ /@!%/g" infile

 
LVL 84

Expert Comment

by:ozo
ID: 1295891
ashishkh, that works when a quoted field contains a single comma
(assuming your shell can handle the -e quoting;
sed -e 's/\("\([^",]*\),\([^",]*\)"\)/\2\3/g; s/ /@!%/g'
or
sed -e 's/\("\([^",]*\),\([^",]*\)"\)/\2\3/g; s/ /@\!%/g'
may work better if not),
but not with multiple commas in one field, as in
"a,b,c","d,e,f"

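One possible way to cover the multiple-comma case in sed is a loop that deletes one quoted comma per pass until none are left. This is only a sketch, not a script from the thread: it assumes a sed that supports labels and the t command, balanced quotes on every line, and no embedded newlines in fields. The function name clean_csv_sed is made up for illustration.

```shell
# Hypothetical helper: reads CSV on stdin, writes cleaned records to stdout.
clean_csv_sed() {
  # Pass 1 (loop): skip any complete "..." pairs, enter the next quoted
  #   field, and delete the first comma found before its closing quote;
  #   'ta' repeats the substitution until it no longer matches.
  # Pass 2: remove all double-quotes.
  # Pass 3: replace every space with @!%.
  sed -e ':a' \
      -e 's/^\(\([^"]*"[^"]*"\)*[^"]*"[^",]*\),/\1/' \
      -e 'ta' \
      -e 's/"//g' \
      -e 's/ /@!%/g'
}

# Usage with the multi-comma example:
printf '%s\n' '"a,b,c","d,e,f"' | clean_csv_sed
# prints: abc,def
```

The comma that gets deleted always follows an odd number of quotes with no quote in between, so delimiter commas outside quotes are left alone.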