Manipulating CSV files (conditional editing of individual columns) with awk, sed, or other standard linux tools?

I have a whole slew of CSV files (originally Excel), trying to standardize them before importing into a database.  Two transformations/clean-ups have me stumped:

1.  Column 4 sometimes has a single value, or two values with the word "and" between them.  Trying to figure out how to either split this into two columns (if the word "and" is present), or create a blank column if not.  I.e.:

LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD and TOYOTA should become
LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD, TOYOTA

and

LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, HYUNDAI should become
LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, , HYUNDAI

2.  Column 5 has a date RANGE, (start date / end date) in a variety of formats; sometimes dd-dd mon yy, sometimes dd-mon-yy - dd-mon-yy, or (for 1-day ranges) just a single dd-mon-yy.  Want to split this into two columns, and for single-day ranges just copy the date to both columns.

Any help greatly appreciated!!!
simmoja3Asked:
Who is Participating?
 
louislietaerCommented:
tested solution for 1)

perl -pi -e 's/ and /,/g' yourfile
perl -pi -e 's/ AND /,/g' yourfile
perl -pi -e 's/,,/,/g' yourfile
0
 
louislietaerCommented:
for point 2) can you prodide a sample
0
 
simmoja3Author Commented:
Regarding your first... it seems like the third line would REMOVE the empty column, when in fact I need that empty column if there's no "AND" present.  Or am I misunderstanding (certainly possible!)?

For #2:

LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD, TOYOTA, 20-24 MAR 09 becomes
LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD, TOYOTA, 20 MAR 09, 24 MAR 09

LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD, TOYOTA, 20 FEB 09 becomes
LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD, TOYOTA, 20 FEB 09, 20 FEB 09

LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD, TOYOTA, 20 FEB 09 - 04 MAR 09 becomes
LAST_NAME, FIRST_NAME, SSN_LAST_FOUR, FORD, TOYOTA, 20 FEB 09, 04 MAR 09

Thanks for the help!  Greatly appreciated.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
louislietaerCommented:
For point 1) you are right the for the thrid line, you need is a bit more tricky I will work on it.

Is it ok if the result look like :

LAST_NAME, FIRST_NAME, SSN_LAST_FOUR,HYUNDAI,
0
 
leonstrykerCommented:
Of course there is always the option of transforming the data in Excel itself. Let me know if you are interested in this solution.
0
 
leonstrykerCommented:
This question has been classified as abandoned and is being closed as part of the Cleanup Program. See my comment at the end of the question for more details.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.