Solved

replace content of certain NOT empty columns in a *.csv file by a defined string

Posted on 2016-10-17
5
15 Views
Last Modified: 2016-10-24
Dear experts,
I have to replace the content of certain columns, which is NOT empty, leaving attributes of reccords ,where the content was emtpy, empty  in a *.csv file by a defined string using a unix shell script on AIX.
e.g. CSV file before processing, replacing NON empty columns 2 and 4 by ***

11111,2222,3333,4444,5555
aaaaa,,ccc,dd,ee

now after replacing column 2 and 4 by ***:
11111,***,3333,***,5555
aaaaa,cccc,***,ee

I've posted a simliar question earlier, the solution for just replacing column 2 and 4 with *** was

perl -F"\35" -ane '$F[1]=$F[3]="***";print join("\35",@F)' input_file.csv

With a column separator being 'Group Separator':
http://www.theasciicode.com.ar/ascii-control-characters/group-separator-ascii-code-29.html

Now the question is how to update the above script, that it only replaces NON empty records columns by *** leaving already empty columns empty and not replacing by ***

Many thanks
0
Comment
Question by:mruff
  • 3
  • 2
5 Comments
 
LVL 76

Assisted Solution

by:arnold
arnold earned 500 total points
ID: 41847283
what is the relationship between the first row and the second row.
Or is it ok to replace 2 and 4 when the first column is numbers, but if it is alpha or alphanumeric, replace the third?
non empty, means that you have to split, check and then act accordingly.
In the current example, you are blindly replacing contents.

@array=split(/,/); #feeding line by line
test $array[1] and $array[3] for length, if 0 means there is nothing there, so you skip.
......
Must it be on a command line?
0
 

Author Comment

by:mruff
ID: 41847633
Hi LVL 76
I did not fully understand your question, the column to be checked for empty or content must always be the same regardless of potential content, may it be numeric or alpha or alphanumeric.
Yes, it must be unfortunately command line, with e.g. java that would be very easy.
I know that  the version
perl -F"\35" -ane '$F[1]=$F[3]="***";print join("\35",@F)' input_file.csv
replaces blindly, isn't there a solution with this version using regex?
e.g. having to possibility to evaluate $F[1] first, with a regex instead of directly assigning 'blindly' '***'
0
 
LVL 76

Expert Comment

by:arnold
ID: 41848679
you have in the example when the second and fourth columns are replaced, the first column is 111111.
IN the second row when the third column is replaced, it starts with aaaaa

Your current perl script is replacing whatever it finds in the positions outline with the new one.

perl has a match regex, but for that you would need to know the maximum number of columns in a line, or it is possible though I am not suffciently familiar to have multiple matches on the same line assigned to a match array that you will need to handle
the straight forward for me is to use @array=split(/,/) the array will have as many elements as needed based on the comma separated value. IT will not differentiate if there are commas enclosed by quotes......

So understanding the type of data you are processing is important to make sure not to ....
With the @array you can test the length($array[1]) column 2, and $array[3] column 4 in the one example, or $array[2] as column three under a different scenario.

Where is this CSV comming from, might it not be simpler to perform the task prior to the CSV export?
0
 

Accepted Solution

by:
mruff earned 0 total points
ID: 41850362
the solution was
awk -F";" 'BEGIN { OFS=";";}{ if ( $24 != "" ) $24="*PURGED*" ; print $0}' file1.dat > file2_.dat
to e.g. purge column 24 if it had content
0
 

Author Closing Comment

by:mruff
ID: 41856722
my comment was the final solution
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

Suggested Solutions

Introduction Regular patching is part of a system administrator's tasks. However, many patches require that the system be in single-user mode before they can be installed. A cluster patch in particular can take quite a while to apply if the machine…
Over the years I've spent many an hour playing on hardened, DMZ'd servers, with only a sub-set of the usual GNU toy's to keep me company; frequently I've needed to save and send log or data extracts from these server back to my PC, or to others, and…
Learn how to get help with Linux/Unix bash shell commands. Use help to read help documents for built in bash shell commands.: Use man to interface with the online reference manuals for shell commands.: Use man to search man pages for unknown command…
Learn how to navigate the file tree with the shell. Use pwd to print the current working directory: Use ls to list a directory's contents: Use cd to change to a new directory: Use wildcards instead of typing out long directory names: Use ../ to move…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now