Solved

replace content of certain NOT empty columns in a *.csv file by a defined string

Posted on 2016-10-17
5
18 Views
Last Modified: 2016-10-24
Dear experts,
I have to replace the content of certain columns, which is NOT empty, leaving attributes of reccords ,where the content was emtpy, empty  in a *.csv file by a defined string using a unix shell script on AIX.
e.g. CSV file before processing, replacing NON empty columns 2 and 4 by ***

11111,2222,3333,4444,5555
aaaaa,,ccc,dd,ee

now after replacing column 2 and 4 by ***:
11111,***,3333,***,5555
aaaaa,cccc,***,ee

I've posted a simliar question earlier, the solution for just replacing column 2 and 4 with *** was

perl -F"\35" -ane '$F[1]=$F[3]="***";print join("\35",@F)' input_file.csv

With a column separator being 'Group Separator':
http://www.theasciicode.com.ar/ascii-control-characters/group-separator-ascii-code-29.html

Now the question is how to update the above script, that it only replaces NON empty records columns by *** leaving already empty columns empty and not replacing by ***

Many thanks
0
Comment
Question by:mruff
  • 3
  • 2
5 Comments
 
LVL 77

Assisted Solution

by:arnold
arnold earned 500 total points
ID: 41847283
what is the relationship between the first row and the second row.
Or is it ok to replace 2 and 4 when the first column is numbers, but if it is alpha or alphanumeric, replace the third?
non empty, means that you have to split, check and then act accordingly.
In the current example, you are blindly replacing contents.

@array=split(/,/); #feeding line by line
test $array[1] and $array[3] for length, if 0 means there is nothing there, so you skip.
......
Must it be on a command line?
0
 

Author Comment

by:mruff
ID: 41847633
Hi LVL 76
I did not fully understand your question, the column to be checked for empty or content must always be the same regardless of potential content, may it be numeric or alpha or alphanumeric.
Yes, it must be unfortunately command line, with e.g. java that would be very easy.
I know that  the version
perl -F"\35" -ane '$F[1]=$F[3]="***";print join("\35",@F)' input_file.csv
replaces blindly, isn't there a solution with this version using regex?
e.g. having to possibility to evaluate $F[1] first, with a regex instead of directly assigning 'blindly' '***'
0
 
LVL 77

Expert Comment

by:arnold
ID: 41848679
you have in the example when the second and fourth columns are replaced, the first column is 111111.
IN the second row when the third column is replaced, it starts with aaaaa

Your current perl script is replacing whatever it finds in the positions outline with the new one.

perl has a match regex, but for that you would need to know the maximum number of columns in a line, or it is possible though I am not suffciently familiar to have multiple matches on the same line assigned to a match array that you will need to handle
the straight forward for me is to use @array=split(/,/) the array will have as many elements as needed based on the comma separated value. IT will not differentiate if there are commas enclosed by quotes......

So understanding the type of data you are processing is important to make sure not to ....
With the @array you can test the length($array[1]) column 2, and $array[3] column 4 in the one example, or $array[2] as column three under a different scenario.

Where is this CSV comming from, might it not be simpler to perform the task prior to the CSV export?
0
 

Accepted Solution

by:
mruff earned 0 total points
ID: 41850362
the solution was
awk -F";" 'BEGIN { OFS=";";}{ if ( $24 != "" ) $24="*PURGED*" ; print $0}' file1.dat > file2_.dat
to e.g. purge column 24 if it had content
0
 

Author Closing Comment

by:mruff
ID: 41856722
my comment was the final solution
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Let's say you need to move the data of a file system from one partition to another. This generally involves dismounting the file system, backing it up to tapes, and restoring it to a new partition. You may also copy the file system from one place to…
Introduction Regular patching is part of a system administrator's tasks. However, many patches require that the system be in single-user mode before they can be installed. A cluster patch in particular can take quite a while to apply if the machine…
Learn how to get help with Linux/Unix bash shell commands. Use help to read help documents for built in bash shell commands.: Use man to interface with the online reference manuals for shell commands.: Use man to search man pages for unknown command…
Learn how to find files with the shell using the find and locate commands. Use locate to find a needle in a haystack.: With locate, check if the file still exists.: Use find to get the actual location of the file.:

911 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

20 Experts available now in Live!

Get 1:1 Help Now