Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
?
Solved

replace content of certain NOT empty columns in a *.csv file by a defined string

Posted on 2016-10-17
5
Medium Priority
?
39 Views
Last Modified: 2016-10-24
Dear experts,
I have to replace the content of certain columns, which is NOT empty, leaving attributes of reccords ,where the content was emtpy, empty  in a *.csv file by a defined string using a unix shell script on AIX.
e.g. CSV file before processing, replacing NON empty columns 2 and 4 by ***

11111,2222,3333,4444,5555
aaaaa,,ccc,dd,ee

now after replacing column 2 and 4 by ***:
11111,***,3333,***,5555
aaaaa,cccc,***,ee

I've posted a simliar question earlier, the solution for just replacing column 2 and 4 with *** was

perl -F"\35" -ane '$F[1]=$F[3]="***";print join("\35",@F)' input_file.csv

With a column separator being 'Group Separator':
http://www.theasciicode.com.ar/ascii-control-characters/group-separator-ascii-code-29.html

Now the question is how to update the above script, that it only replaces NON empty records columns by *** leaving already empty columns empty and not replacing by ***

Many thanks
0
Comment
Question by:mruff
  • 3
  • 2
5 Comments
 
LVL 81

Assisted Solution

by:arnold
arnold earned 1500 total points
ID: 41847283
what is the relationship between the first row and the second row.
Or is it ok to replace 2 and 4 when the first column is numbers, but if it is alpha or alphanumeric, replace the third?
non empty, means that you have to split, check and then act accordingly.
In the current example, you are blindly replacing contents.

@array=split(/,/); #feeding line by line
test $array[1] and $array[3] for length, if 0 means there is nothing there, so you skip.
......
Must it be on a command line?
0
 

Author Comment

by:mruff
ID: 41847633
Hi LVL 76
I did not fully understand your question, the column to be checked for empty or content must always be the same regardless of potential content, may it be numeric or alpha or alphanumeric.
Yes, it must be unfortunately command line, with e.g. java that would be very easy.
I know that  the version
perl -F"\35" -ane '$F[1]=$F[3]="***";print join("\35",@F)' input_file.csv
replaces blindly, isn't there a solution with this version using regex?
e.g. having to possibility to evaluate $F[1] first, with a regex instead of directly assigning 'blindly' '***'
0
 
LVL 81

Expert Comment

by:arnold
ID: 41848679
you have in the example when the second and fourth columns are replaced, the first column is 111111.
IN the second row when the third column is replaced, it starts with aaaaa

Your current perl script is replacing whatever it finds in the positions outline with the new one.

perl has a match regex, but for that you would need to know the maximum number of columns in a line, or it is possible though I am not suffciently familiar to have multiple matches on the same line assigned to a match array that you will need to handle
the straight forward for me is to use @array=split(/,/) the array will have as many elements as needed based on the comma separated value. IT will not differentiate if there are commas enclosed by quotes......

So understanding the type of data you are processing is important to make sure not to ....
With the @array you can test the length($array[1]) column 2, and $array[3] column 4 in the one example, or $array[2] as column three under a different scenario.

Where is this CSV comming from, might it not be simpler to perform the task prior to the CSV export?
0
 

Accepted Solution

by:
mruff earned 0 total points
ID: 41850362
the solution was
awk -F";" 'BEGIN { OFS=";";}{ if ( $24 != "" ) $24="*PURGED*" ; print $0}' file1.dat > file2_.dat
to e.g. purge column 24 if it had content
0
 

Author Closing Comment

by:mruff
ID: 41856722
my comment was the final solution
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

FreeBSD on EC2 FreeBSD (https://www.freebsd.org) is a robust Unix-like operating system that has been around for many years. FreeBSD is available on Amazon EC2 through Amazon Machine Images (AMIs) provided by FreeBSD developer and security office…
Every server (virtual or physical) needs a console: and the console can be provided through hardware directly connected, software for remote connections, local connections, through a KVM, etc. This document explains the different types of consol…
Learn several ways to interact with files and get file information from the bash shell. ls lists the contents of a directory: Using the -a flag displays hidden files: Using the -l flag formats the output in a long list: The file command gives us mor…
Learn how to navigate the file tree with the shell. Use pwd to print the current working directory: Use ls to list a directory's contents: Use cd to change to a new directory: Use wildcards instead of typing out long directory names: Use ../ to move…
Suggested Courses

564 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question