How to delete a portion of a string

grazal
Say I have the following comma-delimited lines contained in a file:
669217900,UPS GROUND #Y36935,my name/5265173230,12820 McKinnon apt #8,Brea,CA
573484949,UPS Ground,your name/1343535644*,22677 8th street,LosAngeles,CA
345848960,3535454545455454,someone/1234556700,bldg 6 apt 2 Attn:Owner,Brea,CA
123439450, z23435454554,common name,3333 sandoval street, Fullerton, ca

I want to remove the slash (/) and any characters that follow it in the third column of each line, such as the following:
/5265173230
/1343535644*
/1234556700

I want to rewrite the file and the output needs to be:
669217900,UPS GROUND #Y36935,my name,12820 McKinnon apt #8,Brea,CA
573484949,UPS Ground,your name,22677 8th street,LosAngeles,CA
345848960,3535454545455454,someone,bldg 6 apt 2 Attn:Owner,Brea,CA
123439450, z23435454554,common name,3333 sandoval street, Fullerton, ca

How do I write this in Perl? Please advise.
ozo
Most Valuable Expert 2014
Top Expert 2015

Commented:
s#/[^,]*##;
ozo

Commented:
perl -i.bak -pe 's#/[^,]*##' file
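
For anyone unsure what the substitution does, here is a minimal sketch that applies it to two of the sample lines from the question. The sub name `strip_after_slash` is made up for illustration; only the `s#/[^,]*##` part comes from the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply the substitution to one line.
sub strip_after_slash {
    my ($line) = @_;
    # '#' is the delimiter here, so the '/' needs no escaping.
    # Matches the first '/' on the line plus every character up to the next comma.
    $line =~ s#/[^,]*##;
    return $line;
}

my @samples = (
    '669217900,UPS GROUND #Y36935,my name/5265173230,12820 McKinnon apt #8,Brea,CA',
    '123439450, z23435454554,common name,3333 sandoval street, Fullerton, ca',
);

for my $sample (@samples) {
    my $result = strip_after_slash($sample);
    print "$result\n";
}
```

Note that the pattern is not tied to the third column: it removes the first slash found anywhere on the line, and only that one, since there is no /g modifier. For the sample data that happens to be the third column.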
wilcoxon

Commented:
ozo's answer will work, but I would lean toward a less efficient but more exact regex.

perl -i.bak -pe 's{^(\d+,[^,]+,[^/,]+)/[^,]+(,.*)$}{$1$2}' file

This will explicitly only remove / followed by characters in the third field.
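
To make the difference concrete, here is a small sketch comparing the two substitutions on a line that has a slash in the second field as well. The test line and both sub names are invented for this example; the regexes are the ones posted in this thread.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the two substitutions from this thread.
sub strip_first_slash {
    my ($line) = @_;
    $line =~ s#/[^,]*##;    # first slash anywhere on the line
    return $line;
}

sub strip_third_field_slash {
    my ($line) = @_;
    # Anchored at the start: digits, a second field, then the third field up to its slash.
    $line =~ s{^(\d+,[^,]+,[^/,]+)/[^,]+(,.*)$}{$1$2};
    return $line;
}

# Made-up input: note the extra slash in "UPS/Ground".
my $line = '573484949,UPS/Ground,your name/1343535644*,22677 8th street,LosAngeles,CA';

my $loose  = strip_first_slash($line);
my $strict = strip_third_field_slash($line);
print "$loose\n";     # second field loses "/Ground", third field is untouched
print "$strict\n";    # only the third field is trimmed
```

With data like the original sample, where the only slash is in the third column, both versions give the same result.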
ozo

Commented:
The .* and $ seem unnecessary
And + instead of * won't match if the fields are empty,
which may or may not be what is desired.
wilcoxon

Commented:
You are correct, I could have made the regex simply:

perl -i.bak -pe 's{^(\d+,[^,]+,[^/,]+)/[^,]+,}{$1,}' file

Also, you are right - I'm assuming the first two fields and first part of the third field must have data.  If not, the +s should be replaced with *s.

Author

Commented:
wilcoxon,

Curiously, I don't see the +s in the command you posted above? (That is where you added this comment: "If not, the +s should be replaced with *s.")
ozo

Commented:
perl -i.bak -pe 's{^(\d+,[^,]+,[^/,]+)/[^,]+,}{$1,}' file
perl -i.bak -pe 's{^(\d*,[^,]*,[^/,]*)/[^,]*,}{$1,}' file
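
The two commands above differ only in the quantifiers. As a rough illustration of what that means in practice, this sketch runs both patterns against a made-up line whose second field is empty. The sub names `strip_plus` and `strip_star` are hypothetical.

```perl
use strict;
use warnings;

# '+' version: every field before the slash must contain at least one character.
sub strip_plus {
    my ($line) = @_;
    $line =~ s{^(\d+,[^,]+,[^/,]+)/[^,]+,}{$1,};
    return $line;
}

# '*' version: any of those fields may be empty.
sub strip_star {
    my ($line) = @_;
    $line =~ s{^(\d*,[^,]*,[^/,]*)/[^,]*,}{$1,};
    return $line;
}

# Invented input with an empty second field.
my $line = '573484949,,your name/1343535644*,22677 8th street';

my $with_plus = strip_plus($line);
my $with_star = strip_star($line);
print "$with_plus\n";    # unchanged: [^,]+ cannot match the empty field
print "$with_star\n";    # the "/1343535644*" part is removed
```

Which one is right depends on whether lines with empty fields should be cleaned up or left alone.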

Author

Commented:
ozo, thank you for your response!

The above commands are one-liners, but what I would prefer is to have the commands inside a Perl script that writes each line of the output using a while loop, say:

open(my $file, "</tmp/file1.csv");
while (<$file>)
   { . . . . .
     . . . . .   }
close($file);

what would be the commands inside the brackets?
wilcoxon

Commented:
As a script, I'd structure it a little differently...

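open(my $file, '<', '/tmp/file1.csv') or die "Cannot open /tmp/file1.csv: $!";
while (<$file>) {
    # $1 = fields 1-3 up to (not including) the slash, $2 = the rest of the line.
    # If fields may be empty, use m{^(\d*,[^,]*,[^/,]*)/[^,]*(,.*)} instead.
    if (m{^(\d+,[^,]+,[^/,]+)/[^,]+(,.*)}) {
        # Note: . does not match the newline, so $2 ends just before it.
        print $1, $2;
    } else {
        # No slash in the third field: print the line unchanged.
        print;
    }
}
close($file);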
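
Since the original question asks to rewrite the file, here is one possible variation, offered as a sketch rather than a definitive version. It uses a substitution instead of the if/else print, reads the whole file into memory (an assumption that the file is small), and then overwrites it. The sub name `rewrite_csv` and the temporary demo file are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: rewrite $path in place, dropping "/..." from the third field.
sub rewrite_csv {
    my ($path) = @_;

    open(my $in, '<', $path) or die "Cannot read $path: $!";
    my @lines = <$in>;    # assumes the file fits in memory
    close($in);

    for my $line (@lines) {
        # Only the matched part is replaced, so the rest of the line
        # (including its newline) is left exactly as it was.
        $line =~ s{^(\d+,[^,]+,[^/,]+)/[^,]*}{$1};
    }

    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} @lines;
    close($out) or die "Cannot close $path: $!";
    return;
}

# Demo on a throwaway file instead of /tmp/file1.csv.
my ($fh, $demo_path) = tempfile(UNLINK => 1);
print {$fh} "669217900,UPS GROUND #Y36935,my name/5265173230,12820 McKinnon apt #8,Brea,CA\n";
print {$fh} "123439450, z23435454554,common name,3333 sandoval street, Fullerton, ca\n";
close($fh);

rewrite_csv($demo_path);

open(my $check, '<', $demo_path) or die "Cannot read $demo_path: $!";
while (my $line = <$check>) {
    print $line;
}
close($check);
```

Unlike the -i.bak one-liners, this sketch keeps no backup, so it would be wise to copy the file first when trying it on real data.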


Author

Commented:
Almost there... one thing is wrong though: one comma was added after the 3rd column. For example:
669217900,UPS GROUND #Y36935,my name/5265173230,12820 McKinnon apt #8,Brea,CA
573484949,UPS Ground,your name/1343535644*,22677 8th street,LosAngeles,CA

became:
669217900,UPS GROUND #Y36935,my name,,12820 McKinnon apt #8,Brea,CA
573484949,UPS Ground,your name,,22677 8th street,LosAngeles,CA

The results I am looking for should be:
669217900,UPS GROUND #Y36935,my name,12820 McKinnon apt #8,Brea,CA
573484949,UPS Ground,your name,22677 8th street,LosAngeles,CA

please advise....

Author

Commented:
Oops, I meant to say: "two things wrong though: one space and one comma were added after the 3rd column."

I also meant to say, it became:
669217900,UPS GROUND #Y36935,my name,  ,12820 McKinnon apt #8,Brea,CA
573484949,UPS Ground,your name,  ,22677 8th street,LosAngeles,CA

Author

Commented:
figured it out!  
to get the right output, the print statement should be "print $1$2", not "print $1,  $2"


thank you so much!
wilcoxon

Commented:
There should be no difference between the two lines below:

print "$1$2";
print $1, $2;

Did you do the below instead (it's the most likely case I could think of that would result in a comma and space)?

print "$1, $2";
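
A quick way to see the difference is to build both strings without printing them directly. This is only a sketch: the variable names are made up, and join('', ...) stands in for what print does with a list when the output field separator `$,` is left at its default.

```perl
use strict;
use warnings;

my $line = "573484949,UPS Ground,your name/1343535644*,22677 8th street,LosAngeles,CA\n";
my ($list_form, $quoted_form) = ('', '');

if ($line =~ m{^(\d+,[^,]+,[^/,]+)/[^,]+(,.*)}) {
    # print $1, $2 writes its arguments one after another with nothing in
    # between (as long as $, is unset), which is what join('', ...) produces.
    $list_form = join('', $1, $2);

    # Inside double quotes the comma and the space are literal characters.
    $quoted_form = "$1, $2";
}

print "$list_form\n";
print "$quoted_form\n";
```

The second string contains "your name, ,22677", which is the extra comma and space reported above. Neither string ends with a newline on its own, because . in the pattern stops before the "\n".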

Author

Commented:
Yes, you're absolutely correct. I needed the line break, and when I added the double quotes and the \n, the result had the extra space and the extra comma, as in:

print "$1, $2\n";

So, to get the output I wanted, I therefore changed it to:

print "$1$2\n";


Thank you again!
wilcoxon

Commented:
Very slightly more efficient would be:

print $1, $2, "\n";

In any case, glad I could help and you figured it out.
