trevor1940

quoting a comma separated list

Hi

I have a comma separated list that has 1 or more elements

something like  

my $list = "One,Two,Three";
or
my $list = "One";



when  printed I need $list to look like this

'One','Two','Three'



eventually it will form part of a sql query thus

my $sql = "select * from table where stuff IN ($list)"; ## only to illustrate



what is the quickest way of doing this?
dda

Something like this:

my $list = "One,Two,Three";
my @list = split /,/, $list;
s/(.+)/"$1"/ foreach (@list);
my $formatted = join ',', @list;


SOLUTION
tel2

Thanks tel2,  you are right, I used wrong quotes in the output.
Totally forgivable, dda.  A minor mistake.
Greetings from NZ to Russia.
trevor1940

ASKER

@tel2

I had thought of using substitution wasn't sure if this  good practice?

  s/.+/'$&'/ for @list;
my $formatted = join ',', @list;

print "$formatted [$&] \n"; 

outputs 

'One','Two','Three' []



having never seen $& I googled

This link suggests

Never use $&, except maybe when golfing, or on a one-liner where efficiency or good style is not an issue.

In your example $& only exists in the for loop so I'm assuming it's OK?


Usage of $& etc. imposes an overhead on all pattern matches globally. You don't want that.

Does this mean it's not local, i.e. not confined to the inside of a loop, if, or sub?
Hi trevor,

> "I had thought of using substitution wasn't sure if this good practice?"
I assume you're talking about the substitution in the code that you included above (i.e. dda's solution that I modified slightly), rather than the substitution in my own solution, are you?

Either way, sorry - I hadn't heard of that stuff about $&, and I don't know the answers to your questions, but thanks for telling me.
But yes, I expect $& is just not visible later because it's local to the for loop, as you suggest.  Same problem occurs with $1.  It is visible later if you do a substitution without a loop.
Just use this not quite so abbreviated version of that line, if you want to play safe:
        s/(.+)/'$1'/ for @list;

But did you see my own solution, which is a single line of code which doesn't use arrays?  It's at the top of my first post.  Any concerns with that option?
Why are you using a scalar to hold a comma-separated list of values?  You should be using an array.

The methods suggested so far will have problems if there are quotes already within the string.  A better approach would be to use DBI's quote method.

my $string = join q{,}, map $dbh->quote($_), @list;


or
my $string = join q{,}, map $dbh->quote($_), split /,/, $list;
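
To illustrate the concern about embedded quotes, here is a small sketch that runs a made-up value through the regex-based quoting from earlier in the thread (`O'Brien` is purely an assumption for the demonstration). It only shows the problem; `$dbh->quote` needs a connected database handle, so it is not called here.

```perl
use strict;
use warnings;

# Hypothetical input: the second element contains a single quote.
my $list = "One,O'Brien";

# Regex-based quoting as suggested earlier in the thread.
my @list = split /,/, $list;
s/(.+)/'$1'/ for @list;
my $formatted = join ',', @list;

# The embedded quote is left unescaped, so the second SQL literal is malformed.
print "$formatted\n";    # prints 'One','O'Brien'
```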



An even better approach (to prevent sql injection) would be to use placeholders and pass the list in the execute statement instead of the prepare statement.
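
A rough sketch of the placeholder approach, assuming a hypothetical table `my_table` and a connected DBI handle `$dbh` (neither comes from this thread). The SQL text carries only `?` markers and the values are handed to `execute` separately. The DBI calls are left as comments so the snippet runs without a database.

```perl
use strict;
use warnings;

my $list   = "One,Two,Three";
my @values = split /,/, $list;

# One '?' per value; the parentheses make x repeat a list, not a string.
my $placeholders = join ', ', ('?') x @values;

# my_table is an illustrative name only.
my $sql = "select * from my_table where stuff IN ($placeholders)";
print "$sql\n";    # prints select * from my_table where stuff IN (?, ?, ?)

# With a connected handle $dbh (assumed, not created here):
#   my $sth = $dbh->prepare($sql);
#   $sth->execute(@values);
```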
Fair points FishMonger.  Depends on the source and possible content of the data.

Trevor,
I found this about $&:
 "WARNING: If your code is to run on Perl 5.16 or earlier, beware that once Perl sees that you need one of $& , $` , or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program."
The above and more info on that issue can be found here: http://perldoc.perl.org/perlre.html
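
For code that may have to run on those older Perls, the same perlre page describes an alternative: the `/p` modifier together with `${^MATCH}` (available since Perl 5.10), which exposes the matched text without the program-wide penalty. A minimal sketch applied to the list from this question:

```perl
use strict;
use warnings;

my @list = split /,/, "One,Two,Three";

# /p preserves the match for this pattern only; ${^MATCH} then plays
# the role of $& inside the replacement.
s/.+/'${^MATCH}'/p for @list;

my $formatted = join ',', @list;
print "$formatted\n";    # prints 'One','Two','Three'
```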

Regarding the globalness of $& (and even $1), I think this code:
$var = "Outer";
$var =~ /.(.+)/;
print "1st=$1\nAll=$&\n";
{
        $var = "Inner";
        $var =~ /.(.+)/;
        print "1st=$1\nAll=$&\n";
}
print "1st=$1\nAll=$&\n";


which produces this output:
1st=uter
All=Outer
1st=nner
All=Inner
1st=uter
All=Outer


proves $& and $1 are both local to their block, so I don't think that is the point being made when the guy said:
   "Usage of $& etc. imposes an overhead on all pattern matches globally."
It sounds as if he's just referring to what is being said in the "WARNING" I pasted above.
ASKER CERTIFIED SOLUTION
Yes, FishMonger, if it's an environment that is prone to SQL injection.

But I don't think your line 8 is going to work.  Have you tested it?  Is this the kind of thing you meant?:
    my $sql  = 'select * from table where stuff IN (' . join(',', '?' x @list) . ')';
You're right, I didn't test it.  Your adjusted version is correct.

In production scripts I add additional vertical and horizontal whitespace to make it more readable and maintainable.
my $sql  = "select *
            from table
            where stuff IN (" . join(',', ('?') x @list) . ")";


I might even adjust that a little more.
What is the need to have:
    ('?') x @list
instead of just:
    '?' x @list
The parens are needed to put it into list context.  Without them the where clause would be:
where stuff IN (???)
instead of the required
where stuff IN (?,?,?)

If you want more readable, add a space to the join statement.
join(', ', ('?') x @list)
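
The difference between the two forms can be checked with a short sketch (the variable names `$string_form` and `$list_form` are made up for the demonstration):

```perl
use strict;
use warnings;

my @list = ('One', 'Two', 'Three');

# Left operand is a plain string: x repeats the string itself.
my $string_form = join ',', '?' x @list;

# Left operand is a parenthesised list: x repeats the list elements.
my $list_form = join ',', ('?') x @list;

print "$string_form\n";    # prints ???
print "$list_form\n";      # prints ?,?,?
```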
OK - thanks.
@tel2

> "I had thought of using substitution wasn't sure if this good practice?"
I assume you're talking about the substitution in the code.............

I wasn't, actually. I was referring to your simple solution:

    my $list = "One,Two,Three";
    ($list = "'" . $list . "'") =~ s/,/','/g;



In the current context it would probably be good enough, as the list elements only consist of 3 figures from a known source, so there is no possibility of SQL injection. However, I think FishMonger's solution of using placeholders and passing the list in the execute statement is better practice. I didn't know you could do that.

Thanx for the info on the use of "$1" & "$&"
I guess it would be better practice, trevor, except in situations where you know your source data will never have the issues which require the extra complexity.
I wouldn't use the word "except".  The approach I suggested IS the better practice in either case, but using the regex approach is acceptable if you know that the data will always be coming from a known/trusted source and format.

I have several scripts I wrote years ago where the input data came from, and was expected to always come from, a trusted source, so I used acceptable but not best practices when parsing that data.  Over time things changed: the building of the input data was farmed out to a 3rd party and could no longer be trusted in the same way.  I started to have random failures, which took a while to troubleshoot because of the data source assumption and the acceptable but not best practice code.
I used the word "except", in the context of my sentence, which included:
     "...where you know your source data will never have the issues which require the extra complexity".

If you don't "know" it (which is more than just "expecting" it), then it might be best to go for the more complex option.

But if you do know it, or are happy with the risks, and you just want "the quickest way of doing this" (as was specified in the original post), then simpler options may be appropriate.

(I "know" that there are technically few things we can 100% "know" in this life, but let's just say I'm talking about knowing it beyond reasonable doubt, or to a level which is appropriate for the application.)
Thanx for your help and the explanations

Hope the point share is fair
Thanks for the points, trevor.

Personally I think dda's answer was worth some points, but it's up to you.