Link to home
Start Free TrialLog in
Avatar of shragi
shragiFlag for India

asked on

sql query using csv data

Hi -

I have a csv file like below

1000001, 20140101
1000002, 20110101
1000003, 20100101
...
...
1000023, 20040101

Now i wrote a sql query like below
select * from emp
where emp_id = '1000001' and jdate = '20110101'

this is for one member but is there a way to write a query for all the values in the csv.
if i use my query then i need to feed empid, jdate to the query and the DB is called lot of times.
but can I avoid it and make one single call for all the values in csv and still get output.

I am using perl script to call the stored procedure where the above query is written

THanks,
Avatar of wilcoxon
wilcoxon
Flag of United States of America image

Why not construct the query in perl (from the csv data) and call it directly?

There are a couple of ways you can write a query that may do what you want (either as direct queries or stored procedure).

The simplest is imperfect but might be sufficient (it all depends on your data and how exact the query needs to be):
select * from emp
where emp_id in ('1000001','1000002',...,'1000023') and jdate in ('20140101','20100101',...,'20040101')

Open in new window


The problem with the above is that it will return results where emp_id = 1000001 and jdate is any of 20140101, 20100101, or 20040101.  If that is acceptable then you're done.

If not, then you need to create something like this and may run into limits of your db (either inherent or settings):
select * from emp
where (emp_id = '1000001' and jdate = '20140101')
or (emp_id = '1000002' and jdate = '20110101')
...
or (emp_id = '1000023' and jdate = '20040101')

Open in new window

Avatar of shragi

ASKER

I knew that the part I don't know how to get data from CSV.
I think I did not ask my question properly
I would like to know how do you read from that csv and run the query.
How many rows are in the csv file?  If it has a small number of rows, then you could parse it and build up the where clause to look like:
where (emp_id = 1000001 and jdate = '20140101')
   or (emp_id = 1000002 and jdate = '20110101')
   or (emp_id = 1000003 and jdate = '20100101')

But, if it has a large number of rows, then that wouldn't be very practical/efficient so I'd either execute 1 query perl row (using placeholders/bind values) or build the where clause as shown but limit the number of groups to a reasonable number and do the query in batches.
SOLUTION
Avatar of FishMonger
FishMonger
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Is the file in Microsoft CSV format or plain vanilla CSV format?  In other words, is it bare "words" that are comma separated or does it include additional formatting options (most commonly quotes around any strings)?

If it is vanilla/plain CSV, FishMonger's code should work.  If it is Microsoft format, then you have to either do more processing or (better) use Text::CSV.
Avatar of shragi

ASKER

@FishMonger

I got an error using the above perl script
"use of uninitialized value $date in concatenation"
That would seem to indicate that there are lines that do not contain jdate (either just emp_id or possibly an empty line).  You can add this line between lines 4 and 5 in FishMonger's script:
next unless (defined $id and defined $date);

Open in new window


That will silently skip those lines.  You could alter it to warn you about them and then skip them.
I'm not receiving that warning.  What changes did you make to the code I posted?

c:\test>type Perl-1.pl
#!/usr/bin/perl

use strict;
use warnings;

my @where;
while (<DATA>) {
    chomp;
    my ($id, $date) = split /, /;
    push @where, "(emp_id = $id and jdate = '$date')";
}

my $where = join ' or ', @where;
my $sql = "SELECT * FROM emp WHERE $where";

print $sql;

__DATA__
1000001, 20140101
1000002, 20110101
1000003, 20100101

Open in new window


c:\test>Perl-1.pl
SELECT * FROM emp WHERE (emp_id = 1000001 and jdate = '20140101') or (emp_id = 1000002 and jdate = '20110101') or (emp_id = 1000003 and jdate = '20100101')

Open in new window

Avatar of shragi

ASKER

let me give the complete error message

"use of uninitialized value $date in concatenation
SELECT * FROM emp where (emp_id = '100001,20150101' and jdate = )

so bothe emp_id and jdate went to emp_id
so it is not properly split

my date file is as below:
100001,20150101
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of shragi

ASKER

that worked thanks
Ah, I should have expected that the sample data was not formatted exactly the same as the actual data.