Koka
asked on
Bulk inserts via JDBC
I have a Java app that continuously inserts rows into a single table. I need to mimic the SQL*Loader conventional path load functionality in the following manner:
1. Use bulk inserts in a single call (just to speed it up)
2. Put all rejected rows into separate "BAD" file
When I insert them one-by-one all is fine and I have all set up, but I do not know how to insert rows in batches and get back info which rows were rejected to put them into bad file. Is there any method of batching inserts and getting back Array of error-codes or something similar through JDBC?
Thanks
ASKER
At the moment I'm executing a CallableStatement (the called PL/SQL procedure actually consists of only an insert statement) in a loop, with try-commit/catch-rollback blocks, closing the statement/connection in a finally block. In fact, performing commits, say, after every 100 rows does not improve performance. But in this case, in full analogy to your proposal, data for each insert is passed to the server in a single network call for EACH row, and I hope to improve performance by passing data for 100 rows in a single network call (e.g. pass an array(100) of something and insert those 100 rows (bulk bind?) in a procedure that returns an errors(100) array).
I just wonder what is the best or recommended design, if there is such. Thanks
Oh. So you'll be making some sort of PL/SQL block with
text like:
BEGIN
INSERT INTO M ( c1, c2 ) VALUES ( 'v1','v2');
INSERT INTO M ( c1, c2 ) VALUES ( 'v3','v4');
INSERT INTO M ( c1, c2 ) VALUES ( 'v5','v6');
INSERT INTO M ( c1, c2 ) VALUES ( 'v7','v8');
END;
?
Interesting idea. I have no idea if it will actually be any faster. Maybe it depends on whether your network is slow.
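A minimal Java-side sketch of that idea (table M and columns c1/c2 are taken from the block above; note that inlining literal values like this is only for illustration - real code should use bind variables to avoid quoting problems):

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

public class AnonBlockBuilder {
    // Build one BEGIN ... END; block containing an INSERT per row,
    // so the whole batch travels to the server in a single call.
    static String buildBlock(List<String[]> rows) {
        StringBuilder sb = new StringBuilder("BEGIN\n");
        for (String[] r : rows) {
            sb.append("  INSERT INTO M ( c1, c2 ) VALUES ( '")
              .append(r[0]).append("','").append(r[1]).append("');\n");
        }
        return sb.append("END;").toString();
    }

    static void run(Connection conn, List<String[]> rows) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(buildBlock(rows))) {
            cs.execute();  // one network round trip for all the rows
        }
    }
}
```

One caveat: if any single INSERT inside the block raises an error, the whole anonymous block fails, so this alone does not tell you which row was bad.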
ASKER
oh no, if I pass arguments as arrays I can use
FORALL statement to use bulk binds in my procedure. That definitely will be faster, but it seems that proceeding this way is nearly equivalent to a custom batch insert implementation through Java, and it seems unbelievable that Oracle has not implemented batch inserts.
And if Oracle has not done it already, that means there's some caveat in such approach. Or is there? That is the question :)
If it weren't for the mess of having to return that varray containing error codes I'd try it without asking here, but you know, I'm a bit lazy :), so I'm waiting for someone to show me a better or easier design for inserting tens of thousands of rows into a busy database through JDBC.
I'm increasing points to 300, lazy people pay more :)
ASKER
lazy people SHOULD pay more
I'm not familiar with the FORALL pl/sql stuff. PreparedStatements with re-binding is how I'd approach the problem. Or, I'd beg my boss to go back to using SQLLoader, which really is a cool program ;-)
ASKER
Yes, SQLLoader is cool - does your 'PreparedStatements with re-binding' perform like the loader conventional path? I'm almost sure it is noticeably slower :).
That's why I'm sure there is a way to achieve similar performance - I do not want to write the generated records to a file and run the loader through a shell; extra disk IO is always costly.
hi,
"I have a Java app that continuously inserts rows into a single table. I need to mimic the SQL loader
conventional path load functionality in the following manner:
1. Use bulk inserts in a single call (just to speed it up)
2. Put all rejected rows into separate "BAD" file
I do not know how to insert rows
in batches and get back info which rows were rejected to put them into bad file. Is there any method
of batching inserts and getting back Array of error-codes or something similar through JDBC?"
**************************
SQL*Loader is Oracle's high-speed bulk data loader. It is a command-line program, not an API, and cannot be called directly from a stored procedure.
It operates in two different modes:
a) Conventional Path - employs SQL INSERTS to load data.
b) Direct Path - does NOT use SQL, formats database blocks directly
The Direct Path load allows you to read data from a flat file, and write it directly to formatted database blocks, bypassing the entire SQL engine (and rollback and redo at the same time). When used in parallel, direct path load is the fastest known way to load data to Oracle.
You have 2 choices:
A. OCI allows you to write your own direct path loader, a mini-SQLLDR in JAVA.
B. Use Java to run SQLLDR from the operating system via an external procedure.
option B is easy, so here it is:
create or replace and compile java source named "Util"
as
import java.io.*;
import java.lang.*;
public class Util extends Object
{
public static int RunThis(String[] args)
{
Runtime rt = Runtime.getRuntime();
int rc = -1;
try
{
Process p = rt.exec(args[0]);
int bufSize = 4096;
BufferedInputStream bis =
new BufferedInputStream(p.getInputStream(), bufSize);
int len;
byte buffer[] = new byte[bufSize];
// Echo back program output
while ((len = bis.read(buffer, 0, bufSize)) != -1)
System.out.write(buffer, 0, len);
rc = p.waitFor();
}
catch (Exception e)
{
e.printStackTrace();
rc = -1;
}
finally
{
return rc;
}
}
}
/
Java Created.
This is set up to run any program and capture the output either to a trace file on the server or, if DBMS_JAVA is used, to the DBMS_OUTPUT buffer.
Any command could be run as the "oracle" software account using this, were it not for the Java 2 security privileges required. So grant permissions:
sqlplus>begin
dbms_java.grant_permission
( USER,
'java.io.FilePermission',
'/usr/app/oracle/product/8.1.6/bin/sqlldr',
'execute');
dbms_java.grant_permission
( USER,
'java.lang.RuntimePermission',
'*',
'writeFileDescriptor' );
end;
/
PL/SQL successful.
Next, binding layer:
sqlplus>create or replace
function run_cmd( p_cmd in varchar2) return number
as
language java
name 'Util.RunThis(java.lang.String[]) return integer';
/
Function created.
sqlplus> create or replace procedure rc( p_cmd in varchar2 )
as
x number;
begin
x := run_cmd(p_cmd);
if ( x <> 0 )
then
raise program_error;
end if;
end;
/
Procedure created.
... run the procedure like this:
sqlplus> set serveroutput on size 1000000
sqlplus> exec dbms_java.set_output(1000000)
sqlplus> exec rc('sqlldr userid=scott/tiger control=load.ctl ... etc')
the end.
Eric
ASKER
Thanks, Eric, but I doubt I'll use that approach. The problem is that my Java program reads an input file and, based on some data retrieved from the database, prepares data for inserting into the database. I do not want to put that data back into a file formatted for SQL*Loader because it means extra disk IO. Instead, I want something like:
PeparedStatement ps = conn.prepareCall("Insert ? into Table")
ps.setBatch(20);
now start inserts in a loop
ps.setInt(1);
ps.execute();
Now statements will actually be sent in 20-insert batches, BUT if one of those inserts in the batch fails I'll have no info about which one was rejected, and I'll have to roll back all 20 records :( I tested Oracle batching and it indeed did speed things up by approx 50% when there were no insert errors.
On the other hand SQLLoader somehow knows which single insert caused the error, so I hope that there must be a way to achieve similar behaviour via JDBC too. I have never used C/C++, but I think I have seen some examples where one binds arrays in embedded SQL statements like
EXEC SQL insert :1 into table
and :1 there is an array. If so, how one handles errors, per insert, or per array?
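For the record, standard JDBC does expose per-statement batch results: executeBatch() returns an int[] of update counts, and on failure a BatchUpdateException carries the counts gathered so far via getUpdateCounts(). Whether a driver continues past a failed row (marking it Statement.EXECUTE_FAILED) or stops at the first failure is driver-dependent, so a sketch like the following (table M and its columns are placeholders) has to handle both shapes:

```java
import java.sql.BatchUpdateException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class BatchInsert {
    // From the update counts a BatchUpdateException carries, work out
    // which batch positions failed. Drivers that continue after an
    // error mark failed rows with Statement.EXECUTE_FAILED; drivers
    // that stop early return a short array, and the rows beyond it
    // were never attempted.
    static List<Integer> failedRows(int[] updateCounts, int batchSize) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < updateCounts.length; i++) {
            if (updateCounts[i] == Statement.EXECUTE_FAILED) {
                bad.add(i);
            }
        }
        for (int i = updateCounts.length; i < batchSize; i++) {
            bad.add(i);
        }
        return bad;
    }

    static void insertBatch(Connection conn, List<String[]> rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO M ( c1, c2 ) VALUES ( ?, ? )")) {
            for (String[] r : rows) {
                ps.setString(1, r[0]);
                ps.setString(2, r[1]);
                ps.addBatch();
            }
            try {
                ps.executeBatch();  // one round trip for the whole batch
            } catch (BatchUpdateException e) {
                for (int i : failedRows(e.getUpdateCounts(), rows.size())) {
                    System.err.println("BAD: " + String.join(",", rows.get(i)));
                }
            }
        }
    }
}
```

Note that rows flagged as "never attempted" are not necessarily bad; re-submitting just those rows one at a time is the safe way to separate the single rejected row from the rest of the batch.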
ASKER
Sorry, 'PeparedStatement' should read 'OraclePreparedStatement' - :)
ASKER CERTIFIED SOLUTION
Points refunded and question moved to the PAQ, per Koka's request.
https://www.experts-exchange.com/jsp/qShown.jsp?ta=commspt&qid=20257380
Netminder
Community Support Moderator
Experts Exchange
You should be using a PreparedStatement to insert.
Use the PreparedStatement.setXXX() methods to put data from the file into the statement.
Then you just call executeUpdate() on the PreparedStatement, and loop to the next row.
Don't close or re-create the statement within the loop.
Close it at the end.
Use try {} catch (SQLException e) to catch any errors, and echo the original input line out to your BAD file writer.
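A sketch of that loop (one reused PreparedStatement, rejected rows echoed to the BAD writer; the table and column names are placeholders):

```java
import java.io.Writer;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class RowByRowLoader {
    // Format a rejected input row for the BAD file (simple CSV echo).
    static String badLine(String[] fields) {
        return String.join(",", fields);
    }

    // The statement is prepared once, outside the loop, and only
    // re-bound for each row; a per-row failure is logged and skipped
    // instead of aborting the whole load.
    static void load(Connection conn, List<String[]> rows, Writer bad) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO M ( c1, c2 ) VALUES ( ?, ? )")) {
            for (String[] row : rows) {
                try {
                    ps.setString(1, row[0]);
                    ps.setString(2, row[1]);
                    ps.executeUpdate();
                } catch (SQLException e) {
                    bad.write(badLine(row));
                    bad.write('\n');
                }
            }
        }
    }
}
```

This is the slowest but simplest shape: one network call per row, exact per-row error information, no ambiguity about what goes in the BAD file.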