We help IT Professionals succeed at work.

Bulk inserts via JDBC

Koka
Koka asked
on
I have a Java app that continuosly inserts rows into a single table. I need to mimick the SQL loader conventional path load functionality in the following manner:
1. Use bulk inserts in a single call (just to speed it up)
2. Put all rejected rows into separate "BAD" file

When I insert them one-by-one all is fine and I have all set up, but I do not know how to insert rows in batches and get back info which rows were rejected to put them into bad file. Is there any method of batching inserts and getting back Array of error-codes or something similar through JDBC?
Thanks
Comment
Watch Question

Commented:
Don't use Oracle's batches.  They're implemented for compatibility, but they're not really any faster than submitting statements one by one.

You should be using a PreparedStatement to insert.
use the PreparedStatement.setXXX() methods to
put data from the file into the statement.

Then you just call executeUpdate() on the PreparedStatement, and loop to the next row.
Don't close or re-create the statement within the loop.
Close it at the end.

Use try {} catch(SQLException e)
to catch any errors, and echo the original input line you had out to your BAD file writer.




Author

Commented:
At the moment I'm executing CallableStatement (the called PL|SQL procedure actually consists only of insert statement) in a loop, with a try-commit catch-rollback blocks, closing statement/connection in final block . In fact perfoming commits,say, after 100 rows does not improve performance. But in this case, in the full analogy to your proposal data for each insert is passed to server in a single network call for EACH row, and I hope to improve performance with passing data for 100 rows in a single network call (e.g. pass array(100) of something and insert those 100 rows (bulk bind?) in a procedure that returns errors(100) array).

I just wonder what is the best or recommended design, if there is such. Thanks

Commented:
Oh.  So you'll be making some sort of PL/SQL block with
text like:

BEGIN
INSERT INTO M ( c1, c2 ) VALUES ( 'v1','v2');
INSERT INTO M ( c1, c2 ) VALUES ( 'v3','v4');
INSERT INTO M ( c1, c2 ) VALUES ( 'v5','v6');
INSERT INTO M ( c1, c2 ) VALUES ( 'v7','v8');
END;

?

Interesting idea.  I have no idea if it will actually be any faster.  Maybe it depends on if your network is slow.

Author

Commented:
oh no, if I pass arguments as arrays I can use
FORALL statement to use bulk binds in my procedure. That definitely will be faster, but it seems that proceeding this way is nearly equivalent to custom batch insert implementation through JAVA, and it seems unbelievable that Oracle has not implemented batch iserts.
And if Oracle has not done it already, that means there's some caveat in such approach. Or is there? That is the question :)

If not the mess with necessity of returning that varay containing error codes I'd try it w/o asking here, but you know, I'm a bit lazy :), so I'm waiting for someone to show me better or easier design for inserting tens of thousands calls into a busy database through JDBC.
I'm increasing points to 300, lazy people pay more :)

Author

Commented:
lazy people SHOULD pay more

Commented:
I'm not familiar with the FORALL pl/sql stuff.  PreparedStatements with re-binding is how I'd approach the problem.  Or, I'd beg my boss to go back to using SQLLoader, which really is a cool program ;-)

Author

Commented:
Yes, SQLLoader is cool - does your 'PreparedStatements with re-binding' perform like loader conv path? I'm almost sure it is noticably slower :).

That's why I'm sure there is a way to achieve similar performance - I do not want to write generated records to file and run loader through shell, extra disk IO is always costy.
hi,

"I have a Java app that continuosly inserts rows into a single table. I need to mimick the SQL loader
conventional path load functionality in the following manner:
1. Use bulk inserts in a single call (just to speed it up)
2. Put all rejected rows into separate "BAD" file

I do not know how to insert rows
in batches and get back info which rows were rejected to put them into bad file. Is there any method
of batching inserts and getting back Array of error-codes or something similar through JDBC?"
**********************************************************

SQL*LOADER is Oracle's high speed, bulk data loader, a command line program, not an API, cannot be called from a stored procedure.

It operates in two different modes:
a) Conventional Path - employs SQL INSERTS to load data.
b) Direct Path - does NOT use SQL, formats database blocks directly

The Direct Path load allows you to read data from a flat file, and write it directly to formatted database blocks, bypassing the entire SQL engine (and rollback and redo at the same time).  When used in parallel, direct path load is the fastest known way to load data to Oracle.

You have 2 choices:

A. OCI allows you to write your own direct path loader, a mini-SQLLDR in JAVA.

B. Use JAVA to run SQLLDR from the Operating System via an external proecure.

option B is easy, so here it is:

create or replace and compile java source named "Util"
as
import java.io.*;
import java.lang.*;

public class Util extends Object
{
 public static int RunThis(String[] args)
 {
 Runtime rt = Runtime.getRuntime();
 int     rc = -1;
 
 try
 {
  process p = rt.exec(args[0]);
     int bufSize = 4096;
     BufferedInputStream bis =
      new BufferedInputStream(p.getInputStream(), bufSize);
     int len;
     byte buffer[] = new byte[bufSize];
// Echo back program output
     while ((len = bis.read(buffer, 0, bufSize)) != -1)
       System.out.write(buffer, 0, len);
     rc = p.waitFor();
 }
 catch (Exception e)
 {
   e.printStackTrace();
   rc = -1;
 }
 finally
 {
   return rc;
 }
 }
}
/
Java Created.

This is setup to run any program, and capture the output to either a trace file on the server, or, if DBMS_JAVA is used, to the DBMS_OUTPUT buffer.

Any command can be run as the "Oracle" software account using this, were it not for the privileges needed (J2ES).  So grant permissions:

sqlplus>begin
dbms_java.grant_permission
 ( USER,
 'java.io.FilePermission',
 '/usr/app/oracle/product/8.1.6/bin/sqlldr',
 'execute');

dbms_java.grant_permission
 ( USER,
 'java.lang.RuntimePermission',
 '*',
 'writeFileDescriptor' );
end;
/

PL/SQL successful.

Next, binding layer:

sqlplus>create or replace
function run_cmd( p_cmd in varchar2) return number
as
language java
name 'Util.RunThis(java.lang.String[]) return integer';
/

Function created.

sqlplus> create or replace procedure rc(
as
 x number;
begin
 x := run_cmd(p_cmd);
 if ( x <> 0 )
 then
    raise program_error;
 end if;
end;
/
Procedure created.

... runs as a proce like this:

sqlplus> set serveroutput on size 1000000
sqlplus> exec dbms_java.set_output(1000000)
slplus> exec rc('sqlldr userid=scott/tiger control=load.ctl ... etc')

the end.
Eric


 

Author

Commented:
Thanks, Eric, but I doubt I'll use that approach. The problem is that my Java program raeds an input file and, based on some data retrieved from database prepares data for inserting into database. I do not want to put that data back to file formatted for SQL Loader cause it means extra disk IO. Instead, I want something like:

PeparedStatement ps = conn.prepareCall("Insert ? into Table")

ps.setBatch(20);

now start inserts in a loop
ps.setInt(1);
ps.execute();

Now statements will actually be sent in 20-insert batches, BUT  if one of those inserts in batch fails I'll have no info which one was rejected, and I'll have to rollback all 20 records :( I tested Oracle batching and it indeed did speed up things by approx 50% when there we no insert errors

On the other hand SQLLoader somehow knows which single insert did cause error, so I hope that there must be a way to achieve similar behaviour via JDBC too. I have never used C\C++, but I think I have seen some examples where one binds arrays in embedded SQL statements like
EXEC SQL insert :1 into table
and :1 there is an array. If so, how one handles errors, per insert, or per array?

Author

Commented:
Sorry 'PeparedStatement' should read OraclePeparedStatement' - :)
The questioner has asked that, since there is no apparent solution, his points be refunded and this question moved to the PAQ.

I have locked this question to preclude further comments except from the Experts involved; if no substantial objection is received within three (3) days, I will return and close the question accordingly.

Netminder
Community Support Moderator
Experts Exchange
Points refunded and question moved to the PAQ, per Koka's request.
http://www.experts-exchange.com/jsp/qShown.jsp?ta=commspt&qid=20257380

Netminder
Community Support Moderator
Experts Exchange