Performance Limitations when using JDBC addBatch()

I've yet to find an article that explains the possible performance hits when using the addBatch() functionality. I'm iterating over around a million records, and I'm wondering whether there is a difference between calling addBatch() for all the records and executing once, or calling executeBatch() at certain intervals. Basically, I'm wondering if executeBatch() is going to somehow free up memory resources.
nixj14 asked:
Jim Cakalic (Senior Developer/Architect) commented:
You're right. There isn't much in the way of batch size recommendations. This is likely because choosing an appropriate batch size depends, I think, on a number of variables:

- size of the statement. The larger the statement and its associated context (statement text itself and bound parameter values) the greater the memory requirements per statement.

- transaction size. Since autoCommit should be disabled before constructing and executing a batch, all statements in the batch are performed in a single transaction. The larger the transaction, the greater the potential overhead in the database engine and the possibility of running out of transaction log space in the middle.

- concurrency. Directly related to transaction size. If you're performing a large number of inserts say on a single table then there will be a large number of row locks held against the table for the life of the transaction. Many database engines automatically upgrade row locks to table locks at some threshold. If this happens, then other writers to the table and possibly readers will be queued while the transaction completes.

- jdbc driver. This always has a large effect on performance.

- database vendor. Specifically with respect to transaction handling.

Ultimately, an optimal batch size needs to take into consideration these variables, which are related to your application and its execution environment. Oracle tends to recommend batch sizes between 5 and 30 statements. In testing that I've done, this seems to be a good recommendation. The batch size is large enough to provide significant performance gains over single-row inserts (even when autoCommit is disabled) but not so large as to cause concern about the negative side effects mentioned above. This range of batch size conforms to Oracle's own non-standard batching implementation recommendations, so I'm fairly comfortable with it. The performance increase isn't necessarily linear as the batch size increases, which is another indicator that application-specific testing should be done.

I would suggest that inserting all 1,000,000 rows in a single batch is likely to be a poor decision regardless of your driver/database. I'd start by batching perhaps 10 inserts and incrementally increase the size until you are comfortable with the performance.
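
Something like this sketch is what I have in mind; the table name, columns, and interval of 20 are placeholders to adapt through your own testing. Since executeBatch() clears the statement's batch once it returns, executing at intervals also keeps the client-side memory for queued statements bounded, which addresses your original concern:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchInsertExample {
    // Hypothetical starting interval; tune it by testing as described above.
    private static final int BATCH_SIZE = 20;

    public static void insertAll(Connection conn, List<String[]> rows) throws SQLException {
        // Hypothetical table and columns, for illustration only.
        String sql = "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)";
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // batched statements run in explicit transactions
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();             // queue the statement on the client
                if (++pending == BATCH_SIZE) {
                    ps.executeBatch();     // send the queued statements; the batch is cleared afterward
                    conn.commit();         // keep the transaction (and its locks) small
                    pending = 0;
                }
            }
            if (pending > 0) {             // flush any remainder
                ps.executeBatch();
                conn.commit();
            }
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}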

Best regards,
Jim Cakalic
nixj14 (author) commented:
I appreciate your help. Your numbers were a lot more realistic than what I was thinking. I'll run some tests and see what kind of performance gains I get.

Thanks