Solved

Performance Limitations when using JDBC addBatch()

Posted on 2002-05-31
869 Views
Last Modified: 2007-12-19
I've yet to find an article that explains possible performance hits when using the addBatch() functionality.  I'm iterating over around a million records and I was wondering if there is a difference between me using addBatch() for all the records, or whether I should call executeBatch() at certain intervals.  Basically I'm wondering if executeBatch is going to somehow free up memory resources.
Question by: nixj14
2 Comments
 
LVL 19

Accepted Solution

by: Jim Cakalic (earned 50 total points)
ID: 7047407
You're right. There isn't much in the way of batch size recommendations. This is likely because choosing an appropriate batch size depends, I think, on a number of variables:

- size of the statement. The larger the statement and its associated context (statement text itself and bound parameter values) the greater the memory requirements per statement.

- transaction size. Since autoCommit should be disabled before constructing and executing a batch, all statements in the batch are performed in a single transaction. The larger the transaction, the greater the potential overhead in the database engine and the possibility of running out of transaction log space in the middle.

- concurrency. Directly related to transaction size. If you're performing a large number of inserts say on a single table then there will be a large number of row locks held against the table for the life of the transaction. Many database engines automatically upgrade row locks to table locks at some threshold. If this happens, then other writers to the table and possibly readers will be queued while the transaction completes.

- jdbc driver. This always has a large effect on performance.

- database vendor. Specifically with respect to transaction handling.

Ultimately, an optimal batch size needs to take these variables into consideration, as they relate to your application and its execution environment. Oracle tends to recommend batch sizes between 5 and 30 statements. In testing I've done, this seems to be a good recommendation: the batch size is large enough to provide significant performance gains over single-row inserts (even when autoCommit is disabled) but not so large as to cause the negative side effects mentioned above. This range also conforms to Oracle's recommendations for its own non-standard batching implementation, so I'm fairly comfortable with it. The performance increase isn't necessarily linear as the batch size grows, which is another indicator that application-specific testing should be done.

I would suggest that inserting all 1,000,000 rows in a single batch is likely to be a poor decision regardless of your driver/database. I'd start with batches of perhaps 10 inserts and incrementally increase the size until you are comfortable with the performance.
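A minimal sketch of the interval-batching approach described above (the table name, columns, and BATCH_SIZE are illustrative, and it assumes a java.sql.Connection obtained elsewhere; it is not a definitive implementation):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class IntervalBatchInsert {
    // Illustrative batch size; per the advice above, start small and
    // measure upward rather than batching all rows at once.
    static final int BATCH_SIZE = 20;

    // Hypothetical schema: my_table(id INT, name VARCHAR)
    static void insertAll(Connection conn, int totalRows) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // batch statements run in explicit transactions
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO my_table (id, name) VALUES (?, ?)")) {
            int pending = 0;
            for (int i = 0; i < totalRows; i++) {
                ps.setInt(1, i);
                ps.setString(2, "row-" + i);
                ps.addBatch();
                if (++pending == BATCH_SIZE) {
                    ps.executeBatch(); // flush at intervals: frees driver-side
                    conn.commit();     // statement buffers and keeps each
                    pending = 0;       // transaction (and its locks) small
                }
            }
            if (pending > 0) {         // flush the final partial batch
                ps.executeBatch();
                conn.commit();
            }
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Committing at each executeBatch() keeps the transaction log and row-lock footprint bounded, which addresses the transaction-size and concurrency concerns above; if the whole load must be atomic, move the commit() outside the loop and accept the larger transaction.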

Best regards,
Jim Cakalic
 
LVL 1

Author Comment

by: nixj14
ID: 7047559
I appreciate your help.  Your numbers were a lot more realistic than I was thinking.  I'll run some tests and see what kind of performance kicks I get.

Thanks
