How to improve performance of bulk inserts in Hibernate and Spring

Hello Experts,

I have an import that needs to import a colon delimited file into a database and then cache calculations into the database. At first the imports of the text files were extremely slow. I then found an the hibernate article on setting batch sizes.
I have set my JDBC batch size to 50 after a lot of benchmarking with this number.
<prop key="hibernate.jdbc.batch_size">50</prop>
I then only flush and clear every 2000 records, which results in a multi-second per minute increase in performance.

The performance issue I am running into still is that I have to query tables using a multiple join query to get count information and then cache that in a table for real time access. These cached values add up to millions of records and requires a query and then an insert for each record created. I've tweaked my performance to 100,000 every 5 minutes.

My question is, are there any other options to increase speed. Would splitting my process into multiple threads make it faster somehow? I really need to be about 10x faster.


//Query for this by looking for monitor/countType/component AND severity/foundIn is null
                sql = new StringBuilder();
                sql.append("from WeeklyBugChange change where change.weekDate = :date");
                sql.append(" and change.countType = :countType");
                sql.append(" and change.component = :entity");
                sql.append(" and change.bugHistory.bug IN " +
                               "(select bug from Bug bug JOIN bug.bugMonitorHistory monHist where monHist.monitor = :monitor" +
                                   " and monHist.startDate <= :date and (monHist.endDate >= :date OR monHist.endDate IS NULL))");
                int size = hibernateTemplate.findByNamedParam(sql.toString(),
                                                     new String[] {"date", "monitor",
                                                                    "countType", "entity"},
                                                     new Object[] {weekDate, monitor, countType,
                rollup = new WeeklyRollup(weekDate, monitor, countType, size);
                if(recordsCached % 2000 == 0){

Open in new window

Who is Participating?
My suspicion is that you might be spending more time in "IN" and "NULL" values in your query.

If you are using "IS NULL", then some database does not have any other option, other than to scan the entire table. If you are using Oracle, you might try something like this. Following is just an example. You have to use something similiar as per your needs.

create index
analyze index emp_null_ename_idx compute statistics;

select /*+ index(emp_null_ename_idx) */
   emp e
  nvl(ename,'null') = 'null'

I would also try to check if "IN" can be replaced with "EXISTS" operator.
Tomas Helgi JohannssonCommented:

Make sure that your tables have indexes which have the columns of your
sql where clauses.
E.g. your bugMonitorHistory table should have this index (bug,monitor, startdate, enddate).
and your WeeklyBugChange table should have the index (weekdate, entity, counttype).

Take your SQL query and do an explain on it to see if there are any tablescans which is not
what you want. Indexscan is more preferable.

   Tomas Helgi
intlgdAuthor Commented:
I was hoping to get some information on threading to boost performance, but I think these pointers are likely just as impactful. Thanks Experts!!!
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.