?
Solved

How to improve performance of bulk inserts in Hibernate and Spring

Posted on 2009-02-16
3
Medium Priority
?
3,864 Views
Last Modified: 2013-11-24
Hello Experts,

I have an import that needs to import a colon delimited file into a database and then cache calculations into the database. At first the imports of the text files were extremely slow. I then found an the hibernate article on setting batch sizes.
I have set my JDBC batch size to 50 after a lot of benchmarking with this number.
<prop key="hibernate.jdbc.batch_size">50</prop>
I then only flush and clear every 2000 records, which results in a multi-second per minute increase in performance.

The performance issue I am running into still is that I have to query tables using a multiple join query to get count information and then cache that in a table for real time access. These cached values add up to millions of records and requires a query and then an insert for each record created. I've tweaked my performance to 100,000 every 5 minutes.

My question is, are there any other options to increase speed. Would splitting my process into multiple threads make it faster somehow? I really need to be about 10x faster.

Thanks,

Jared
//Query for this by looking for monitor/countType/component AND severity/foundIn is null
                sql = new StringBuilder();
                sql.append("from WeeklyBugChange change where change.weekDate = :date");
                sql.append(" and change.countType = :countType");
                sql.append(" and change.component = :entity");
                sql.append(" and change.bugHistory.bug IN " +
                               "(select bug from Bug bug JOIN bug.bugMonitorHistory monHist where monHist.monitor = :monitor" +
                                   " and monHist.startDate <= :date and (monHist.endDate >= :date OR monHist.endDate IS NULL))");
 
                int size = hibernateTemplate.findByNamedParam(sql.toString(),
                                                     new String[] {"date", "monitor",
                                                                    "countType", "entity"},
                                                     new Object[] {weekDate, monitor, countType,
                                                                    entity}).size();
 
 
                rollup = new WeeklyRollup(weekDate, monitor, countType, size);
                rollup.setComponent(entity);
                getHibernateTemplate().save(rollup);
                recordsCached++;
                if(recordsCached % 2000 == 0){
                    getHibernateTemplate().flush();
                    getHibernateTemplate().clear();
                }

Open in new window

0
Comment
Question by:intlgd
3 Comments
 
LVL 10

Accepted Solution

by:
rajesh_bala earned 750 total points
ID: 23655195
My suspicion is that you might be spending more time in "IN" and "NULL" values in your query.

If you are using "IS NULL", then some database does not have any other option, other than to scan the entire table. If you are using Oracle, you might try something like this. Following is just an example. You have to use something similiar as per your needs.

create index
    emp_null_ename_idx
on
   emp
   (nvl(ename,'null'));
 
analyze index emp_null_ename_idx compute statistics;

select /*+ index(emp_null_ename_idx) */
   ename
from
   emp e
where
  nvl(ename,'null') = 'null'

I would also try to check if "IN" can be replaced with "EXISTS" operator.
0
 
LVL 26

Assisted Solution

by:Tomas Helgi Johannsson
Tomas Helgi Johannsson earned 750 total points
ID: 23657625
     Hi!

Make sure that your tables have indexes which have the columns of your
sql where clauses.
E.g. your bugMonitorHistory table should have this index (bug,monitor, startdate, enddate).
and your WeeklyBugChange table should have the index (weekdate, entity, counttype).

Take your SQL query and do an explain on it to see if there are any tablescans which is not
what you want. Indexscan is more preferable.

Regards,
   Tomas Helgi
0
 

Author Closing Comment

by:intlgd
ID: 31547851
I was hoping to get some information on threading to boost performance, but I think these pointers are likely just as impactful. Thanks Experts!!!
0

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I had a project requirement for a displaying a user workbench .This workbench would consist multiple data grids .In each grid the user will be able to see a large number of data. These data grids should allow the user to 1. Sort 2. Export the …
Introduction This article is the second of three articles that explain why and how the Experts Exchange QA Team does test automation for our web site. This article covers the basic installation and configuration of the test automation tools used by…
The viewer will learn how to implement Singleton Design Pattern in Java.
This tutorial covers a practical example of lazy loading technique and early loading technique in a Singleton Design Pattern.

830 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question