asked on

Handling large records in a Java EE application

Hi,

I would like to know which technique and technology one should use in order to retrieve a millions of record from the database table, whether pagination is the better approach or any other approach can help me ?

Thank you

Am P

I would suggest that pagination is a better approach.

Think about it, you are retrieving millions of record and you are converting it into a java object (or say ResultSet), how much memory will it need to store the entire content ? Somewhere, i done know when, it will definitely end-up with OutOfMemoryError.

But, lets wait for some time. I am sure the other experts will have better solution than above.

roy_sanu

ASKER

Thanks Amit for good info, I will waits other expert comments is there any way i can think up on
spring jdbctemplate which hold this object efficently or on database side i.e stored procedure or cursor may help me on that. Let us know other experts....

MogalManic

It all depends on how you envision the users will view or access the data. If they are only going to be viewing a small subset of the records, then pagination would be the best solution.

If the user is retrieving the whole record set and viewing/accessing the results (for example the record set constitutes a file), then some other sort of pattern. For example you could use a datareader and buffer the results to the file (or Response)

Gary Patterson, CISSP

You might get more answers if you explained your requirement in more detail. There are a variety of ways to process large data sets, and the best technique often depends on what you plan to do with the data.

dpearson

The main thing I'd suggest is to not do:

a) Read all data from the database into memory
b) Process the data

The simplest alternative is to read one record at a time, process it and then proceed to the next record.

If you're using JDBC, you can use the db-utils library from Apache commons:
http://commons.apache.org/proper/commons-dbutils/examples.html

which shows how to implement a "ResultSetHandler" which is called for each row in the ResultSet.

You can implement your processing there and avoid sending the entire result set back to the client app as a single chunk of data which you then process.

Another option is to do the processing inside the database itself (with a stored procedure). It really depends on what the "processing" is that you need to do.

Hope that helps,

Doug

roy_sanu

ASKER

What I had is the customer and the product table which has I million records transaction as a master detail relation , what I need to display on the ui layer to display 100 records on each page.

mrcoffee365

You didn't get more answers because the first response was correct. It's a waste of processing time and space to get a million records for each page request. The most common method is to get 100 records at a time with the sql query.

MogalManic

So yes, the 1st response is correct. You should use a datareader to grab the 100 records based on the user's current page.

The algorithm is something like this:
1. (optional) - Run a "count" query to get the total # of records (required if you want to compute the number of pages)
2. Run the query and return the results in a ResultSet:

ResultSet rs = st.executeQuery("SELECT * FROM mytable");

Open in new window

3. Skip the required number of pages by calling rs.next() until you get to the required row
4. Process the next 100 rows
5. Close the ResultSet object

The above algorithm will work with most databases. For a database specific way of doing the same thing, you can push the logic into the query itself. For example in MySQL you can do something like this:

SELECT * FROM tbl LIMIT 5,10;  # Retrieve rows 6-15

Open in new window

which means starting at row 6, retrieve the next 10 rows.

roy_sanu

ASKER

Thanks for the info, what your thought on using stored procedure and calling using callable statement,
will that be faster retrieval, well i will be using jsp as a front end...

ASKER CERTIFIED SOLUTION

MogalManic

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial