• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 715
  • Last Modified:

Java Design Question: Dealing with very large database table efficiently

Hello.

I need to implment a new application that reads and publishes the contents of 1 very large database table on start up. The application will be a standalone app, and when it starts, it must query a database, and publish 20 million rows to a downstream application. The application I am writing is not required to cache the rows, it will just take each database record, convert it to a new format, and then publish it to the network. It needs to get through the 20 million rows as quick as possible.

I would be particularly interested in any open source solutions that may exist.

Thanks in advance,

Ronan
0
RonanL
Asked:
RonanL
1 Solution
 
qasitouchCommented:
NO, you don't need to send 20 milion records at once.

Use the idea of pagination and fetch fewer records in one requst like 100, 50 or 10.


For quering, u can use sql-query and  fetch the only limited data from server.

sample query:

SELECT * FROM `student` WHERE mark BETWEEN 60 and 75
0
 
RonanLAuthor Commented:
Thanks for the reply. I think the idea of fetching a smaller number of records at a time went without saying. What I'm looking for is an elegant, preferably multi threaded, solution, that will help me get through the 20 million rows as quickly as possible. If I can't get something off the shelf, I'll code it myself.
0
 
rbrindlCommented:
You can use the jdbc fetch size feature for getting the data. This effectively bulks the data transfer from the database.
In your java code you then always read chunks of this bulksize in a separate SQL reader thread.
This creates simple objects from the ResultSet and gives back a List to the main thread, which distributes the list to worker threads that produce the output.
The SQL reader threads fetches the next bulk of data as soon as it gave the current one back to the main thread (it could read more, depending on the complexity of producing the output).
This way, you should never have to wait for the database very long (again this depends on the complexity of the output), at least you can fetch data concurrently to producing output, and you can control memory consumption with the prefetch (bulk) size.
Please tell me, if you need a short sample pseudo code.
0
What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

 
RonanLAuthor Commented:
Hey rbrindl.

What you describe sounds like it might be well suited to what we are trying to achieve. If you could provide anything more to help get me started that would be great. A link to a good example, or even some documentation I could read would do the trick!

Thanks,

Ronan
0
 
rbrindlCommented:
I dont know if there are examples for this use case. Basically all this is standard java and documented by Sun. If you give me some time, i can strip down a testcase i once wrote, which basically does what you need.
0
 
RonanLAuthor Commented:
I've done some digging around on the sun site, and I just can't see the api's you're referring to, and how they can be used in the manner you describe. I don't want you to go to too much effort, but if you can point me in the right direction, that would be great.

Thanks,

Ronan
0
 
rbrindlCommented:
I will get back to you this evening (Central European Time)
0
 
RonanLAuthor Commented:
Great! Thanks rbrindl
0
 
sarthy2008Commented:
Since you no need to cache the data and it is a stand alone application and you want to do it in the start-up of the application, following logic may help you to make quick solution.
Based on your scenario, you need to do the exception handling

1) Execute the SQL statement
2) While ResultSet has more records
   do
      2a) Read the record
      2b) Convert to the new format
     2c) Write the data to the network  
   end do

3) Close the resultset
4) close the statement
5) Close the connection
0
 
rbrindlCommented:
Sorry i did not get back, i was quite busy! but as i see, you accepted the answer, do you need further assistence?
0
 
RonanLAuthor Commented:
No problem rbrindl. I don't need any further assistance at the moment.

Thanks for you help.
0

Featured Post

What does it mean to be "Always On"?

Is your cloud always on? With an Always On cloud you won't have to worry about downtime for maintenance or software application code updates, ensuring that your bottom line isn't affected.

Tackle projects and never again get stuck behind a technical roadblock.
Join Now