Improving the performance of a simple JDBC project

Hello,

I have a simple JDBC class that happens to query a rather large table (~ 1 million rows) and has a considerable number of columns (~ 300+).

I'm trying to see if it is possible to improve the performance of this code, by either:
Pooling the connection somehow so we can have like 10 or so queries running against the same table?
Removing the looping necessary to get all the columns from the ResultSet object; I noticed a considerable improvement in runtime when I manually unrolled this (over 50% improvement)

I'm hoping there's a dynamic way to find the number of columns once, and use something like:

     rs.getColumn(1) + "|" + rs.getColumn(2) + "|" + ... + rs.getColumn(x)

The most "straightforward and simple" way I can think of doing its is having a Python script generate and execute a dynamically generated Java file. Now, that doesn't sound very straightforward to me... so I'm hoping there's something new for me to learn in Java.

Appreciate any help on this!
Errang GenevreAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

CEHJCommented:
I'm hoping there's a dynamic way to find the number of columns once
There is - from  ResultSetMetaData.getColumnCount. But why should the number of columns be subject to change?

The best way to optimize queries in general is proper indexing, but we need to know more about how you're consuming and using the data
0
Errang GenevreAuthor Commented:
There is - from  ResultSetMetaData.getColumnCount. But why should the number of columns be subject to change?

No, that's not what I meant... I know how to get the column count.

What I want to do, is avoid this loop...

String temp = "";
for (int i = 0; i < columnCount; i++) {
    if (temp.length() == 0) {
        temp = rs.getColumn(i);
    } else {
        temp = temp + "|" + rs.getColumn(i);
    }

Open in new window


for every row.

Is there a macro of some sort I can use in Java so that I only need to get the expression once?
0
dpearsonCommented:
Removing the looping necessary to get all the columns from the ResultSet object; I noticed a considerable improvement in runtime when I manually unrolled this (over 50% improvement)


This seems very unlikely to me.  I would wonder if what you did was:

a) Ran the original code
b) Modified the code to unroll the loop
c) Ran the modified code and noticed it was faster

But really what happened was that between (a) and (c) was that the database cached the query, so it responded much faster than on the first call.

Why would that seem more likely?

Because the time it takes to call to a database is hugely (many orders of magnitude) longer than it should take to run through a loop to get the data from the response.  Think of what is happening when you make that call to the database?  First, you need to get a connection to the database, then send the query over that, then the database parses it, analyzes the query, scans the table, collects up the data from disk, encodes that data for transmission, sends it over the socket where it is decoded by the client (the Java) app.

Then you run a simple for loop to extract it.

The for loop shouldn't be a factor in the timing.

Your first idea (using multiple connections to each get a piece of the data) might help and is possible.  You'd just break up the query into sections.  It might also make it run slower overall, but it would certainly be doing different work and might boost things.

Unrolling the loop should make no measurable difference.

Just my opinion :)

Doug
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

mccarlIT Business Systems Analyst / Software DeveloperCommented:
Unrolling the loop should make no measurable difference.

Just my opinion :)

Not *just* yours Doug, that would be my opinion too!!

Also, what's the nature of your SQL, are you pulling back all those 1M rows and 300 cols of each? Or some smaller subset of rows/cols?
0
Errang GenevreAuthor Commented:
Hm... Now that you mention it, I did run the original query first; but I didn't let it complete; cancelled it after 30 minutes.

Then I tried unrolling the loop which sped it up.

Then, I removed the loop unrolling optimization and ran the query; that's how I measured the time.

Pulling all record and all columns.
0
CEHJCommented:
What I want to do, is avoid this loop...

That part of your code with string concatenation is odd - what are you doing there and why?

Using a cursor (which is essentially and generally what you're doing when you're iterating a ResultSet) is certainly not without overhead. The only way to avoid overhead would be if results were cached in memory.

Having said that, there is no reason you can't refer to the columns by name, but you didn't address my question:
But why should the number of columns be subject to change?

Without seeing more of your code it's not easy to advise
0
Errang GenevreAuthor Commented:
But why should the number of columns be subject to change?

The number of columns doesn't change.

That loop is to concatenate all the columns into one string... Please tell me there is a better way.
0
Errang GenevreAuthor Commented:
But I will be running lots of queries, so I don't want to hard code anything.

The number of columns obviously changes from query to query.
0
CEHJCommented:
That loop is to concatenate all the columns into one string... Please tell me there is a better way.
There probably is but since it seems difficult for you to give details about what you're doing, i'm not going to speculate. Nor am i going to comment further unless you give much more information.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Errang GenevreAuthor Commented:
Yea, unfortunately for this question all I can say is that there will be about a million rows and 300 columns; all text/varchar fields.
0
Errang GenevreAuthor Commented:
But I do understand it's frustrating working without all the information; but I'd get fired if I give more information than in allowed to... Lol.

Thanks for the help so far.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Java

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.