asked on

Java-JDBC Statement Vs PreparedStatement

Hi,
I'm using Java JDBC Statement to do select/insert/update data.
All my stmt.executeQuery()/stmt.executeUpdate() are executed
only once. (not in a loop) I have around 2000 of these Statements
in my application.Its a web based project and deployed into IBM WebSphere.

From the readbility point of view I hate my half page SQL string statement
and also lot of String concatenation or stringbuffer.append().
I understand that using PreparedStatement will decrease performance
since all my SQL statements are executed only once. But it gives the
elegant look and no string/stringbuffer operations. ok

Can I solve the issue of readability and avoiding String concatenation by
using PreparedStatement but without loosing the performance.
Is there anywhere (jdbc or driver or database) I can set something (some flag or variable)
so that I can use preparedstatement (thus good readbility, no more string operations)
and still maintaining same performance as Statement ???

Thanks in advance.

ASKER CERTIFIED SOLUTION

Mick Barry

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

SOLUTION

CEHJ

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Mick Barry

i would not expect any significant decrease in performance in your case.

CEHJ

Certainly i'd do some testing. It may not be worth losing the PS

SOLUTION

rajesh_bala

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

SOLUTION

grim_toaster

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

SOLUTION

Mayank S

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Mirinda

read this study
http://www.onjava.com/pub/a/onjava/excerpt/oraclejdbc_19/index.html

grim_toaster

Mirinda, there are several flaws with that link. Firstly, it is dated, the book from which it is taken from was written in 2001, and does not specify which JDK is being used, but there is a hint that Oracle 8.1.6 is used as the database, which is very dated (no offence if anyone is still using it! ;) ).

But the most fundamental problem with their benchmarks, is that they state they are testing certain functionality, so they change a flag (for example stmt.setEscapeProcessing(false)), and then proceed to change other parts of the method. Here's their downloaded code snippets:

Escape processing enabled:

Statement stmt = conn.createStatement();
SimpleDateFormat sdf = new SimpleDateFormat( "yyyy-MM-dd HH:mm:ss" );
conn.setAutoCommit(false);
startTime = System.currentTimeMillis();
for (i=1;i<1001;i++) {
stmt.executeUpdate("insert into testxxxperf ( id, code, descr, insert_user, insert_date ) " +
"values ( " + Integer.toString( i ) + ", '123456789012345678901234567890', " +
"'12345678901234567890123456789012345678901234567890123456789012345678901234567890', " +
"USER, {ts '" + sdf.format(new java.util.Date(System.currentTimeMillis())) + "'} )");
}
conn.commit();

Escape processing disabled:

Statement stmt = conn.createStatement();
stmt.setEscapeProcessing(false);
SimpleDateFormat sdf = new SimpleDateFormat( "yyyyMMddHHmmss" );
conn.setAutoCommit(false);
startTime = System.currentTimeMillis();
for (i=1;i<1001;i++) {
stmt.executeUpdate("insert into testxxxperf ( id, code, descr, insert_user, insert_date ) " +
"values ( " + Integer.toString( i ) + ", '123456789012345678901234567890', " +
"'12345678901234567890123456789012345678901234567890123456789012345678901234567890', " +
"USER, to_date('" + sdf.format(new java.util.Date(System.currentTimeMillis())) + "', 'YYYYMMDDHH24MISS'))");
}
conn.commit();

Note how the date format is different. I've run these same tests and changing the test makes the version the author is trying to recommend appear better than it is! This is no true benchmark!

One other major fault is that the database schemas change between each script, one test that just calls testxxxperf, and another uses oehr.testxxxperf, quite blatently a different schema! Who knows how differently that has been set up? Or how much data already exists in the table?

If you run all of the tests, using the correct set up, and the same schema, and using JDK1.4.2_04 and Oracle 9i (I admit that I cannot test this on older versions), then what the author has stated as fact is completely wrong! The performance would have undoubtedly improved with each release since then though!

If it wasn't such an old entry I'd kick up a fuss!

rameshaa

ASKER

Hi ' rajesh_bala':
Its not a batch process. Depends upon the user operation 1 or more queries are executed.
So its not batch of 2000 queries executed at the same time.
Whenever user does some operation I'll get connection from pool, create statement,
do execute(), close statement, release connection.

Read this, http://www.ideas2work.com/jdbc1.html
Time taken to create Statement, PreparedStatement are quite different.

Thanks for the input.

CEHJ

>>But the most fundamental problem with their benchmarks...

Well done grim! Would be interesting to get some stats on your particular sql rameshaa

Tommy Braas

Here are my $0.02.

If a query/update statement is to be executed rarely, a regular Statement will be fine. If a query/update statement is to be used more than once, a PreparedStatement is to be preferred over a Statement as a PreparedStatement is compiled and cached on the db server. The actual SQL code is compiled and chached (separate from the instance of your PreparedStatement).

The first execution of a particular PreparedStatement statement is a little bit slower than using a Statement, as it gets compiled and cached, but all subsequent calls are faster than using a Statement. Keep in mind that the caching is done on the server. Even when you create a new PreparedStatement with the same query/update statement it will run faster than with Statement.

If you're really looking to get a performance boost, why not use callable statements? You create all your queries/updates on the server, and just call them with the appropriate parameters. Main advantage to callable statements is that when doing performance tuning (even after release), you won't have to change your Java code.

Take the example from the link you provided and try to eceute two Statements, and then two PreparedStatements (with the same query) and then compare the times.

grim_toaster

--> Read this, http://www.ideas2work.com/jdbc1.html
Again, several flaws with the "benchmarking". Each query is only executed once, and taken as the result. However, things can quite easily change, so it is an absolute must to execute the queries a number of times and then take the average. For example, other applications running on the machine can take more resources, or the database may have a backup job running - we really cannot know.

When dealing with results of 20ms or 10ms, if you ran the exact same queries again, this may swap around, or change drastically. Again, take an average of several runs. Also, how many of your queries are "select * from <table>", most queries that are used in applications require some sort of criteria in the where clause to prevent massive amounts of data being returned to the client, in which case the PreparedStatement offers several advantages. As stated in my previous post, the PreparedStatement will use bind variables, and the Statement does not. Provided you are using Oracle, here's the proof:

1) Start off with a clean database cache:
ALTER SYSTEM FLUSH SHARED_POOL;

2) Get all of the parsed SQL, and note that the number should be pretty low on a dedicated server, without any applications running (you may need to be given access to the v$ tables from your dba).
SELECT * FROM v$sqlarea

3) Run the following statement query from your java code:
for (int i = 0; i < 1000; i++) {
stmt = con.createStatement();
stmt.executeQuery("SELECT * FROM emp WHERE empno = " + i);
}

4) Now re-query the parsed SQL
SELECT * FROM v$sqlarea
And you will notice that there are several (1000) parsed queries like:
SELECT * FROM emp WHERE empno = 1
SELECT * FROM emp WHERE empno = 2
...

5) Now reset the database:
ALTER SYSTEM FLUSH SHARED_POOL;

6) And run the same query, but using a PreparedStatement object:
for (int i = 0; i < 1000; i++) {
stmt = con.prepareStatement("SELECT * FROM emp WHERE empno = ?");
stmt.setInt(1, i);
stmt.executeQuery();
}

7) Now re-query the parsed SQL
SELECT * FROM v$sqlarea
And you will see this time that there is only a single entry for our run queries
SELECT * FROM emp WHERE empno = :1

What this means is the database has a single piece of SQL already parsed that will be used, instead of filling up the shared pool with SQL that will probably only ever be called once. This will significantly reduce the resources required on the database side (a very good thing!).

Oh, and as a final note, on the link they state:
--> So it is always advisable to initialize database connection and prepared statement in some initialization code
I would totally disagree. There is never a "you should always do this" when it comes to these sorts of situations (or any for that matter). A database connection is a scarce resource, and if you get a connection and initialise a prepared statement, that code must hold onto that Connection until the end (the de-initialisation section), preventing anything else using it, and if you have many users running that same initialisation of a larger body of code, you could get serious out-of resource errors!

CEHJ

Since this seems to be turning into something of a workshop rameshaa, why not post your code? There are plenty of other ways of optimization too, even if it's something as trivial as not using SELECT *

rameshaa

ASKER

OK,
The code is nothing but regular JDBC stuff. Mostly my queries are atleast half page. (where clause is big)
anyway,

static final String DRIVER_NAME = "COM.ibm.db2.jdbc.app.DB2Driver";
      static final String URL = "jdbc:db2:MyDB";
      static final String USER_NAME = "x";
      static final String USER_PWD = "x";

      public static void execStatement() {
            Connection con = null;Statement st = null;ResultSet rs = null;
            try {
                  Class.forName(DRIVER_NAME);
                  con = DriverManager.getConnection(URL, USER_NAME, USER_PWD);
                  long time1 = System.currentTimeMillis();
                  st = con.createStatement();
                  rs = st.executeQuery("select * from employe_tab where emp_id='1323' " +
"and last_name='karry' and sal > 8000 and comp_id=222 " +
"and dept='A' and cat='mgr' ");
                  long time2 = System.currentTimeMillis() - time1;
                  System.out.println("Time taken by Statement query : "+time2);

                  } catch(Exception ex){
                        ex.printStackTrace();
                  } finally {
                        try { rs.close(); } catch (Exception e) { rs = null; }
                        try { st.close(); } catch (Exception e) { st = null; }
                        try { con.close(); } catch (Exception e) { con = null; }
                  }
      }

Similary I have test code using preparedstatement.
I tested several times (30+ times) and 99% statements executed faster than prepared statement.
The difference is 1 ms to 100 ms between stat and preparedstat.
one more time, as I said, I don't have the same statement executing in a loop.
For each user operation the queries were executed mostly once. So I don't reuse the same statement/preparedstatement.
I agree that If I do it loop the preparedstatement is faster.
So for executing the query once, the statement seems to be better in performance.
I still didn't tested the callablestatement.
for above select query what/how do I write the stored procedure ?

Thanks.

rameshaa

ASKER

Hi
I did the bench mark using Statement, PreparedStatement,CallableStatement.
If the query is simple select with 2 or 3 where clause the Statement is the best choice !
If the query consists of more than 10 where clauses, group by, order by the Callable works really good.
But the preparedstatement mostly doing bad than statement !!! But the same PreparedStatement
works really awesome If I call inside the loop. (I mean using the same preparedstatement instance)

I ran the test code with varying where clause to avoid database caching.

Also I changed the driver from Type2 (app) to Type 4 (jcc) and I got the same result as above.
The performance factor for type 2 was better than type 4.
But I heard that type 4 (puer java)is generally better than type 2 since it avoid JNI layer, native client.

anyway, thats all.

Thanks

CEHJ

In that case, i'd refer you to my very first comment - that should make coding a little easier

grim_toaster

--> one more time, as I said, I don't have the same statement executing in a loop.
I understand this, the loop only represents multple requests for the same query. Think of it as 1000 users performing the query at the same time. Also, it allows for the java optimizer to kick in and optimize the java code as much as possible so that the tests are fairer.

Sorry about the Oracle info, didn't help you at all. I presumed from the link you provided that you were also using Oracle (each database would implement how it uses statements/prepared statements differently, so the best thing to do - which you have - is to benchmark yourself).

-->In that case, i'd refer you to my very first comment - that should make coding a little easier
Using the MessageFormat class would over complicate things a bit too much! Plus you would have to handle Dates in a database specific manner (although that is probably already being done with the current method).

-->But I heard that type 4 (puer java)is generally better than type 2 since it avoid JNI layer, native client.
There's no rule that says one is better than the other in all situations, the major drawback to a type 2 driver is that you need the database client installed. Which is not always possible (or wanted). If the benchmarks show a significant improvement in using the type 2 driver, then use it. You can always change it later (should be configurable!).

CEHJ

>>Using the MessageFormat class would over complicate things a bit too much!

Why so? - there's very little more to add and will be quite a bit easier than concatenating Strings or using StringBuffers. Of course, it won't have the Date formatting capabilities of PS, nor will it escape Strings automatically, but that's hardly the point