Link to home
Start Free TrialLog in
Avatar of gublooo
gubloooFlag for India

asked on

Connection pool error - cannot get connection

Dear Experts,

I run an ecommerce website and I'm doing load testing on the site to see what load it can handle. When I run the case 1000 users in 1 hour, the site breaks down and I start getting the error messages below in the log files.

What settings do I need to change to take care of this. This is the current setup for my DB: I'm using tomcats DBCP connection pool

Thanks

<Context reloadable="true">
  <Resource name="jdbc/discountweb" auth="Container" type="javax.sql.DataSource" maxActive="100" maxIdle="16"
    maxWait="1000" username="xxxxx" password="xxxxx" driverClassName="com.mysql.jdbc.Driver"
    url="jdbc:mysql://localhost/xxxx?autoreconnect=true" />
</Context>



*****************************************************************************************
This is the error message:

ERROR 2009-01-15 15:35:56,201 [QuartzScheduler_DS-govasool-labs.com1231967805483_ClusterManager] - ClusterManager: Error managing cluster: Failed to obtain DB connection from data source 'springNonTxDataSource.DS': org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
org.quartz.JobPersistenceException: Failed to obtain DB connection from data source 'springNonTxDataSource.DS': org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object [See nested exception: org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object]
    at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:1454)
    at org.quartz.impl.jdbcjobstore.JobStoreCMT.doCheckin(JobStoreCMT.java:1374)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.manage(JobStoreSupport.java:2378)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.run(JobStoreSupport.java:2409)
* Nested Exception (Underlying Cause) ---------------
org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
    at org.apache.tomcat.dbcp.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:104)
    at org.apache.tomcat.dbcp.dbcp.BasicDataSource.getConnection(BasicDataSource.java:880)
    at org.springframework.jdbc.datasource.UserCredentialsDataSourceAdapter.doGetConnection(UserCredentialsDataSourceAdapter.java:161)
    at org.springframework.jdbc.datasource.IsolationLevelDataSourceAdapter.doGetConnection(IsolationLevelDataSourceAdapter.java:127)
    at org.springframework.jdbc.datasource.UserCredentialsDataSourceAdapter.getConnection(UserCredentialsDataSourceAdapter.java:132)
    at org.springframework.scheduling.quartz.LocalDataSourceJobStore$2.getConnection(LocalDataSourceJobStore.java:125)
    at org.quartz.utils.DBConnectionManager.getConnection(DBConnectionManager.java:111)
    at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:1431)
    at org.quartz.impl.jdbcjobstore.JobStoreCMT.doCheckin(JobStoreCMT.java:1374)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.manage(JobStoreSupport.java:2378)
    at org.quartz.impl.jdbcjobstore.JobStoreSupport$ClusterManager.run(JobStoreSupport.java:2409)
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
    at org.apache.tomcat.dbcp.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:958)
    at org.apache.tomcat.dbcp.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:96)
    ... 10 more
Avatar of crossdev
crossdev
Flag of United States of America image

you need to make sure that the max connections in MySQL match the availble in your pool connection.  The default is 100.  Do you have any other pools hitting the same database?  If you have 2 servers each with 100 pool connections and a database that can only service 100 connections you'll have this problem.

Here is a link on tunning http://www-css.fnal.gov/dsg/external/freeware/mysqlTuning.html
Avatar of gublooo

ASKER

Thanks crossdev - I will look into that link.
Meanwhile I read in one of the forums to set MaxActive to 0 and MaxIdle to -1

Not sure how this would help.
Any idea?
Thanks
Maybe there are connections left unclosed (that is, not returned to the connection pool).

I'd advise you to investigate your code against such situations, especially exception cases..

I added two sample methods that i use to prevent these situations

	private void closeConn(Connection conn, CallableStatement cstmt) {
		if (cstmt != null) {
			try {
				cstmt.close();
			} catch (Exception e) {
				e.printStackTrace();
			}
		}
		if (conn != null) {
//			rollback() or commit() if no auto commit
			try {
				conn.close();
			} catch (Exception e) {
				e.printStackTrace();
			}
		}
	}
 
	public void testConn() throws Exception {
		Connection conn;
		CallableStatement cstmt;
		try {
			conn = getConnection();
			cstmt = conn.prepareCall(sql);
			...
			...
			closeConn(conn, cstmt);
		} catch (Exception e) {
			closeConn(conn, cstmt);
			throw e; // or e.printStackTrace();
			
		}
	}

Open in new window

Make sure that no one is storing the connections in a cache or static variable.  I've seen this.  Its ugly and defeats the pooling.  All connections should be obtained, used and returned to the pool in the same method.

To anwser your followin question.  The idea of pooling is that in part you have the ability to tune the pool.  Rather than open the pool up to unlimited connections you would be better served to understand the application and its needs and tune it appropiately.
Avatar of gublooo

ASKER

Thanks ali/crossdev

I am pretty sure that all connections are being released back to the pool. But are there any tools that I can use to track that say a user does a search operation on the site - basically be able to track that a connection has been taken from the pool - a search has been performed - and the connection being released back to the pool - some kind of a live monitoring system that we can run and play on the website and see - after which operations if any are there connections left open with out being released back.

Crossdev - These are the settings in mysql
max_connections = 250
wait_timeout = 172800
interactive_timeout = 172800

And these are my jdbc settings:
maxActive="200" maxIdle="20" maxWait="3000"

The load testing works fine when I test 100 users in 1 second all the way to 400 users in 200 secs.
But when I start testing 1000 users in 3600 seconds - I start getting the error message I posted in my question.

So what settings should I be changing to accommodate this test case.
Thanks
Are you monitoring the connections on the DB during the performance testing.   THat will tell you alot.
Avatar of gublooo

ASKER

No I'm not - is there any tool I can use to monitor this? Basically I have outsourced testing and I was just provided with the results. The team is using JMeter to perform load testing and they are testing for various cases like 100 users in 1 second all the way to 1000 users in 3600 seconds.

But I dont think they are monitoring the database?
I wrote one, cause I could not find a good one.  However, Its property of the company I work for.

You can get the value by doing a show status.  http://dev.mysql.com/doc/refman/5.1/en/show-status.html

How many app servers do you have hitting the database?  If its 1:1 then the number of connections are ok.  If this is the case you may have connections closing in the pool.  If this is the case then you'll want to set the testOnReturn to true so that connections which get closed are reopenned.
See : http://commons.apache.org/dbcp/configuration.html

Avatar of gublooo

ASKER

THanks crossdev,

Currently I have only 1 app server hitting 1 database - so we are good there.
About your suggestion of connections closing in the pool - why/when does that happen. If that was happening - wont it impact other test cases as well - like when I ran 100 users in 1 sec or 400 users in 200 secs - I did not get this error - it only starts happening for 1000 users in 3600 seconds.

But I will try that testOnReturn value and get back to you. It'll take me a couple of days to inform the company to make this change and rerun the test cases - but I'll definitely be back in 2 days :)

Thanks

ASKER CERTIFIED SOLUTION
Avatar of crossdev
crossdev
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial