aibdev

Websphere connection pool settings

Hi,
We are currently having stale connection issues in our production environment. We use the Wiley monitoring tool and can see stalls building up on 1 JVM (of 4 in total). Then, after 8 minutes and 42 or 43 seconds, the following is written to the logs:
[9/18/09 11:43:23:716 IST] 00000626 ConnectionEve A   J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adapter for resource jdbc/CMSL.  The exception which was received is com.ibm.websphere.ce.cm.StaleConnectionException: Io exception: Connection timed out:java.sql.SQLException: Io exception: Connection timed out


The connection seems to have become stale. Is that what this error means? Also, what does the "Connection timed out" IO exception indicate? Our purge policy is set to entire pool, so I'm guessing that when the app encountered the stale connection, WebSphere tried to purge the entire connection pool but timed out waiting for all the connections to finish so it could reset them. Is that correct?
Our settings are:
max conns = 100
min conns =10
purge = entire pool
pretest connection = false

We have pretest connection turned off. From what I have read, I seem to have a few options:
A) Set the minimum connections to 0 so that no connections are left open and hence none can become stale.
B) Set the purge policy to failing connection only, so that only the stale connection is reset rather than the entire pool. Would this stop the stalled requests building up?
C) Enable pretest connection so that WebSphere tests the connection before handing it to the app, and therefore a stale connection wouldn't be handed out. My concern with this setting is that if WebSphere checks the connection before handing it to the app and finds it to be stale, we are now back to the same problem, just at WebSphere's level. It has to reset the entire connection pool, so is there a possibility of stalls building up while it purges the entire pool?
Or does WebSphere check the connection every x seconds and keep it open?

Could someone recommend or advise which setting would be the best one to go with?
Has anyone had the same issue and how did you resolve it?
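
To make the difference between the two purge policies in option B concrete, here is a minimal sketch in Python. It is a toy model, not WebSphere's implementation: `ToyPool`, `on_stale` and `purged_after_one_stale` are hypothetical names, and the policy strings simply mirror the admin console labels. The pool size of 10 is taken from the min conns setting above.

```python
class ToyPool:
    """Toy connection pool (hypothetical); only models what gets discarded."""

    def __init__(self, size, purge_policy):
        # purge_policy: "EntirePool" or "FailingConnectionOnly"
        self.purge_policy = purge_policy
        self.free = [f"conn-{i}" for i in range(size)]
        self.purged = []

    def get(self):
        """Hand one pooled connection to the application."""
        return self.free.pop()

    def on_stale(self, conn):
        """The application reported that `conn` is stale."""
        self.purged.append(conn)
        if self.purge_policy == "EntirePool":
            # Every other pooled connection is discarded as well.
            self.purged.extend(self.free)
            self.free.clear()


def purged_after_one_stale(policy, size=10):
    """Count the connections discarded after a single stale connection."""
    pool = ToyPool(size, policy)
    conn = pool.get()
    pool.on_stale(conn)
    return len(pool.purged)


print(purged_after_one_stale("EntirePool"))             # 10
print(purged_after_one_stale("FailingConnectionOnly"))  # 1
```

In this model one stale connection discards all 10 pooled connections under entire pool and only 1 under failing connection only. How long the replacements take to open in practice depends on the database and the network.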

Also, can anyone explain the 8 minutes 42 seconds between the initial stalled thread report and the stale connection error appearing in the logs? The interval between these events is always 8 minutes 42 seconds.
Wiley report:
Error at 11:34:41.515 (18 Sep 2009)
Error Message: Stalled Transaction
Log file:
[9/18/09 11:43:23:716 IST] 00000626 ConnectionEve A   J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adapter for resource jdbc/CMSL.  The exception which was received is com.ibm.websphere.ce.cm.StaleConnectionException: Io exception: Connection timed out:java.sql.SQLException: Io exception: Connection timed out

The time difference is 8 minutes 42 seconds.
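
The interval can be checked directly from the two timestamps. A small Python sketch, with the timestamps re-typed from the Wiley report and the log line above (the variable names are just for illustration):

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps re-typed from the Wiley report and the J2CA0056I log line.
stall_reported = datetime.strptime("2009-09-18 11:34:41.515", FMT)
stale_logged = datetime.strptime("2009-09-18 11:43:23.716", FMT)

gap = stale_logged - stall_reported
minutes, seconds = divmod(gap.seconds, 60)
millis = gap.microseconds // 1000
print(f"{minutes} min {seconds}.{millis:03d} s")  # 8 min 42.201 s
```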
ASKER CERTIFIED SOLUTION
HonorGod
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
aibdev

ASKER

Thanks for your advice.

Would you recommend any of those options or is it just a matter of seeing which one works best for us?

If we set the minimum connections to 0, should we expect a performance overhead because the app has to open a new connection every time?

"Q: Option "C" ... Is there a possibility of stalls building up while it [the AppServer] purges the entire pool?
A: It is possible, but it depends upon the number of connections that need to be "cleaned up", and the time required to do so.  So, this means that it is very dependent upon the speed of your system, how busy it is, what else is going on, etc."

Regarding "It is possible, but it depends upon the number of connections that need to be "cleaned up"":
Doesn't the number of connections needing to be cleaned up depend on the purge policy? If WebSphere finds a stale connection, it cleans the pool based on the purge policy, which is entire pool, so it would clean every connection, not just the ones that need it?
SOLUTION
aibdev

ASKER

On our test environment I enabled pretesting of existing and new pooled connections and restarted the environment. I left it overnight, came in the next morning and requested a URL.

Usually I would see the stale connection error:
The exception which was received is com.ibm.websphere.ce.cm.StaleConnectionException: Io exception: Connection timed out:java.sql.SQLException: Io exception: Connection timed out

But this morning I didn't. Instead I saw:
[9/23/09 8:56:36:821 IST] 00000030 ServiceLogger I com.ibm.ws.ffdc.IncidentStreamImpl initialize FFDC0009I: FFDC opened incident stream file /opt/WebSphere6-1/AppServer/profiles/AppSrv01/logs/ffdc/CMS_GMT2_00000030_09.09.23_08.56.36_0.txt
[9/23/09 8:56:36:931 IST] 00000030 ServiceLogger I com.ibm.ws.ffdc.IncidentStreamImpl resetIncidentStream FFDC0010I: FFDC closed incident stream file /opt/WebSphere6-1/AppServer/profiles/AppSrv01/logs/ffdc/CMS_GMT2_00000030_09.09.23_08.56

And when I examine these files I can see the following exceptions:
Exception = java.sql.SQLException
Source = com.ibm.ws.rsadapter.spi.WSRdbManagedConnectionImpl.clearStatementCache
probeid = 2169
Stack Dump = java.sql.SQLException: Io exception: Socket closed


I was just wondering if these are serious errors, and does anyone know what is causing them?
Enabling pretest has resolved the stale connections, but if this is a serious error I'll have to try minimum connections = 0 instead of pretesting connections.

Thanks
When you say that you left your environment active overnight, was anything happening?
Or was it just sitting there idle?

If it was idle, then I suspect that the connections to the database timed out due to the lack of activity (which is not too surprising).

Did you see the article about Stale Connections that I pointed out above (http:#a25381812)?
aibdev

ASKER

The JVMs were running during the night but there were zero requests going to them. The last request was made at 5pm yesterday and the next request was at 8am this morning, when I hit a web page to see if I would still get the stale connection errors.

I didn't get the stale connection error; I got the error I posted above. I was just wondering if anyone knew what this error refers to?
Stack Dump = java.sql.SQLException: Io exception: Socket closed

Yes, I read that article about stale connections, and some others, and that's where I got the possible solutions I mentioned in my first post.

Socket closed? Just wondering if this is a problem or not?


I have attached the errorlog file with the socket closed error in it:
CMS-GMT2-00000030-09.09.23-08.56.txt
SOLUTION
aibdev

ASKER

Is this a problem?

Stale connections cause our application to build up stalled threads while the connections are being reset (entire pool). Is this likely to cause the same problems?
SOLUTION
aibdev

ASKER

Hi,

I implemented a change to the connection pool yesterday morning: I enabled pretest connections for new and existing connections. I set the following:
Enable: pretest existing pooled connections
retry interval = 5 seconds
Enable: pretest new connections
number of retries = 100
retry interval = 3
pretest SQL string = Select 1 from DUAL

However, the change was not successful. This morning we had the same problem: threads started stalling and building up until the JVM reached its max threads, and eventually it spilled over onto the web server, which reported max clients. This is the usual problem.

We eventually had to restart this JVM, as it did not resolve itself after the normal 8 minutes 42 seconds. I'm thinking that the 3 second interval and 100 retries could possibly have added 300 seconds to the time allowed? But I couldn't leave it that long to see if it would resolve itself.
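
The 300 second estimate works out as follows. This Python sketch only reproduces the arithmetic; it assumes the retries run back to back, each waits the full interval, and the delay simply adds to the usual stall, none of which has been confirmed.

```python
RETRIES = 100                # "number of retries" from the settings above
RETRY_INTERVAL_S = 3         # "retry interval" from the settings above
USUAL_STALL_S = 8 * 60 + 42  # the usual 8 minutes 42 seconds

# Assumption: every retry waits the full interval before giving up.
added_s = RETRIES * RETRY_INTERVAL_S
total_s = USUAL_STALL_S + added_s
minutes, seconds = divmod(total_s, 60)
print(f"added {added_s} s, total {minutes} min {seconds} s")
# added 300 s, total 13 min 42 s
```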

A concern I expressed above was that enabling pretest connection would only move the problem to a higher level. Instead of the application receiving the stale connection and having to reset it, WebSphere would see the stale connection and have to reset it, and while it is resetting it the stalls would still continue to build up. This seems to be the case.

Here is a snippet from the Wiley reporting tool about the stall that caused the pile-up.

Error at 09:05:34.260 (30 Sep 2009)
Frontends|Apps|cs.war|URLs|Default (0 ms)
Error Message: Stalled Transaction
Scheme: http
Server Name: www.aib.ie
URL: /servlet/ContentServer/home
Frontends|Apps|cs.war|URLs|Default (11 ms)
JSP|_csX_5F_2057896878aib_5F_wrapper (11 ms)
JSP|_csX1681919883aib_5F_layout (99 ms)
Backends|cmsl 10.4.4.105-1522 (Oracle DB) (120 ms)
Backends|cmsl 10.4.4.105-1522 (Oracle DB)|SQL|Dynamic|Query|SELECT X FROM DUAL (120 ms)
Error Message: Stalled Transaction
SQL: SELECT X FROM DUAL

So the error occurred at 09:05:34 and the error was a stalled transaction, over HTTP, to the server www.aib.ie, which is our domain and website.
The URL requested was the homepage, /servlet/ContentServer/home.

On the homepage, the first JSP to be called is the wrapper, which in turn calls the layout page.
The layout page is where the first piece of work takes place; this is where the first call to the database would occur.
I can see a call to the database, and then it goes into more detail on that call.
It says "SELECT X FROM DUAL" was the call, which is our pretest SQL string.
This would imply that when layout.jsp requested something from the database, WebSphere pretested the connection, found it to be stale and had to reset the pool. While WebSphere was resetting the pool, the stalls built up.

Would you agree with this?
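
The pretest-then-hand-out flow described above can be sketched as follows. This is a rough Python illustration using the standard library's `sqlite3` as a stand-in for the Oracle data source; `pretest` and `get_connection` are hypothetical helpers, not WebSphere APIs, and the test query is `SELECT 1` because SQLite has no DUAL table.

```python
import sqlite3

PRETEST_SQL = "SELECT 1"  # assumption: stand-in for "Select 1 from DUAL"


def pretest(conn):
    """Return True if the test query runs, False if the connection is unusable."""
    try:
        conn.execute(PRETEST_SQL).fetchone()
        return True
    except sqlite3.Error:
        return False


def get_connection(free, pretest_enabled=True):
    """Hand out a pooled connection, skipping any that fail the pretest."""
    while free:
        conn = free.pop()
        if not pretest_enabled or pretest(conn):
            return conn
    # No usable pooled connection left: open a fresh one instead.
    return sqlite3.connect(":memory:")


good = sqlite3.connect(":memory:")
dead = sqlite3.connect(":memory:")
dead.close()  # simulate a connection that went stale while idle

handed_out = get_connection([good, dead])
print(handed_out is good)  # True
```

One caveat: a closed SQLite connection fails immediately, whereas the Wiley trace above shows the pretest query itself as the stalled call. If a dead connection makes the test query hang rather than fail, the waiting just moves into the pretest, which would be consistent with the concern raised above.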
SOLUTION
Thanks for the grade & points.

Good luck & have a great Thanksgiving.