dipinci
asked on
WebSphere Application Cannot re-establish connection after DB restart
Hi,
I have applications running on a WAS 6.0.2.31 cluster that is configured with Oracle RAC 10g.
Recently there was a problem with the Oracle 10g nodes and both went down. Once the DB was back up, the WAS applications and the SIBus could not re-establish the connection; to fix this we were forced to restart the application servers.
Is there any fix or workaround to re-establish the connection once the DB is back online?
Any suggestions or help would be really appreciated.
Many Thanks
Dipin
ASKER
Yes, I have had this situation a couple of times, and I have always tried restarting the applications. Test connection from the admin console is successful once the DB is online.
But did you try stopping and starting the application (not the AppServer) to clear up the issue?
Or, did you try testing the connection from the datasource configuration panel when the issue occurred?
ASKER
I restarted the application server because I have the SIBus associated with it. I tested the datasource connection from the admin console.
What version of WebSphere is being used, and on what Operating System?
ASKER
WAS 6.0.2.31 and OS AIX 5.3. WAS version is mentioned in the question.
Sorry about that. You are correct. I'm just in the habit of asking for the version. I apologize.
Anyway, do you have verbose logs associated with the attempts to re-establish connection to the DB after they all get recycled?
ASKER
Yes, we have. But I don't see how verbose GC is related to this problem.
Verbose logs are not verbose GC logs. Verbose just means "lots". The default log level for WebSphere is "informational messages only". It would appear that we need more "verbose" messages, so that we can tell what WebSphere is trying to do and what happens throughout the process.
Does this make sense?
ASKER
I know that the default log level is info, but there is no verbose log level. The levels are fatal, trace, error, and so on.
I have the default logging enabled
By verbose, I meant "more than the default". That's all.
What messages do you get when you encounter the problem?
ASKER
The messages below were logged:
Server-1
[4/27/09 23:11:01:268 AST] 0000aa8f ExceptionInte E com.qatarairways.ibe.platform.interceptor.ExceptionInterceptor afterThrowing MESSEGE ERRORKEY= module.connection.prob ,DATABASE ERROR CODE :17002
org.springframework.dao.DataAccessResourceFailureException: Hibernate operation: Cannot open connection; SQL [???]; Io exception: The Network Adapter could not establish the connectionDSRA0010E: SQL State = null, Error Code = 17,002DSRA0010E: SQL State = null, Error Code = 17,002; nested exception is java.sql.SQLException: Io exception: The Network Adapter could not establish the connectionDSRA0010E: SQL State = null, Error Code = 17,002DSRA0010E: SQL State = null, Error Code = 17,002
java.sql.SQLException: Io exception: The Network Adapter could not establish the connectionDSRA0010E: SQL State = null, Error Code = 17,002DSRA0010E: SQL State = null, Error Code = 17,002
Server-2
[4/27/09 23:11:14:794 AST] 00017f18 ConnectionEve A J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adaptor for resource com.qatarairways.ibe.datasource. The exception which was received is com.ibm.websphere.ce.cm.StaleConnectionException: Io exception: Connection reset
[4/27/09 23:11:14:883 AST] 00017f18 AbstractBatch W org.hibernate.jdbc.AbstractBatcher closeQueryStatement exception clearing maxRows/queryTimeout
com.ibm.websphere.ce.cm.ObjectClosedException: DSRA9110E: Statement is closed.
The Server-1 output points to the application exception, but the Server-2 output is a little more interesting.
The following document from the IBM website has proven very helpful in identifying and resolving "Stale Connection" issues.
http://www.IBM.com/support/docview.wss?rs=180&uid=swg21063645
It also points to a good white paper.
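For what it's worth, one common code-level way of dealing with stale connections is to catch the failure, let the pool hand out a fresh connection, and retry a bounded number of times. Here is a minimal sketch of that idea, not something taken from your application. Everything named here (`StaleConnectionRetry`, `SqlAction`, `runWithRetry`, `isConnectionFailure`) is made up for illustration. It assumes the work is done at the plain JDBC level; with Hibernate/Spring on top you would be catching the wrapper exception instead. It is also written in current Java syntax, so on the Java 1.4 JVM that WAS 6.0 runs on it would have to be rewritten without generics and lambdas.

```java
import java.sql.SQLException;

// Hypothetical helper: retries a unit of JDBC work when the failure looks
// like a dead or stale connection rather than an application error.
final class StaleConnectionRetry {

    /** Hypothetical callback: get a connection, do the work, close it. */
    @FunctionalInterface
    interface SqlAction<T> {
        T run() throws SQLException;
    }

    // Vendor error code reported in the Server-1 log ("Error Code = 17,002").
    static final int ORA_NETWORK_ADAPTER_ERROR = 17002;

    // Class name from the J2CA0056I message. Compared by name so the sketch
    // compiles without the WebSphere libraries on the classpath.
    static final String WAS_STALE =
            "com.ibm.websphere.ce.cm.StaleConnectionException";

    /** Assumption: these three checks are enough to spot a connection failure. */
    static boolean isConnectionFailure(SQLException e) {
        for (Throwable t = e; t != null; t = t.getCause()) {
            if (t.getClass().getName().equals(WAS_STALE)) {
                return true;
            }
            if (t instanceof SQLException) {
                SQLException s = (SQLException) t;
                String state = s.getSQLState();
                if (s.getErrorCode() == ORA_NETWORK_ADAPTER_ERROR) {
                    return true;
                }
                // SQLSTATE class 08 is the standard "connection exception" class.
                if (state != null && state.startsWith("08")) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Runs the action, retrying only on connection failures. */
    static <T> T runWithRetry(SqlAction<T> action, int maxAttempts, long delayMillis)
            throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.run();
            } catch (SQLException e) {
                if (!isConnectionFailure(e)) {
                    throw e; // not a connection problem, do not retry
                }
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis); // give the DB time to come back
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                }
            }
        }
        throw last; // every attempt failed with a connection error
    }
}
```

The action passed in has to obtain its connection inside `run()`, so that each retry asks the datasource for a new one instead of reusing the dead one. Whether this helps also depends on the pool's purge policy, since the pool itself has to throw away the bad connections.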
ASKER
The purge policy set for the connection pool is entire pool; I am not sure about the code-level change.
I am trying to find some configuration or workaround to retry the connection from the application server.
OK, please let me know what I can do to help.
ASKER CERTIFIED SOLUTION
Did you try stopping and restarting the application?
Did you try "testing" the connection?