We don't know exactly what URL causes the problem and I'm not sure how to limit the connections. The OC4J is multithreaded. We suspect that its related to struts...
Main Topics
Browse All TopicsAfter some use (usually a couple of hours), the OC4J_Portal can no longer service requests and an HTTP 500 is generated. We've tried increasing the number of processes and this does not help. We've tried changing the threading and memory models as well.
Config is:
subnet 12 (internet):
- lbprd02/4 (aix): ibm websphere edge server load balancers
- - passes www.internetdomain.com:80 to webcachecluster1:7778
- - passes login.internetdomain.com:7
- fecprd01 (rh): mid-tier 10g (webcachecluster1), sso 10g
- fecprd02 (rh): mid-tier 10g (webcachecluster1), sso 10g
subnet 11 (intranet):
- lbprd01/3 (aix): ibm websphere edge server load balancers
- - passes www.intranetdomain.com:80 to webcachecluster2:7777
- feaprd01 (rh): mid-tier 10g (webcachecluster2) (uses SSO on subnet 12)
- feaprd02 (rh): mid-tier 10g (webcachecluster2) (uses SSO on subnet 12)
subnet 42 (intranet):
- cslprd01 (aix): infra 9.2.0.3 w/ portal schemas portal+portala,
- - DB Server 9.2.0.4 with customer dbs
We're getting the following errors in the apache error_log just before recieving HTTP 500 errors:
oc4j_socket_recvfull timed out
(4)Interrupted system call: MOD_OC4J_0038: Receiving data from oc4j exceeded the configured "Timeout" value and the error code is 4.
MOD_OC4J_0054: Failed to call network routine to receive an ajp13 message from oc4j.
I suspect an error in ajp13 is consuming connections or some other resource and never releasing them. However, changing the OC4J to connectionless (using Oc4jCacheSize 0) does not seem to help. This error resembles the mod_jk error http://nagoya.apache.org/b
If needed I have an example of the errors when Apache is in mode debug.
This Question has been solved and asker verified All Experts Exchange premium technology solutions are available to subscription members.
Experts Exchange has been collecting answers to technology questions since 1996…3 million and counting! If you have a question, chances are we already have your answer.
If you can't find the exact answer you're looking for, ask our exclusive community of 50,000 experts. You’ll get a personalized answer from a trusted professional.
Thousands of free tech tips, tricks, how-to’s and tutorials are available in our peer reviewed articles section. See for yourself how smart our experts are, no login required.
Access the answers to your technology questions today.
30-day free trial. Register in 60 seconds.
Members of the expert community talk about why the experience at Experts Exchange is different than what you will find anywhere else.

Try it out and discover for yourself.
30-day free trial. Register in 60 seconds.
Join the community of experts here and help other tech pros by answering question in your area of expertise. You can earn FREE access to all Experts Exchange's premium features and resources.
The app is a jsp / bc4j app deployed in BI_Forms providing for Portal. We can already see the error above, but here is the error in detail:
[Mon Mar 22 14:45:57 2004] [warn] [client 123.45.678.30] oc4j_socket_recvfull timed out
[Mon Mar 22 14:45:57 2004] [warn] [client 123.45.678.30] oc4j_socket_recvfull timed out
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] (4)Interrupted system call: MOD_OC4J_0038: Receiving data from oc4j exceeded the configured "Timeout" value and the error code is 4.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] (4)Interrupted system call: MOD_OC4J_0038: Receiving data from oc4j exceeded the configured "Timeout" value and the error code is 4.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0054: Failed to call network routine to receive an ajp13 message from oc4j.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0054: Failed to call network routine to receive an ajp13 message from oc4j.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0033: Failed to receive an ajp13 message from oc4j.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0033: Failed to receive an ajp13 message from oc4j.
[Mon Mar 22 14:45:57 2004] [warn] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0078: Network connection errors happened to host: fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [warn] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0078: Network connection errors happened to host: fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [debug] oc4j_proc_tbl.c(1840): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0228: Process (opmnid=219349677) removed from index list of worker OC4J_Portal. Global index:3, local index:0, remaining entries in local index list: 0
[Mon Mar 22 14:45:57 2004] [debug] oc4j_proc_tbl.c(775): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0226: Freed process (opmnid=219349677) from mod_oc4j proc table.
[Mon Mar 22 14:45:57 2004] [debug] oc4j_proc_tbl.c(1746): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0175: After removing entry (index = 3 with (opmnid=219349677) from mod_oc4j proc table, first_free_slot: 2 and nelts: 4.
[Mon Mar 22 14:45:57 2004] [info] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0123: Mark network worker: OC4J_Portal_15 to be in_error_state because it failed to service the request and it is recoverable.
[Mon Mar 22 14:45:57 2004] [debug] oc4j_proc_tbl.c(1647): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0161: Failed to find the process with (opmnid=219349677) in internal process table to remove.
[Mon Mar 22 14:45:57 2004] [info] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0123: Mark network worker: OC4J_Portal_15 to be in_error_state because it failed to service the request and it is recoverable.
[Mon Mar 22 14:45:57 2004] [debug] oc4j_util.c(591): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0133: Request cookie's jsessionid is 8123a41ece8452b49abd681444
[Mon Mar 22 14:45:57 2004] [debug] oc4j_util.c(607): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0134: Decoded oc4j jvm route is fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [debug] oc4j_util.c(591): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0133: Request cookie's jsessionid is 8123a41ece8452b49abd681444
[Mon Mar 22 14:45:57 2004] [debug] oc4j_ac_worker.c(703): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0139: Found a matched network worker: OC4J_Portal_15 with host: fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [debug] oc4j_util.c(607): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0134: Decoded oc4j jvm route is fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [info] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0151: Network worker: OC4J_Portal_15 with host: fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [debug] oc4j_ac_worker.c(703): [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0139: Found a matched network worker: OC4J_Portal_15 with host: fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [info] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0151: Network worker: OC4J_Portal_15 with host: fecprd01.internetaccess.my
[Mon Mar 22 14:45:57 2004] [warn] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0187: Failed to find an oc4j process within destination:[OC4J_Portal] and island id: default_island for session failover.
[Mon Mar 22 14:45:57 2004] [warn] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0187: Failed to find an oc4j process within destination:[OC4J_Portal] and island id: default_island for session failover.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0150: Failed to deterministicly find a failover oc4j process for session request for island: default_island for destination: OC4J_Portal.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0150: Failed to deterministicly find a failover oc4j process for session request for island: default_island for destination: OC4J_Portal.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0119: Failed to get an oc4j process for destination: OC4J_Portal
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0119: Failed to get an oc4j process for destination: OC4J_Portal
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0013: Failed to call destination: OC4J_Portal's service() to service the request.
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] MOD_OC4J_0013: Failed to call destination: OC4J_Portal's service() to service the request.
[Mon Mar 22 14:45:57 2004] [debug] mod_osso.c(3846): [client 123.45.678.30] [ecid: 195348896995,3]
[OSSO] ===== REQUEST =====
[OSSO] POST /app/classes HTTP/1.1
[Mon Mar 22 14:45:57 2004] [debug] mod_osso.c(3846): [client 123.45.678.30] [ecid: 195348896995,3]
[OSSO] ===== REQUEST =====
[OSSO] POST /app/classes HTTP/1.1
[Mon Mar 22 14:45:57 2004] [debug] mod_osso.c(337): [client 123.45.678.30] [ecid: 195348896995,3]
[OSSO] D26: process_idle_time()
[Mon Mar 22 14:45:57 2004] [debug] mod_osso.c(718): [client 123.45.678.30] [ecid: 195348896995,3]
[OSSO] D03: update_request()
[Mon Mar 22 14:45:57 2004] [error] [client 123.45.678.30] [ecid: 195348896995,3] File does not exist: /9IAS_MIDDLE/app/oracle/pr
Hi, sorry I haven't updated this question. I haven't forgotten about it, but I believe the issue has been resolved and am attempting to find out what was done (I've left the project). I thought it would be more interesting to find out the exact solution used.
If I haven't a solution in four days I have no issue with the points going to vzilka and thank him as well.
I'm hitting the exact same problem and oracle support aren't too successful so far (its been a week at priority 1).
It looks like OC4J hangs. It appears to stop listening on the default port 3301.
We've been going thread dumps on the OC4J process and all the threads are locked waiting for various things)
Question is why doesn't it use more threads and why isnt it listening on the port.
So anyone know the answer to this question?
I saw the guys on the team in August and asked them how they resolved the problem and they said that they applied a patch associated provided by Oracle and also increased some resource associated. I can't remember which and I'll ask the engineer to look at this post. The metalink iTar that we opened should be visible if you search for this, but I'm not sure.
this isnt really a solution is it?
They applied 'a patch' and increased 'a resource'
the iTAR 'should' be visible.
I've already fixed this - no thanks to this website.
I joined experts exhange as I it promised a solution which it didnt have.
I cancelled my subscription immediately and am still being charged by this website.
THIS SITE IS A CON.
I have to agree that I rarely get answers to the questions I post here. I think it has to do with the the EE mechanisms. Unanswered questions just disappear off the lists and nobody sees them.
Anyway. I was at another clients site last week and saw the same errors in 10g. There is a metalink document with this error now. Basically they say that the problem is often related to the OHS setting KeepAlive and a bug in IE5 and IE6 with POSTs. Basically it says to patch IE or turn KeepAlive off.
How did you solve the problem?
Oh ta, I thought it was just me!
there are a few problems:
Yes you are right, you get occasional errors chucked up from some IE problem, so you have to switch keepalive off.
there was also a problem with a firewall (the Application servers are in a DMZ) timing out JDBC connections in the pool, OC4J didnt know they were timed out, when load increased and it tried to use them again, ALL the threads hung - locked on the thread trying to use a dead JDBC connection.
we also had lots of application errors (max open cursors, java errors) which are seen as timeouts by apache. It causes the OC4J process to by dynamically removed from the routing list, so you've got no process to send your requests to (until its added back into the list again).
So the Actual solutions are:
get your application right
switch off firewall timeout between App server and DB, or use keepalive on the db listener, or use a connection pool of 0.
We are currently using static OC4J routes to stop the dynamic removal.
regards,
Mark
Business Accounts
Answer for Membership
by: vzilkaPosted on 2004-03-24 at 10:48:04ID: 10670417
Can you decrease the number of connections to the AJP13 to 1, and then check the URL that creates the problem?