
Windows Error: 8: Exec format error, ora-12500 connection refused

We host Oracle 8.1.7.2.1 on a Windows 2000 box for a customer who connects through the JRun application server. We run into the following error very frequently and have to restart the instance about three times a week. The error is:

java.sql.SQLException: Io exception: Connection refused(DESCRIPTION=(TMP=)(VSN
NUM=135290880)(ERR=12500)(ERROR_STACK=(ERROR=(CODE=12500)(EMFI=4))(ERROR=(CODE
=12540)(EMFI=4))(ERROR=(CODE=12560)(EMFI=4))(ERROR=(CODE=510)(EMFI=4))(ERROR=(
BUF='32-bit Windows Error: 8: Exec format error'))))

I have reduced shared_pool_size to 100M, but that has not helped. The server has 4GB RAM. As configuring MTS is not a very good option on 8i, I have not tried that. Also, enabling dead connection checking does not seem like a good idea, since we know there are no dead connections and enabling it would only put additional stress on Oracle.

We need to find a solution to this problem as soon as possible. Any help would be greatly appreciated. Thanks in advance.

--Raj
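
For readers trying to decode the error stack in the question, here is a minimal Python sketch that pulls the individual codes and the operating-system message out of the text. The message descriptions in `KNOWN_MESSAGES` are the commonly documented ones for these codes and are included from memory, so verify them against the documentation for your release; the function names are illustrative only.

```python
import re

# Error stack copied from the exception text in the question (line breaks removed).
ERROR_TEXT = (
    "(DESCRIPTION=(TMP=)(VSNNUM=135290880)(ERR=12500)"
    "(ERROR_STACK=(ERROR=(CODE=12500)(EMFI=4))(ERROR=(CODE=12540)(EMFI=4))"
    "(ERROR=(CODE=12560)(EMFI=4))(ERROR=(CODE=510)(EMFI=4))"
    "(ERROR=(BUF='32-bit Windows Error: 8: Exec format error'))))"
)

# Assumption: descriptions as commonly documented; check your own release.
KNOWN_MESSAGES = {
    12500: "TNS: listener failed to start a dedicated server process",
    12540: "TNS: internal limit restriction exceeded",
    12560: "TNS: protocol adapter error",
    510: "TNS-00510: internal limit restriction exceeded",
}


def error_codes(text):
    """Return the CODE= values in the order they appear in the stack."""
    return [int(code) for code in re.findall(r"\(CODE=(\d+)\)", text)]


def os_error(text):
    """Return the operating-system message carried in the BUF= element, if any."""
    match = re.search(r"\(BUF='([^']*)'\)", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for code in error_codes(ERROR_TEXT):
        print(code, KNOWN_MESSAGES.get(code, "(unknown)"))
    print("OS error:", os_error(ERROR_TEXT))
```

Read top to bottom, the stack goes from the listener-level symptom (12500) down to the operating-system error in `BUF`, which is usually the most informative line.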

slightwv (䄆 Netminder)

How many concurrent connections are common?

Does the app free up unused connections?

I'm thinking that MTS may be your only way out.

Even though the server has 4 Gig of ram, Oracle can only use up to 2 gig in Windows (limited ability to go to 3 gig but this has issues).

Do you have access to Metalink? If so, check out note: 224403.1

If not, I'll post the contents (Typically they don't post well here).

chundi_gus

ASKER

Thanks for the prompt reply.

How many concurrent connections are common?

about 50

Does the app free up unused connections?

yes

I looked at the Metalink note you suggested. I have tried everything in it except the MTS and /3GB switch options. How do I check for memory leaks?

chundi_gus

ASKER

Does it have something to do with how Oracle manages memory in version 8.1.7.2.1? I cannot post a TAR in Metalink; I can only read from it. Thanks.

ASKER CERTIFIED SOLUTION

slightwv (䄆 Netminder)


chundi_gus

ASKER

This problem is spreading. We have another customer reporting the same error on Oracle 8.1.7.2.1. Please help. Thanks.

chundi_gus

ASKER

So are you saying that JRun is causing this? If so, do you think setting SQLNET.EXPIRE_TIME would help, and what value should it be set to?

slightwv (䄆 Netminder)

Not familiar with JRun and how it does what it does. Are you seeing high connection counts in v$session? Do you ever see them decrease?

Not sure if EXPIRE_TIME will help or not. We got rid of our 817 DB w/o ever solving the problem.

chundi_gus

ASKER

Do you think creating a script which checks for dead processes if any, and kills them, would help? I understand from your comment that upgrading to a later version of oracle is a possible solution to this problem. Let me know if it is worth the effort to create such a script in the meantime, so we exhaust all options before choosing to upgrade.

slightwv (䄆 Netminder)

I would at least go to the latest 817 patchset. I believe it was 8.1.7.4.

What processes? In windows, all connections are contained in the oracle.exe process.

To better enable me to assist, please answer the previous questions:
What is your SGA set to (show sga)?
Are you seeing high connection counts in v$session?
Do you ever see them decrease?

schwertner

You should think about increasing the number of cursors.
You should also increase the number of processes.
Another issue is the swapping of dead processes.
Put this line in SQLNET.ORA:
SQLNET.EXPIRE_TIME = 20
This will clear dead sessions every 20 minutes.

Estimate the size of the SGA.
Use the command
show sga
The SGA should not be greater than 50% of the RAM.
Turn off all other applications on the server.
Be aware that Oracle on Windows needs a restart of Windows once weekly.

slightwv (䄆 Netminder)

>>Be aware that Oracle on Windows needs a restart of Windows once weekly.

Disagree. I have only had to reboot frequently due to bugs in Oracle (memory leaks and such). Once everything works as designed, there is no need to reboot a Windows DB server any more frequently than a UNIX one. My 10g DB server gets a reboot once a month if it's lucky. I will give you that on UNIX it is easier to clean up the mess when Oracle misbehaves.

schwertner

I often experience 100% CPU usage. Possibly it comes from fragmentation and swapping of the RAM: every developer uses his own application server, and every application server opens 15-20 sessions (I warn them, I declare war, but it is hard to keep under control permanently). Shutting down computers or application servers leaves dead sessions, new sessions are created, and eventually the RAM seems to get fragmented. I have never experienced memory leaks, and I keep tight control of the situation on Linux; some commands in Linux itself report memory misleadingly and give the impression of memory leaks.
There are many warnings about Windows: keep the SGA under 50% of RAM and restart Windows. (Markgeer is the master of these issues; he seems to have great experience as an Oracle DBA on Windows.)

slightwv (䄆 Netminder)

I will agree with you in a development environment. It is impossible to control/teach developers!!!!!!!

Mehul Shah

How much virtual memory have you configured?

'32-bit Windows Error: 8: Exec format error'

C:\>net helpmsg 8

Not enough storage is available to process this command.

Like in several other OSes, "not enough storage" doesn't mean lack of disk space. It means lack of virtual memory instead (including swap space).
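
A small Python sketch of why the two messages disagree. The number 8 means "not enough memory" as a Win32 system error code, but the same number looked up in the C runtime's errno table is ENOEXEC, whose text is "Exec format error". That the Oracle network layer printed the Win32 code with the errno text is an inference from the message above, not something confirmed in this thread; the constant name below is spelled out for illustration.

```python
import errno
import os

# Win32 system error code 8 is ERROR_NOT_ENOUGH_MEMORY (winerror.h);
# this is the message that "net helpmsg 8" prints.
WIN32_ERROR_NOT_ENOUGH_MEMORY = 8


def c_runtime_text(code):
    """Return the C runtime (errno) message for a numeric code."""
    return os.strerror(code)


if __name__ == "__main__":
    # The errno value with the same number is ENOEXEC.
    print("errno.ENOEXEC =", errno.ENOEXEC)
    print("errno text for 8:", c_runtime_text(WIN32_ERROR_NOT_ENOUGH_MEMORY))
```

So the "Exec format error" wording is best ignored; the numeric code is the part to trust, and it points at memory exhaustion.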

chundi_gus

ASKER

Thanks for all your input. I have an update. The total SGA is 1.4GB and processes is set to 300. I have noticed that a lot of sessions from the JDBC thin client are inactive for long periods of time. I now have a script in place that kills sessions that have been inactive for more than 3 hours. I have also created a profile with idle_time 180 for the user. I know that SNIPED session resources are released right away, but I am not sure whether this is also the case with KILLED sessions. Any ideas on this?
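
Putting the figures from this thread together gives a rough feel for how tight the memory budget is. The sketch below assumes the 2 GB per-process limit mentioned earlier and that every connection lives inside the single oracle.exe process; it ignores the executable's own code, thread stacks and address-space fragmentation, so the real headroom is smaller than what it reports. The function names are illustrative.

```python
ADDRESS_SPACE_MB = 2048.0   # assumed 2 GB limit for oracle.exe on 32-bit Windows
SGA_MB = 1.4 * 1024         # total SGA reported above


def headroom_mb(address_space_mb, sga_mb):
    """Address space left for everything that is not the SGA."""
    return address_space_mb - sga_mb


def per_connection_mb(address_space_mb, sga_mb, connections):
    """Upper bound on memory available to each connection."""
    if connections <= 0:
        raise ValueError("connections must be positive")
    return headroom_mb(address_space_mb, sga_mb) / connections


if __name__ == "__main__":
    for connections in (50, 300):
        share = per_connection_mb(ADDRESS_SPACE_MB, SGA_MB, connections)
        print(f"{connections} connections: at most {share:.1f} MB each")
```

With processes set to 300, the ceiling is only about 2 MB per connection, which suggests that either a smaller SGA or fewer lingering sessions is needed to stay clear of the limit.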

Mehul Shah

Oracle's PMON process constantly monitors all processes. If it finds a process that has been killed or whose connection is broken (infant), it rolls back all of that process's uncommitted transactions and releases all of its locks.

chundi_gus

ASKER

I see that the sessions marked "SNIPED" stay like that in v$session for very long periods of time. In fact, by the end of business today, I did not see any cleanups from v$session at all. Is there something I can do to remove the shadow process and release the resources it holds? I understand they will eventually be cleaned up by PMON. In the meantime, however, releasing the resources held by those sessions would help Oracle process more requests/connections. Can I set SQLNET.EXPIRE_TIME=180 in addition to the profile idle_time 180? Would that be helpful, or does it add overhead?

SOLUTION

schwertner


chundi_gus

ASKER

Thanks for the explanation of SNIPED sessions and dead sessions. The problem was that the code was not closing connections properly, so a lot of sessions were inactive for long periods of time. I now have a batch script in place which kills inactive sessions using ORAKILL, and so far it looks like it is working! Thanks for pointing me in the right direction.
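
The batch script itself was not posted. As a rough idea of what such a job might do, the Python sketch below turns rows from v$session joined to v$process into `orakill <instance> <thread>` command lines for sessions idle longer than three hours. The query text, the `SessionRow` type, the instance name `ORCL` and the function name are all assumptions for illustration; the sketch only builds the command strings and does not run them.

```python
from typing import Iterable, List, NamedTuple

# Assumed query: last_call_et is seconds since the session's last call, and
# spid is the thread id inside oracle.exe on Windows. The username filter
# keeps background processes out of the result.
INACTIVE_SESSIONS_SQL = """
SELECT s.sid, s.serial#, p.spid, s.status, s.last_call_et
  FROM v$session s, v$process p
 WHERE s.paddr = p.addr
   AND s.username IS NOT NULL
"""


class SessionRow(NamedTuple):
    sid: int
    serial: int
    spid: str
    status: str
    last_call_et: int


def orakill_commands(rows: Iterable[SessionRow], instance: str,
                     max_idle_seconds: int = 3 * 60 * 60) -> List[str]:
    """Build one orakill command per session idle longer than the threshold."""
    commands = []
    for row in rows:
        is_idle = row.status in ("INACTIVE", "SNIPED")
        if is_idle and row.last_call_et > max_idle_seconds:
            commands.append(f"orakill {instance} {row.spid}")
    return commands


if __name__ == "__main__":
    sample = [
        SessionRow(12, 101, "1840", "INACTIVE", 14400),  # idle 4 hours
        SessionRow(13, 55, "2012", "ACTIVE", 20000),     # busy, left alone
        SessionRow(14, 7, "2230", "SNIPED", 11000),      # sniped, idle > 3 hours
    ]
    for command in orakill_commands(sample, "ORCL"):
        print(command)
```

Filtering on a non-null username matters here: killing a background thread by mistake can bring the whole instance down.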

schwertner

There are also automatic tools to do this.
Open OEM and look under Security. Go to Profiles.
Add a new profile with IDLE_TIME set to restrict the session idle time.
After that, assign this profile to the particular users in the Security section.
Be prepared to be overloaded with complaints about closed sessions :-)))))

schwertner

In fact I am speaking about 9i, but I hope it works on 8i.

chundi_gus

ASKER

I already tried assigning a profile with IDLE_TIME to the user. That caused many sessions to be marked as SNIPED, as expected, but it did not look as if the resources were being freed. Maybe that was because shadow processes were left behind and did not release their resources for a long time. I also do not have an answer to the question of why the connections marked SNIPED were never reused, at which point they would have been removed from v$session. So I resorted to scripting a batch file that terminates the inactive sessions from the OS. Thanks!
--Manjula Kshirsagar

chundi_gus

ASKER

Oh, I forgot to mention the ORA-04030 "out of process memory" errors that came out of assigning the profile! It must have added enough overhead to lead to those errors. So the profile was not the right thing to do in my case.

schwertner

I also stopped using profiles and IDLE_TIME.
In my environment the process of cleaning up the DEAD sessions works fine and solved my issues even with 2 MB RAM.

Another solution is to speed up the transactions. Using BIND VARIABLES can drastically improve the performance of the code.

FYI
The shared SQL area can also be reused for similar queries, not only identical ones, by setting the initialization parameter CURSOR_SHARING to FORCE. The default value is EXACT. Do not use this parameter in Oracle 8i, as there is a bug involved with it that hangs similar query sessions because of some internal processing. If you are on 9i, try out this parameter for your application in test mode before making changes in production.
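
To illustrate the bind variable point, here is a small Python sketch that compares how many distinct statement texts an application produces when it pastes values into the SQL versus when it uses a placeholder. The table and column names are hypothetical. Every distinct text needs its own parse and its own entry in the shared SQL area, while a statement with a bind placeholder is shared.

```python
# Hypothetical table and column names, for illustration only.
BIND_SQL = "SELECT name FROM customers WHERE id = :id"


def literal_sql(customer_id):
    """Anti-pattern: the value becomes part of the statement text."""
    return f"SELECT name FROM customers WHERE id = {int(customer_id)}"


def distinct_statements(customer_ids):
    """Return (texts produced with literals, texts produced with a bind)."""
    literal_texts = {literal_sql(cid) for cid in customer_ids}
    bound_texts = {BIND_SQL for _ in customer_ids}
    return len(literal_texts), len(bound_texts)


if __name__ == "__main__":
    literals, bound = distinct_statements(range(1, 1001))
    print("distinct statements with literals:", literals)
    print("distinct statements with a bind:", bound)
```

In a JDBC application such as the one in this thread, the equivalent is a `PreparedStatement` with a `?` placeholder instead of string concatenation.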

chundi_gus

ASKER

We ran into this problem today. Everything was working fine until this evening, when the database just hung for no apparent reason. All it said was "ora-12500 cannot start dedicated server process". Looking at v$sqlarea, I could not find any one SQL statement that caused the problem. The database is running in archivelog mode, if that is of any importance here. I had to restart the db. I do not have any clue as to what caused the db to hang. Any help would be greatly appreciated.

schwertner

I also ran into this problem last spring and summer (Windows 2000, Oracle 9i, 2 instances on the box, 2GB RAM).
After I gave our developers an ultimatum to reduce the connections of the application servers from 20 to 5
and added SQLNET.IDLE_TIME, we stopped experiencing this problem.
Occasionally the CPU activity still runs up to 100% and the connections are very slow.
In this case the only fix is to restart the box.
It seems that the memory gets fragmented and W2K resorts to paging and swapping.
So (as other users also say) you have to restart the box periodically (e.g. once a week).
If this happens again, measure the CPU load and see if it is 100%.

chundi_gus

ASKER

The alert log showed various "Failed to archive.." messages, but ultimately those logs were archived successfully. I think the db started to slow down when this happened. Is this because of frequent log switches or because the redo logs are too small? If so, would proper sizing of the redo logs help?

chundi_gus

ASKER

Update: I changed the redo log size from 1M to 10M. I will have to wait and see if that helps.

schwertner

Yes, increase the size of the online redo logs.
Also add additional redo groups.
There are 3 by default and they aren't multiplexed.
Multiplex them (by adding additional members to the groups).
This will protect the installation in the case of block corruption.
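
A back-of-the-envelope Python sketch of why the redo log resize discussed above should reduce log switches. The redo generation rate was never given in this thread, so the 120 MB per hour used here is purely a made-up figure for illustration, and the function names are likewise illustrative.

```python
def switches_per_hour(redo_mb_per_hour, log_size_mb):
    """Approximate log switches per hour for a given redo rate and log size."""
    if log_size_mb <= 0:
        raise ValueError("log_size_mb must be positive")
    return redo_mb_per_hour / log_size_mb


def minutes_between_switches(redo_mb_per_hour, log_size_mb):
    """Approximate time the database spends in one log before switching."""
    return 60.0 / switches_per_hour(redo_mb_per_hour, log_size_mb)


if __name__ == "__main__":
    ASSUMED_REDO_MB_PER_HOUR = 120.0  # hypothetical workload, not from the thread
    for size_mb in (1, 10):
        rate = switches_per_hour(ASSUMED_REDO_MB_PER_HOUR, size_mb)
        gap = minutes_between_switches(ASSUMED_REDO_MB_PER_HOUR, size_mb)
        print(f"{size_mb}M logs: {rate:.0f} switches/hour, one every {gap:.1f} min")
```

Whatever the real redo rate is, a tenfold larger log means roughly a tenfold drop in switch frequency, which gives the archiver more time to finish each log before it is needed again.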