Oracle 9.2.0.6/Linux x86_64 - deferred commit causes ORA-08177

There's a procedure which basically deletes rows containing a given key.
Depending on the key up to 40 tables are involved in a run. In each
table there may be between 2 and 65,000 rows to be deleted. Altogether
about 130,000 rows are deleted in one run from different tables. Since
they are all bound to the given key, the author of the procedure deferred
the COMMIT until the very end, after all the DELETE statements, in order
to preserve the atomicity of the transaction.
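
Schematically, the procedure looks something like this (a simplified sketch only; the real procedure, table and column names are of course different):

create or replace procedure delete_key (p_key in number) as
begin
  delete from table_01 where key_col = p_key;
  delete from table_02 where key_col = p_key;
  -- ... up to about 40 such DELETEs ...
  delete from table_40 where key_col = p_key;
  commit;  -- single COMMIT at the very end keeps the whole run atomic
end;
/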

However, this deferred COMMIT seems to increase the processing time
by a factor of about 4. If I put a COMMIT after each of the 40 or so
DELETE statements it takes a quarter of the usual time to accomplish
the task (30 seconds instead of 2 minutes).

Although I can't see how any of the DELETE statements could possibly
fail and compromise the integrity of the data, short of a blackout (and
there hasn't been one in the last 5 years), the atomicity of this
transaction is rather fetishized as untouchable.

The relatively long execution time itself causes problems. In particular,
if this procedure is invoked by more than one user at a time, one of the
two, usually the latter, will most often fail with the error
"ORA-08177: can't serialize access for this transaction".

Can someone tell me if it is normal for the deferred COMMIT to have
such a strong impact on the execution time?

What can be done to alleviate the impact?

Is there a straightforward way to prevent simultaneous execution of
the same procedure by multiple sessions?

I suppose I should value the answers at thrice the 500 points.
dkrnic asked:

RCorfman commented:
Waiting to commit is normal, and I'm surprised it is causing this much of a problem. It may be an issue with rollback or something similar; I'm not a DBA, so that part is not in my area of expertise. I can answer the other questions, though, with some ideas.

First, you can serialize the procedure itself.
If you put a DBMS_LOCK call inside the procedure, it can only be executed by one session at a time. I don't think this is the best solution, but it will work.
The lock name can be anything that is unique. I use this on AQ callbacks and usually use the queue name; in this case, we'll use the procedure name.

create or replace procedure deleter as
 lock_handle   varchar2(128);
 dummy         pls_integer;
begin
 -- map the name 'deleter' to a lock handle (every session gets the same handle)
 dbms_lock.allocate_unique('deleter', lock_handle);
 -- request the lock; by default this blocks until any other session holding it releases it
 dummy := dbms_lock.request(lock_handle);
 --- do other processing here.
 -- release the lock so the next waiting session can proceed
 dummy := dbms_lock.release(lock_handle);
end;
/

This forces serialization of the routine: multiple people can call it at once, but each caller has to wait until the prior one completes before getting past the dbms_lock.request.

If the user doesn't need to have the deletion complete before they move on, I would strongly recommend that you look at using Oracle AQ. Advanced Queuing (AQ) is a way to have an asynchronous process. When the user indicates they want to delete an 'author', then at commit time, rather than actually doing the deletes, you add a record (enqueue a message) to a deletion queue; this enqueued message is what gets committed.

Behind that, Oracle reads the queue and actually performs the deletions and commits. You serialize this portion, so if multiple people delete all at once, each one comes back immediately and Oracle follows up with the actual deletions. It then doesn't matter whether that takes 10 seconds or 5 minutes. The user is free to move on and is assured that the deletion will occur.
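
The enqueue side might look roughly like this (a sketch only; the queue DELETE_QUEUE and its payload type DELETE_MSG_T are hypothetical names, and the queue would have to be created with DBMS_AQADM first):

declare
  enq_opts   dbms_aq.enqueue_options_t;
  msg_props  dbms_aq.message_properties_t;
  msg_id     raw(16);
begin
  dbms_aq.enqueue(
    queue_name         => 'delete_queue',
    enqueue_options    => enq_opts,
    message_properties => msg_props,
    payload            => delete_msg_t(12345),  -- the key/job to delete (placeholder value)
    msgid              => msg_id);
  commit;  -- this small transaction is all the user has to wait for
end;
/

A background process (or an AQ callback) then dequeues each message and runs the actual DELETEs serially.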

I believe this is the perfect situation for it. I'm actually doing a 2-hour lab on this at the upcoming Collaborate '06 conference this year.
dkrnic (Author) commented:
What happens with the lock if an instance of deleter that holds it fails
before it releases it?

The queuing sounds good, but it looks like the effect may be deferred.
The deletion is usually just the foreplay for a new instantiation of the
same job, so if the deletes are only queued up it may create some
confusion, right?
RCorfman commented:
Yes, if it can't be asynchronous, then that is a problem.

To make sure the lock is released even when something fails, you can wrap all the logic and add a WHEN OTHERS handler.

create or replace procedure deleter as
 lock_handle   varchar2(128);
 dummy         pls_integer;
begin
 dbms_lock.allocate_unique('deleter', lock_handle);
 dummy := dbms_lock.request(lock_handle);
 --- do other processing here. (may include other exception handling blocks)
 dummy := dbms_lock.release(lock_handle); -- last statement in the normal outer block
exception
 when others then -- release the lock on an exception too
  dummy := dbms_lock.release(lock_handle); -- only reached on otherwise unhandled exceptions
  raise;  -- re-raise the exception for the caller; we just wanted to release the lock
end;
/
RCorfman commented:
The effect of this is that if 2 people launch the 2-minute delete process, one of them will wait two minutes for their delete, and the next will wait 4 minutes: two for the "other person's" delete, then two for their own.

The lock works for synchronous transactions; AQ only works if you can live with asynchronous processing. Sometimes that isn't possible, in which case it isn't a good solution.
dkrnic (Author) commented:
Mr. RCorfman,

you've just earned your prize. But before I close the thread,
could you go into some more detail about the problem?

I would expect a relational database to take care of such things
automatically. I mean, as each of these tables gets some rows deleted,
those rows should be locked somehow until either a COMMIT or a
ROLLBACK is executed. So I would expect that if 2 users start deleting
2 different jobs, they should either be able to do it independently, since
they are deleting 2 mutually exclusive sets of rows, or, if for some
reason the whole table gets locked, the latter one should patiently wait
for his turn to do the deletions. There is no NOWAIT clause anywhere
in these procedures, so why do we get the error ORA-08177 at all?

I've been doing some tests of my own on an isolated test instance.
Instead of just deleting the rows in the tables, I first queried whether there
are any matching records and went on to the deletion only if there are
(some tables might, for some jobs, contain no corresponding rows).
I also preceded the DELETE statement with a SELECT ... FOR UPDATE
query without the OF clause, something like:

   select job from aTable where job=jobno for update;

With this setup I executed hundreds of deletions (1,300 to be precise)
from 2 separate processes, where one was deleting the odd jobnos
and the other the even jobnos, without ever running into a
deadlock or ORA-08177.
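
Each test iteration essentially did the following (a simplified PL/SQL sketch; ATABLE, JOB and the job number 4711 are placeholders for the real names and values):

declare
  jobno  number := 4711;   -- job to delete (placeholder value)
  cursor c_lock is
    select job from atable where job = jobno for update;  -- no OF clause
  cnt    pls_integer;
begin
  select count(*) into cnt from atable where job = jobno;
  if cnt > 0 then
    open c_lock;   -- FOR UPDATE locks all matching rows when the cursor is opened
    close c_lock;
    delete from atable where job = jobno;
  end if;
  commit;
end;
/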

If you can't answer these questions with confidence let me know so I
can close this thread and open a new one. I'll have to reformulate the
first question again for someone who can throw some light on it.
RCorfman commented:
I can answer with tentative cautiousness... I would never claim that I understand everything related to this, hopefully I have my hands around a significant fraction though...

I completely agree that this should work without error. I also agree that you shouldn't get deadlocks. It is my understanding, though (and again, I'm a little fuzzy on this), that you can get some deadlock contention when there is insufficient ITL space in the blocks (I'm not sure exactly what that is without more research, I have just heard of it), but I thought that was primarily around inserts and contention on blocks not having enough free space. I wouldn't have thought deletes had an effect. To my knowledge, the locks in Oracle are row locks, so even without the 'for update' select statements, the locking should happen correctly.

To my knowledge Oracle does not take table locks on deletes. And all lock requests should be blocking (waiting) requests unless something changes the session or an individual statement is run with NOWAIT.
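
For illustration, a lock request only fails immediately instead of waiting when a statement explicitly asks for that (a hypothetical example; ATABLE, JOB and 4711 are placeholders):

-- blocks until the session holding the row locks commits or rolls back
select job from atable where job = 4711 for update;

-- raises ORA-00054 immediately if another session holds the locks
select job from atable where job = 4711 for update nowait;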

I just did a little research.... from the manual....
When a transaction runs in serializable mode, any attempt to change data that was changed by another transaction since the beginning of the serializable transaction causes an error:

ORA-08177: Can't serialize access for this transaction.
When you get this error, roll back the current transaction and execute it again. The transaction gets a new transaction snapshot, and the operation is likely to succeed.

To minimize the performance overhead of rolling back transactions and executing them again, try to put DML statements that might conflict with other concurrent transactions near the beginning of your transaction.
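
Following that advice in code would look roughly like this (a sketch only; -8177 is the Oracle error number behind ORA-08177, and deleter is the procedure from the example above):

declare
  e_cannot_serialize exception;
  pragma exception_init(e_cannot_serialize, -8177);
begin
  for attempt in 1 .. 3 loop        -- retry a few times before giving up
    begin
      deleter;                      -- the delete procedure
      exit;                         -- success: stop retrying
    exception
      when e_cannot_serialize then
        rollback;                   -- per the manual: roll back and try again
        if attempt = 3 then
          raise;                    -- give up after the last attempt
        end if;
    end;
  end loop;
end;
/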

I suspect some other process may be attempting to update rows that you are trying to delete at the same time... meaning you started your transaction to delete, then somebody updated a row that you are planning to delete, and now the deletion fails. I don't think serializing your delete process is going to fix this; I think it has to do with the 'transaction isolation level'...

From the manual:
Choosing an Isolation Level for Transactions
Choose an isolation level that is appropriate to the specific application and workload. You might choose different isolation levels for different transactions. The choice depends on performance and consistency needs, and consideration of application coding requirements.

For environments with many concurrent users rapidly submitting transactions, you must assess transaction performance against the expected transaction arrival rate and response time demands, and choose an isolation level that provides the required degree of consistency while performing well. Frequently, for high performance environments, you must trade-off between consistency and concurrency (transaction throughput).

Both Oracle isolation modes provide high levels of consistency and concurrency (and performance) through the combination of row-level locking and Oracle's multi-version concurrency control system. Because readers and writers do not block one another in Oracle, while queries still see consistent data, both READ COMMITTED and SERIALIZABLE isolation provide a high level of concurrency for high performance, without the need for reading uncommitted ("dirty") data.

READ COMMITTED isolation can provide considerably more concurrency with a somewhat increased risk of inconsistent results (due to phantoms and non-repeatable reads) for some transactions. The SERIALIZABLE isolation level provides somewhat more consistency by protecting against phantoms and non-repeatable reads, and may be important where a read/write transaction executes a query more than once. However, SERIALIZABLE mode requires applications to check for the "can't serialize access" error, and can significantly reduce throughput in an environment with many concurrent transactions accessing the same data for update. Application logic that checks database consistency must take into account the fact that reads do not block writes in either mode.


I think you need to look at the Application Developer's Guide section on SQL statement processing, which is where I found the above. In your case, you may need to use a different transaction isolation level...
http://download-west.oracle.com/docs/cd/B10501_01/appdev.920/a96590/adg08sql.htm#ADFNS007
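
If SERIALIZABLE isolation is not actually required for this job, the isolation level can be changed per transaction or per session (shown here only as an illustration; READ COMMITTED is Oracle's default):

-- for one transaction (must be the first statement of the transaction)
set transaction isolation level read committed;

-- or for the whole session
alter session set isolation_level = read committed;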

I should have searched on that error to begin with...
dkrnic (Author) commented:
Wow!

Thank you very much.

I'll close the thread now, but can you post to my mailbox which '06 conference
you're supposed to give that 2-hour lab exercise at?
dkrnic (Author) commented:
How do I close this thread?

There's only "Split Points" on my menu and there's nothing to split.
RCorfman commented:
Collaborate '06 in Tennessee in two weeks.  http://www.ioug.org ... http://www.ioug.org/collaborate06/index.cfm