Link to home
Start Free TrialLog in
Avatar of tmike2005
tmike2005

asked on

Shutdown immediate hangs when used in backup scripts

Occasionally our weekly cold backup script fails due to receiving a 'shutdown: waiting for active calls to complete' message in the database's alert log (this situation occurs across a number of databases at various times, never any one specific database). I've researched this issue here, at Metalink and around the internet in general but have not seen a definative answer to the problem.  Ask Tom website gives as resolution of "do not perform cold backups" which does not seem like an appropriate answer. I've heard that an approach to this problem might be to add code into the existing backup script to locate and 'kill' any client connections to the database(s), specifically looking for SID processes that contain (LOCAL=NO). Is there a better approach to this problem (other than scripting) and if not, is there anyone here that might have a script (or portions of their backup script) that would enable such process checking/termination as part of their current weekly cold backup script?  Our platform is Solaris 2.6/2.8 and Oracle version 8.1.7.4/9.2.0.4-6 and 10.1.0.4 (note: no such shutdown issues have been noted in 10.1.0.4). This problem has been a long time issue, it would be great to finally work around it (or even better, resolve it). Thanks for your assistance.
Avatar of jwittenm
jwittenm
Flag of United States of America image

In the past, on Oracle 8i and 9i systems, we have either written the types of scripts that you are talking about, or done the following:
Shutdown abort
startup restrict
shutdown immediate
I think that "Shutdown abort" can be very dangerous because any current transaction are killed.It's much better to identify currently running user sessions and kill them when possible if done manually.
For authomatic handling the best option is to define a default profile and limit its connect and definetly idle time.
Also it's possible to perform a forcible log switch before a shutdown to resolve any archiving issues if exist.
Hope it helps
Avatar of schwertner
Use
SHUTDOWN IMMEDIATE
You can even use dbshut.sql script via dbora, passing the needed parameter there.

SHUTDOWN IMMEDIATE works. Obviously you use SHUTDOWN and this is the cause to wait for closing of opened connections.

SHUTDOWN ABORT is not recommended for normal use.
Shutdown immediate waits for current transactions to commit or rollback.
Shutdown abort is no more dangerous than a hot backup.  People used to be afraid of those, too.
Avatar of tmike2005
tmike2005

ASKER

Appreciate the responses so far -- I was hoping this issue had been a problem for someone here at one point or another and if so, what was their approach to the problem (i.e., was there an actual resolution whether it was a script written to seek out and 'kill', a specific setting in sqlnet, etc.).

It's appearing that the option might be to simply perform hot backups for our weekly saves (is what Oracle's stand is beyond usnig shutdown abort, startup restrict, shutdown in sequence. This is fine but I'd personally rather have at least one cold backup each week to have a fall back on any particular db that was in a solid state of consistency versus potentially loosing transactions at the point of the shutdown abort operation. Log switching is fine to lessen the potential data loss but if there are active transactions going on at the time of the actual shutdown abort, then what?  

Oleggold -- your suggestion regarding defining a default profile and managing connection timeouts would help if the db server was strictly client-server architecture. However, many of the databases (nearly all) are accessed through web applications. As you might suspect, web applications are not all that easy to control when it comes to connection management.

I am going to apply sqlnet_timeout.expire=1 to sqlnet.ora  to see if that will help take out any potential dead connection. Anybody else have a comment regarding this potential angle of attack to the problem?

tmike
ASKER CERTIFIED SOLUTION
Avatar of jwittenm
jwittenm
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Whew.  with all that said, they are just suggestions.  i have no vested interest :)
you have to decide what is right for you.
It's hard to debate the issue of only using hot backups versus cold -- jwittenm's point (along with his explanation) is well taken. However, I am in a situation (work) that requires once a week cold backups as part of corporate procedure. There are many servers all running Solaris 2.6, some with 2.8 and all with 8.1.7.4.  The hung database issues always seem to appear on one specific server which makes one wonder what the differences between the hung server versus all the others. A close inspection of each shows they are setup identical other than small variations in shared memory settings to accomodate the db count, SGA sizes, etc. The idea of trying to terminate dead connections as I mentioned earlier is simply an untried option. I implemented the sqlnet_expire.time to see how or of that would have any impact on the situation when bringing the databases down each Sunday evening for the cold backups. Although too early to say, this past weekend all databases on the suspect  server shutdown nicely. More investigation has to occur regarding what was happening at the time but it was the first time these databases shutdown cleanly in many weeks.

It seems to me that other than a long batch job happening at the time of the shutdown immediate request, that the database should come down fairly quickly (as has been the normal experience) versus taking hours to shutdown (and in our case, going beyond the time when the databases need to be back up some 4 plus hours later on early Monday morning). And the hung condition occurs randomly amongst the databases on the server -- and most of them are either not active at all during the weekend or minimal activity.

Unless there is somebody else out here that has a different opinion on how best to handle this problem, it seems that the only answer is to avoid cold backups (on at least this suspect server) and opt for hot backups. Corporate needs of a cold backup aside, if the only alternative is to wait each week to see if there is a database still in a state of shutdown which causes more pain than it's worth (loss of backing up the database to off-line storage, customers not being able to access their applications which use the database, etc.) then all I can do is to present those alternatives. Obviously I was hoping to find a work-around to handle the cold backups and avoid the possible issue with the database taking forever (so it seems) to shutdown (without abort).

Anyone care to comment further?  I appreciate any and all feedback.

 
Oracle software is as every other software. It is not perfect.
If you are able and have resorces to do this, reinstall (Export, reinstall, Import) the database.
You understand - sometimes we clean our houses.
This helps, especially if you upgrade the software version.