Asked by Julie Kurpa

Oracle Instance starts after server boot but then aborts and restarts

We have five databases running on an AIX 5.3 server.  Let's call it our primary server.
The server is booted every week (don't ask why...).

The users have complained off and on about the database not being available when they expect it to be. They receive an ORA-03113.

Looking at the alert.log of the instances, it shows that the instances shut down normally upon server shutdown and then restart automatically after server restart. However, it then seems that the database instances (all five of them) shut down (abort) and restart again.

This server is the primary in a Data Guard environment for four of the databases. This problem occurs for all five, so I don't think it's related to Data Guard. The standby server is running four instances and, although it is set up exactly like the primary, it does not encounter this issue after a server boot.

In the past, the only time I've seen the alert.log show a 'shutdown (abort)' like that is if the 'dbstart' is issued against an already running instance.  

We have an entry in the /etc/oratab to start the listener and then execute the 'dbstart' script.  
Is it possible that there is a setting somewhere that is re-issuing the 'dbstart' script? We can't figure it out. It seems this has been going on for some time, and we are only now realizing it because the users complained.

Have compared /etc/inittabs and 'dbstart' scripts in both servers and they are identical.  

Of course, we are assuming that I'm reading the alert log correctly. I will upload a snippet of our alert log during the boot process.

ALERT-FOR-EE.log
Julie Kurpa (ASKER)

To assist you:   the 2nd shutdown (abort) occurs at 00:27:58
woolmilkporc
>> Have compared /etc/inittabs  <<

Did you compare /etc/rc.d/rc2.d and the possible link targets in /etc/rc.d/init.d as well?

wmp
Oh, that rings a bell. I forgot about the rc2.d. Indeed there are links in the /etc/rc.d/rc2.d directory that point to the /etc/dbora script.
I compared the /etc/dbora scripts between the primary and standby servers and they are identical.  
lrwxrwxrwx   1 root     system           10 Sep 21 2009  K01dbora -> /etc/dbora
lrwxrwxrwx   1 root     system           10 Sep 21 2009  S99dbora -> /etc/dbora

I didn't see any entries in the /etc/rc.d/init.d for either server. Would you like me to upload the /etc/dbora?
Oh.  I just realized that the standby instances do the exact same thing.  I didn't realize it because they had done the shutdown at a different time.  I should wear my glasses.  

So it seems the issue is occurring on both servers with all databases. Likely my restart scripts are set up wrong. I will upload the /etc/dbora.

S99dbora-ersrv.txt
Seems that this script will start the DB twice - once via remote shell (but to the local host) and once by local commands.

Is this what you desire?

With the variables resolved, the script first does:

rsh local_hostname /etc/dbora start ORA_DB

then

/u01/app/oracle/product/10.2.0/db_1/bin/dbstart /u01/app/oracle/product/10.2.0/db_1

The first step leads to /etc/dbora running a second time, this time only with

/u01/app/oracle/product/10.2.0/db_1/bin/dbstart /u01/app/oracle/product/10.2.0/db_1

(the first step will not run twice because of $2 being "ORA_DB" now)

I think the script is meant to start Oracle on a second host, so setting HOST to `hostname` (i.e. the local host) seems wrong.
I'm not too great at scripting.  Can you tell me exactly what I can do to have the database start only once on the local server?
Well, sorry,

seems it's now up to me to clean the glasses.

I overlooked the "exit" statements following "rsh" or "remsh", so the script will indeed run twice, but it will start the DB only once.

So this script doesn't cause your issue.

Did you change the $ORACLE_HOME/bin/dbstart script, and if so, could you upload it?

wmp
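
To illustrate why the script runs twice but starts the databases only once, here is a minimal sketch of the re-invoke-then-exit pattern described above. It is not the actual /etc/dbora (that file was attached, not quoted); `dbora_sketch`, `run_remote`, and `fake_dbstart` are hypothetical names, and the rsh call is replaced by a plain function call so the sketch runs anywhere.

```shell
#!/bin/sh
# Sketch only: models the control flow, not the real /etc/dbora.

DBSTART_CALLS=0

# Hypothetical stand-in for: rsh local_hostname /etc/dbora start ORA_DB
run_remote() {
    dbora_sketch "$@"
}

# Hypothetical stand-in for $ORACLE_HOME/bin/dbstart
fake_dbstart() {
    DBSTART_CALLS=$((DBSTART_CALLS + 1))
    echo "dbstart called (call #$DBSTART_CALLS)"
}

dbora_sketch() {
    if [ "$2" != "ORA_DB" ]; then
        # First pass: hand off to the second pass, then stop.
        run_remote "$1" ORA_DB
        return 0    # plays the role of "exit" after rsh/remsh
    fi
    # Second pass only ($2 is ORA_DB): start the databases.
    fake_dbstart
}

dbora_sketch start
echo "total dbstart calls: $DBSTART_CALLS"
```

Without the `return 0` after the hand-off, the first pass would fall through and call dbstart a second time, which is the behaviour first suspected above.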
Thanks. :)
Uploading the dbstart script as it has been executing on my server.  Haven't changed anything.    

Below are my /etc/inittab entries for the listener and dbstart:
oralsnr:2:once:/bin/su - oracle -c /u01/app/oracle/product/10.2.0/db_1/bin/lsnrctl start
oracle:2:wait:/bin/su - oracle -c /u01/app/oracle/product/10.2.0/db_1/bin/dbstart

 dbstart-ersrv.txt
ASKER CERTIFIED SOLUTION
woolmilkporc

This solution is only available to members.
Yup... you are right. Too many "*tab" files. Ugh.
I'll remove the S99dbora from the rc2.d on my standby server and boot it. Will let you know the outcome.
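
For anyone checking their own server, the two startup paths mentioned in this thread (the dbstart line in /etc/inittab and the S99dbora link in rc2.d) can be listed side by side. This is a rough sketch, assuming the rc links have "dbora" in their names as they do here; `list_db_starters` is a hypothetical helper, not an AIX or Oracle tool.

```shell
#!/bin/sh
# list_db_starters INITTAB RCDIR
# Prints inittab lines that mention dbstart, and S* entries in RCDIR
# whose names contain "dbora" (name-based match is an assumption).
list_db_starters() {
    inittab=$1
    rcdir=$2
    if [ -r "$inittab" ]; then
        grep dbstart "$inittab" | sed 's/^/inittab: /'
    fi
    for f in "$rcdir"/S*; do
        # Skip the literal pattern when nothing matches the glob.
        [ -e "$f" ] || [ -L "$f" ] || continue
        case $f in
            *dbora*) echo "rc: $f" ;;
        esac
    done
}

list_db_starters /etc/inittab /etc/rc.d/rc2.d
```

If both an "inittab:" line and an "rc:" line show up, the databases have two boot-time starters.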
All done.  Something is amiss.
With the S99dbora removed, only the 1st instance in the /etc/oratab file started. I have four instances altogether.  The other three never started.  Looks like it only loops once through the /etc/oratab.

What do you think?
Here's my /etc/oratab (with the SID names changed).  Only INST1 started.

inst1:/u01/app/oracle/product/10.2.0/db_1:Y
inst2:/u01/app/oracle/product/10.2.0/db_1:Y
inst3:/u01/app/oracle/product/10.2.0/db_1:Y
inst4:/u01/app/oracle/product/10.2.0/db_1:Y
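
As a quick cross-check of which instances dbstart is expected to bring up, the entries flagged Y in the oratab can be listed. A small sketch, assuming the usual SID:ORACLE_HOME:Y|N layout shown above; `list_autostart_sids` is a hypothetical helper name.

```shell
#!/bin/sh
# list_autostart_sids ORATAB
# Prints the SID of every non-comment entry whose third field is Y.
list_autostart_sids() {
    awk -F: '!/^#/ && $3 == "Y" { print $1 }' "$1"
}

# Only run against the real file if it exists on this host.
if [ -r /etc/oratab ]; then
    list_autostart_sids /etc/oratab
fi
```

With the four entries above, this would list inst1 through inst4, so all four are candidates for automatic startup.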
oh...wait.  It just appears to be very, very slow.  Three of the databases are up.  Waiting for the fourth.
That worked beautifully!  I think the delay was it trying to start the dbconsoles for each database.   DBConsole doesn't work on the standby server since the instances only run in a MOUNT state.  However the configuration is there so it's trying to start.  Takes forever to fail.   I will remove the configuration to see if that helps.

In the meantime, all four databases are up and there was no 2nd attempt to start recorded in the alert log. Hooray!

For my clarification:
Regardless of what is in the /etc/inittab, if there's something in the rc2.d directory, it will run it after the server is finished booting?  Is that correct?
Yes, correct,

because inittab contains an entry for each runlevel that starts an rc script, which in turn scans the matching rcN.d directory (N being the runlevel) for Sxxxxx entries and runs each of them with the parameter "start".
Look at /etc/inittab and you will find these entries easily!

Similarly, at shutdown the Kxxxx entries are run with parameter "stop".

wmp
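
The mechanism described above can be sketched roughly as follows. This is not the actual rc script shipped with AIX, just an illustration of the idea; `run_rc_dir` is a hypothetical function name.

```shell
#!/bin/sh
# run_rc_dir DIR start|stop
# Runs the S* entries of DIR with "start", or the K* entries with "stop",
# in the order the shell expands the glob (sorted by name).
run_rc_dir() {
    dir=$1
    action=$2
    case $action in
        start) prefix=S ;;
        stop)  prefix=K ;;
        *) echo "usage: run_rc_dir DIR start|stop" >&2; return 2 ;;
    esac
    for script in "$dir"/"$prefix"*; do
        # Skips non-executables and the unmatched glob pattern.
        [ -x "$script" ] || continue
        "$script" "$action"
    done
}
```

Trying it against a scratch directory with a dummy script linked as S99dbora and K01dbora shows the same script being called once with "start" and once with "stop", which is why one /etc/dbora file can serve both links.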
Awesome!