Julie Kurpa
asked on
Oracle Instance starts after server boot but then aborts and restarts
We have five databases running on an AIX 5.3 server. Let's call it our primary server.
The server is booted every week (don't ask why...).
The users have complained off and on about the database not being available when they expect it to be. They receive an ORA-3113.
In looking at the alert.log of the instances, it shows that the instances shut down normally upon server shutdown and then restart automatically after server restart. However, it then seems that the database instances (all five of them) shut down (abort) and then restart again.
This server is a primary server in a dataguard environment for four of the databases. This problem occurs for all five so I don't think it's related to dataguard. The standby database is running four instances and although it is set up exactly as the primary, it does not encounter this issue after a server boot.
In the past, the only time I've seen the alert.log show a 'shutdown (abort)' like that is if the 'dbstart' is issued against an already running instance.
We have an entry in the /etc/oratab to start the listener and then execute the 'dbstart' script.
Is it possible that there is a setting somewhere that is re-issuing the 'dbstart' script? We can't figure it out. It seems this has been going on for some time now and we are only now realizing it because the users complained.
Have compared /etc/inittabs and 'dbstart' scripts in both servers and they are identical.
Of course, we are assuming that I'm reading the alert log correctly. I will upload a snippet of our alert log during the boot process. ALERT-FOR-EE.log
>> Have compared /etc/inittabs <<
Did you compare /etc/rc.d/rc2.d and the possible link targets in /etc/rc.d/init.d as well?
wmp
ASKER
Oh... that rings a bell. I forgot about the rc2.d. Indeed there are entries in the /etc/rc.d/rc2.d directory that are links to the /etc/dbora script.
I compared the /etc/dbora scripts between the primary and standby servers and they are identical.
lrwxrwxrwx 1 root system 10 Sep 21 2009 K01dbora -> /etc/dbora
lrwxrwxrwx 1 root system 10 Sep 21 2009 S99dbora -> /etc/dbora
I didn't see any entries in the /etc/rc.d/init.d for either server. Would you like me to upload the /etc/dbora?
ASKER
Oh. I just realized that the standby instances do the exact same thing. I didn't realize it because they had done the shutdown at a different time. I should wear my glasses.
So it seems the issue is occurring on both servers with all databases. Likely my restart scripts are set up wrong. I will upload the /etc/dbora S99dbora-ersrv.txt
Seems that this script will start the DB twice - once via remote shell (but to the local host) and once by local commands.
Is this what you desire?
With the variables resolved, the script first does:
rsh local_hostname /etc/dbora start ORA_DB
then
/u01/app/oracle/product/10.2.0/db_1/bin/dbstart /u01/app/oracle/product/10.2.0/db_1
The first step leads to /etc/dbora running a second time, this time only with
/u01/app/oracle/product/10.2.0/db_1/bin/dbstart /u01/app/oracle/product/10.2.0/db_1
(the first step will not run twice because $2 is now "ORA_DB")
I think the script is meant to start Oracle on a second host, so setting HOST to `hostname` (i.e. the local host) seems wrong.
ASKER
I'm not too great at scripting. Can you tell me exactly what I can do to have the database start only once on the local server?
Well, sorry,
seems it's now up to me to clean the glasses.
I overlooked the "exit" statements following "rsh" or "remsh", so the script will indeed run twice, but it will start the DB only once.
So this script doesn't cause your issue.
Did you change the $ORACLE_HOME/bin/dbstart script, and if so, could you upload it?
wmp
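The control flow wmp describes (a first invocation that re-invokes the script with "ORA_DB" as the second argument and then exits, and a second invocation that actually calls dbstart) can be sketched roughly as below. This is a hypothetical reduction, not the uploaded /etc/dbora: the function names `dbora` and `remote_run` are made up for illustration, the rsh call is replaced by a local re-entry, and the dbstart command is only echoed so the sketch is safe to run anywhere.

```shell
#!/bin/sh
# Illustrative only: mimics the two-pass structure of a dbora-style script.
ORA_HOME=/u01/app/oracle/product/10.2.0/db_1

# Stand-in for "rsh $HOST /etc/dbora start ORA_DB" (assumption: no real rsh here).
remote_run() {
    dbora "$1" ORA_DB
}

dbora() {
    action=$1
    marker=$2
    if [ "$marker" != "ORA_DB" ]; then
        # First pass: hand off to the second invocation, then stop.
        remote_run "$action"
        return 0    # plays the role of the "exit" that follows rsh/remsh
    fi
    # Second pass: the only place where the start command is issued.
    echo "$ORA_HOME/bin/dbstart $ORA_HOME"
}
```

Calling `dbora start` enters the function twice but prints the dbstart command once, which matches the conclusion that this script by itself does not start the database twice.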
ASKER
Thanks. :)
Uploading the dbstart script as it has been executing on my server. Haven't changed anything.
Below are my /etc/inittab entries for the listener and dbstart:
oralsnr:2:once:/bin/su - oracle -c /u01/app/oracle/product/10.2.0/db_1/bin/lsnrctl start
oracle:2:wait:/bin/su - oracle -c /u01/app/oracle/product/10.2.0/db_1/bin/dbstart
dbstart-ersrv.txt
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
Yup... you are right. Too many "*tab" files. Ugh.
I'll remove the S99dbora from the rc2.d on my standby server and boot it. Will let you know the outcome.
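Before rebooting, it can help to list every boot-time path that would start the databases. The sketch below is one rough way to do that, under stated assumptions: the function name `boot_start_paths` is hypothetical, it only looks for the string "dbstart" in an inittab-style file and for S* links whose name ends in "dbora", and it treats a leading ':' as the marker for a commented-out inittab entry.

```shell
#!/bin/sh
# boot_start_paths INITTAB RC_DIR
# Prints one line per place that would start Oracle at boot.
boot_start_paths() {
    inittab=$1
    rc_dir=$2
    # inittab entries that mention dbstart and are not commented out
    grep -v '^:' "$inittab" | grep 'dbstart' | sed 's/^/inittab: /'
    # S* links in the rc directory whose name ends in dbora
    for link in "$rc_dir"/S*dbora; do
        [ -e "$link" ] && echo "rc: $link"
    done
    return 0
}
```

On the servers in this thread the call would be `boot_start_paths /etc/inittab /etc/rc.d/rc2.d`. More than one line of output means the start is issued more than once per boot.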
ASKER
All done. Something is amiss.
With the S99dbora removed, only the 1st instance in the /etc/oratab file started. I have four instances altogether. The other three never started. Looks like it only loops once through the /etc/oratab.
What do you think?
ASKER
Here's my /etc/oratab (with the SID names changed). Only INST1 started.
inst1:/u01/app/oracle/product/10.2.0/db_1:Y
inst2:/u01/app/oracle/product/10.2.0/db_1:Y
inst3:/u01/app/oracle/product/10.2.0/db_1:Y
inst4:/u01/app/oracle/product/10.2.0/db_1:Y
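To check which instances an oratab like the one above marks for automatic startup, a small helper along these lines may be useful. It is a sketch, not the logic of the real dbstart script: the function name `list_autostart_sids` is hypothetical, and it assumes the usual SID:ORACLE_HOME:Y|N layout with '#' comment lines.

```shell
#!/bin/sh
# list_autostart_sids ORATAB_FILE
# Prints the SID of every entry whose third field is Y.
list_autostart_sids() {
    # Drop comment lines, then keep entries flagged Y in the third field.
    grep -v '^#' "$1" | awk -F: 'NF >= 3 && $3 == "Y" { print $1 }'
}
```

Running `list_autostart_sids /etc/oratab` against the file above would list all four SIDs, so an instance that is slow to appear is more likely a startup delay than a missing entry.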
ASKER
oh...wait. It just appears to be very, very slow. Three of the databases are up. Waiting for the fourth.
ASKER
That worked beautifully! I think the delay was it trying to start the dbconsoles for each database. DBConsole doesn't work on the standby server since the instances only run in a MOUNT state. However the configuration is there so it's trying to start. Takes forever to fail. I will remove the configuration to see if that helps.
In the meantime, all four databases are up and there was no 2nd attempt to start recorded in the alert log. Hooray!
For my clarification:
Regardless of what is in the /etc/inittab, if there's something in the rc2.d directory, it will run it after the server is finished booting? Is that correct?
Yes, correct,
because there are entries in inittab for each runlevel which start an rc script, which in turn scans the rcn.d directory (n being the runlevel) for Sxxxxx entries and runs these entries with parameter "start".
Look at /etc/inittab, you will find these entries easily!
Similarly, at shutdown the Kxxxx entries are run with parameter "stop".
wmp
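The scanning behaviour described above can be sketched as a small loop. This is a simplified illustration, not the actual rc script shipped with the operating system: the function name `run_rc_dir` is hypothetical, and the real script may apply further checks.

```shell
#!/bin/sh
# run_rc_dir DIR MODE
# MODE "start" runs the S* entries, MODE "stop" runs the K* entries,
# each with MODE as its single argument, in glob (lexical) order.
run_rc_dir() {
    dir=$1
    mode=$2
    case $mode in
        start) prefix=S ;;
        stop)  prefix=K ;;
        *) echo "usage: run_rc_dir DIR start|stop" >&2; return 1 ;;
    esac
    for script in "$dir"/"$prefix"*; do
        # Skip the unexpanded pattern and anything that is not executable.
        [ -x "$script" ] || continue
        "$script" "$mode"
    done
}
```

With S99dbora and K01dbora both linking to /etc/dbora, this is why the same script is called with "start" at boot and "stop" at shutdown, independently of any dbstart entry in /etc/inittab.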
ASKER
Awesome!