
Oracle Instance starts after server boot but then aborts and restarts

We have five databases running on an AIX 5.3 server. Let's call it our primary server.
The server is booted every week (don't ask why...).

The users have complained off and on about the database not being available when they expect it to be. They receive an ORA-03113 error.

Looking at the alert.log of the instances, it shows that the instances shut down normally upon server shutdown and then restart automatically after server restart. However, it then seems that the database instances (all five of them) shut down again (abort) and then restart a second time.

This server is a primary server in a dataguard environment for four of the databases. This problem occurs for all five so I don't think it's related to dataguard. The standby database is running four instances and although it is set up exactly as the primary, it does not encounter this issue after a server boot.

In the past, the only time I've seen the alert.log show a 'shutdown (abort)' like that is if the 'dbstart' is issued against an already running instance.

We have an entry in the /etc/oratab to start the listener and then execute the 'dbstart' script.
Is it possible that there is a setting somewhere that is re-issuing the 'dbstart' script? We can't figure it out. It seems this has been going on for some time now and we are only now realizing it because the users complained.

Have compared /etc/inittabs and 'dbstart' scripts in both servers and they are identical.

Of course, we are assuming that I'm reading the alert log correctly. I will upload a snippet of our alert log during the boot process. ALERT-FOR-EE.log

Julie Kurpa

ASKER

To assist you: the 2nd shutdown (abort) occurs at 00:27:58
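
A quick way to see the start/abort/start pattern without reading the whole file is to pull only the instance start and shutdown markers out of the alert log. The following is a minimal sketch, not a definitive tool: the function names are made up for illustration, and the marker strings ("Starting ORACLE instance", "Shutting down instance") are assumptions about the alert log wording that may differ between Oracle versions.

```shell
# List instance start and shutdown markers from an alert log, with line numbers.
# The two marker strings are assumptions about the alert log wording.
alert_events() {
    grep -n -E 'Starting ORACLE instance|Shutting down instance' "$1"
}

# Count how many times the log records an abort-style shutdown.
count_aborts() {
    grep -c 'Shutting down instance (abort)' "$1"
}
```

Calling `alert_events` with the path to an instance's alert log should show whether a second start follows shortly after the first one at boot time.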

woolmilkporc

>> Have compared /etc/inittabs <<

Did you compare /etc/rc.d/rc2.d and the possible link targets in /etc/rc.d/init.d as well?

wmp

Julie Kurpa

ASKER

Oh, that rings a bell. I forgot about the rc2.d. Indeed, there are links in the /etc/rc.d/rc2.d directory that point to the /etc/dbora script.
I compared the /etc/dbora scripts between the primary and standby servers and they are identical.
lrwxrwxrwx 1 root system 10 Sep 21 2009 K01dbora -> /etc/dbora
lrwxrwxrwx 1 root system 10 Sep 21 2009 S99dbora -> /etc/dbora

I didn't see any entries in /etc/rc.d/init.d on either server. Would you like me to upload /etc/dbora?

Julie Kurpa

ASKER

Oh. I just realized that the standby instances do the exact same thing. I didn't realize it because they had done the shutdown at a different time. I should wear my glasses.

So it seems the issue is occurring on both servers with all databases. Likely my restart scripts are set up wrong. I will upload the /etc/dbora script: S99dbora-ersrv.txt

woolmilkporc

Seems that this script will start the DB twice - once via remote shell (but to the local host) and once by local commands.

Is this what you desire?

With the variables resolved, the script first does:

rsh local_hostname /etc/dbora start ORA_DB

then

/u01/app/oracle/product/10.2.0/db_1/bin/dbstart /u01/app/oracle/product/10.2.0/db_1

The first step leads to /etc/dbora running a second time, this time only with

/u01/app/oracle/product/10.2.0/db_1/bin/dbstart /u01/app/oracle/product/10.2.0/db_1

(the first step will not run twice because of $2 being "ORA_DB" now)

I think the script is meant to start Oracle on a second host, so setting HOST to `hostname` (i.e. the local host) seems wrong.

Julie Kurpa

ASKER

I'm not too great at scripting. Can you tell me exactly what I can do to have the database start only once on the local server?

woolmilkporc

Well, sorry,

seems it's now up to me to clean the glasses.

I overlooked the "exit" statements following "rsh" or "remsh", so the script will indeed run twice, but it will start the DB only once.

So this script doesn't cause your issue.

Did you change the $ORACLE_HOME/bin/dbstart script, and if so, could you upload it?

wmp

Julie Kurpa

ASKER

Thanks. :)
Uploading the dbstart script as it has been executing on my server. Haven't changed anything.

Below is my /etc/inittab entries for the listener and dbstart:
oralsnr:2:once:/bin/su - oracle -c /u01/app/oracle/product/10.2.0/db_1/bin/lsnrctl start
oracle:2:wait:/bin/su - oracle -c /u01/app/oracle/product/10.2.0/db_1/bin/dbstart

dbstart-ersrv.txt
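
Since the thread has now turned up two boot-time mechanisms (the /etc/inittab entries and the S99dbora link in rc2.d), one way to check for overlapping start paths is to list every boot file that mentions dbstart. This is only a sketch under assumptions: `find_db_starters` is a hypothetical helper name, and a plain text match also reports commented-out lines, so each hit still needs a look by eye.

```shell
# Print each given file that mentions dbstart anywhere in its text.
# Symbolic links are followed, so an rc2.d link reports on its target script.
# Note: a commented-out mention still counts as a match.
find_db_starters() {
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip unmatched globs, directories, missing files
        if grep -q 'dbstart' "$f"; then
            echo "$f"
        fi
    done
}
```

For example, `find_db_starters /etc/inittab /etc/rc.d/rc2.d/S*` would print one line per file that references dbstart; more than one line suggests the databases may be started from more than one place.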

ASKER CERTIFIED SOLUTION

woolmilkporc


Julie Kurpa

ASKER

Yup... you are right. Too many "*tab" files. Ugh.
I'll remove the S99dbora from the rc2.d on my standby server and boot it. Will let you know the outcome.

Julie Kurpa

ASKER

All done. Something is amiss.
With the S99dbora removed, only the 1st instance in the /etc/oratab file started. I have four instances altogether. The other three never started. Looks like it only loops once through the /etc/oratab.

What do you think?

Julie Kurpa

ASKER

Here's my /etc/oratab (with the SID names changed). Only INST1 started.

inst1:/u01/app/oracle/product/10.2.0/db_1:Y
inst2:/u01/app/oracle/product/10.2.0/db_1:Y
inst3:/u01/app/oracle/product/10.2.0/db_1:Y
inst4:/u01/app/oracle/product/10.2.0/db_1:Y
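
To see which instances are flagged for automatic startup, the oratab entries can be filtered on their third field. This is a minimal sketch assuming the usual `SID:ORACLE_HOME:flag` layout shown above; `autostart_sids` is a hypothetical helper name, not part of Oracle's scripts.

```shell
# Print the SID of every oratab entry whose third field is Y.
# Comment lines (starting with #) and blank lines are ignored.
autostart_sids() {
    awk -F: '$1 !~ /^#/ && $3 == "Y" { print $1 }' "${1:-/etc/oratab}"
}
```

With the file above, `autostart_sids /etc/oratab` would list all four SIDs, so all four are candidates for startup and a missing one points at slowness or a failure rather than at the oratab flags.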

Julie Kurpa

ASKER

oh...wait. It just appears to be very, very slow. Three of the databases are up. Waiting for the fourth.

Julie Kurpa

ASKER

That worked beautifully! I think the delay was it trying to start the dbconsoles for each database. DBConsole doesn't work on the standby server since the instances only run in a MOUNT state. However the configuration is there so it's trying to start. Takes forever to fail. I will remove the configuration to see if that helps.

In the meantime, all four databases are up and there was no second start attempt recorded in the alert log. Hooray!

For my clarification:
Regardless of what is in the /etc/inittab, if there's something in the rc2.d directory, it will run it after the server is finished booting? Is that correct?

woolmilkporc

Yes, correct,

because inittab contains an entry for each runlevel that starts an rc script. That script in turn scans the matching rcN.d directory (N being the runlevel) for Sxxxxx entries and runs each of them with the parameter "start".
Look at /etc/inittab, you will find these entries easily!

Similarly, at shutdown the Kxxxx entries are run with parameter "stop".

wmp
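
The mechanism described above can be sketched roughly as follows. This is a simplified illustration and not the actual AIX rc script: `run_rc_entries` is a made-up name, and the real script does more (environment setup, logging, and so on).

```shell
# Run every executable entry in an rc directory whose name starts with the
# given prefix (S or K), passing it the given action, in name order.
run_rc_entries() {
    dir=$1
    prefix=$2
    action=$3
    for f in "$dir"/"$prefix"*; do
        [ -x "$f" ] || continue      # also skips the literal pattern when nothing matches
        "$f" "$action"
    done
}
```

In this picture, entering runlevel 2 amounts to something like `run_rc_entries /etc/rc.d/rc2.d S start`, and shutdown to `run_rc_entries /etc/rc.d/rc2.d K stop`. That is why S99dbora ran at boot regardless of the separate dbstart entry in /etc/inittab.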

Julie Kurpa

ASKER

Awesome!