joaotelles asked:

Oracle 10 error - ORA-09817: Write to audit file failed.

Hi,

I was getting an error in Oracle that I think is because my arch partition is full...

oracle@E0100daSimLab01> df -h
/dev/vx/dsk/dba_dg/dba_admin_vol

                        50G    50G     0K   100%    /opt/oracle/admin

oracle@E0100daSimLab01> cd /opt/oracle/admin
oracle@E0100daSimLab01> du -sh *
  36G   DPDB            <------where the arch files are
   0K   flash_recovery_area
   0K   lost+found

I have moved some arch files to another disk, but the space is still not freed... As you can see, there is a difference between the results of du and df that I can't explain...

Any suggestions?

This is the error I get when I try to log in as sysdba.

SQL> connect / as sysdba
ERROR:
ORA-09817: Write to audit file failed.
SVR4 Error: 28: No space left on device
ORA-09945: Unable to initialize the audit trail file
SVR4 Error: 28: No space left on device


OS: Solaris 10
joaotelles (ASKER)

The flash_recovery_area is in the same partition as the archive logs (don't ask me why), so it is full...

The problem is why, even if I move the archive files, the space is not freed..

/dev/vx/dsk/dba_dg/dba_admin_vol

                        50G    50G     0K   100%    /opt/oracle/admin

oracle@E0100daSimLab01> cd /opt/oracle/admin
oracle@E0100daSimLab01> du -sh *
  36G   DPDB            <------where the arch files are
   0K   flash_recovery_area
   0K   lost+found
RMAN> connect target /

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
ORA-01090: shutdown in progress - connection is not permitted
slightwv (Netminder)

Are you shutting down?

What is in your alert log?  Please post the last 30 or 40 lines.
Here is what I found in the logs:

ARC0: Closing local archive destination LOG_ARCHIVE_DEST_1: '/opt/oracle/admin/DPDB/arch/arch_1_128_699740096.arc' (error 19502)
 (DPDB)
ARC0: Failed to archive thread 1 sequence 128 (19502)

Last lines attached.
log-DB-.txt
>>SVR4 Error: 28: No space left on device
>>ARCH: Archival stopped, error occurred. Will continue retrying

Your archive log destination is full.

When you issued the connect target / were you on the database server itself and had the ORACLE_SID set properly?
Yes... I was on the database server, logged in as the oracle user.

The SID is correct: DPDB.

The thing is that the archival destination is not full anymore...

but according to df -h it is...

/dev/vx/dsk/dba_dg/dba_admin_vol
                        50G    50G     0K   100%    /opt/oracle/admin

oracle@E0100daSimLab01> du -sh *
  36G   DPDB            <------where the arch files are
   0K   flash_recovery_area
   0K   lost+found


Also, the last file I have in the arch directory is arch_126, and the alert log mentions 127, 128 and 129, although they are not there...

-rw-r-----   1 oracle   dba      510899712 Jan 17 22:00 arch_1_113_699740096.arc
-rw-r-----   1 oracle   dba      511232000 Jan 25 22:00 arch_1_114_699740096.arc
-rw-r-----   1 oracle   dba      510768128 Feb  3 00:01 arch_1_115_699740096.arc
-rw-r-----   1 oracle   dba      510772224 Feb 13 11:10 arch_1_116_699740096.arc
-rw-r-----   1 oracle   dba      510772736 Feb 20 22:00 arch_1_117_699740096.arc
-rw-r-----   1 oracle   dba      510765568 Feb 28 20:00 arch_1_118_699740096.arc
-rw-r-----   1 oracle   dba      510772224 Mar  7 22:00 arch_1_119_699740096.arc
-rw-r-----   1 oracle   dba      510909952 Mar 15 00:05 arch_1_120_699740096.arc
-rw-r-----   1 oracle   dba      510772736 Mar 22 01:00 arch_1_121_699740096.arc
-rw-r-----   1 oracle   dba      510771712 Mar 29 02:16 arch_1_122_699740096.arc
-rw-r-----   1 oracle   dba      510772736 Apr  5 00:22 arch_1_123_699740096.arc
-rw-r-----   1 oracle   dba      510772736 May 24 09:06 arch_1_125_699740096.arc
-rw-r-----   1 oracle   dba      510772736 May 24 09:06 arch_1_124_699740096.arc
-rw-r-----   1 oracle   dba      510764544 May 24 09:06 arch_1_126_699740096.arc
>> 0K   flash_recovery_area

archived redo goes here not the same place as 'archive' logs.  Different animals.

Unless you have it set up to place archived redo logs in a different folder.
Again:  Just because there is disk space in the folder does not mean Oracle thinks there is.  Space is controlled internally for archive destination size and space used.

You need to clean it up from RMAN.

Since you cannot connect to RMAN, did anyone issue a shutdown?

I'll see if I can find anything on a shutdown in progress when connecting to RMAN.
I don't know if anyone issued it...

Thanks for the help... let me know if there is any other info I can provide...

But shouldnt in df -h the space appear? Looks like there are hidden files... dont know..
>>36G   DPDB            <------where the arch files are

Ignore my previous comment.  I see from the alert log you posted that it is trying to write here: /opt/oracle/admin/DPDB/arch

You need to get into RMAN and do the crosscheck or increase the size of the database parameter: db_recovery_file_dest_size
>>But shouldnt in df -h the space appear?

The space is there.  Oracle doesn't think it is.  It tracks available space in db_recovery_file_dest_size internally.

I'm looking for errors in connecting to RMAN with a full archive dest.  Not finding much.

I would increase db_recovery_file_dest_size temporarily to a larger size, let Oracle catch up on archiving, then use RMAN crosscheck/delete to get everything back in sync then reduce the db_recovery_file_dest_size if necessary.
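
For reference, the crosscheck/delete step suggested above could look roughly like the sketch below. It assumes the archive logs were moved or removed at the OS level, so RMAN still has records for files that are no longer there. `rman_cleanup_script` is a hypothetical helper name, and none of this can run until RMAN is able to connect to the instance.

```shell
#!/bin/bash
# Hypothetical helper: prints the RMAN commands that resync RMAN's
# records with what is really on disk. Nothing is executed here.
#   CROSSCHECK marks records whose files are missing as EXPIRED.
#   DELETE EXPIRED removes those records (the files are already gone).
rman_cleanup_script() {
  cat <<'EOF'
CROSSCHECK ARCHIVELOG ALL;
DELETE NOPROMPT EXPIRED ARCHIVELOG ALL;
EOF
}

# Usage (on the database server, as oracle, with ORACLE_SID set):
#   rman_cleanup_script | rman target /
```

Printing the commands first makes it easy to review them before piping them into RMAN.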
Isn't there a way to kill the shutdown-in-progress process so I can then connect to RMAN and do the crosscheck?
I have this here:

    1. Stop the listener - you can at least prevent others from connecting to the instance that you're trying to shut down:  bash-3.00$ lsnrctl stop

    2. Get a list of processes hitting the database.  You'll need that before you start killing them off one by one:  bash-3.00$ ps -eaf|grep LOCAL

    3. The output from step two will have the processid in the second column.  Jot a few down and then start to

    4. Kill the processes.  Take the second column's output from step #2 and execute the following:  bash-3.00$ kill -9 396 - where "396" is the processid you wish to kill.

    5. Enter into SQLPlus and connect as SYSDBA:  SQL> conn /as sysdba - you'll receive "Connected to an idle instance"

    6. Shutdown the database via SQL> shutdown immediate You may be given the following:

ORA-24324: service handle not initialized
ORA-24323: value not allowed
ORA-01090: shutdown in progress - connection is not permitted

    7. Reconnect to SQLPlus via sysdba

    8. Execute a startup force command.

    9. Don't forget to start the listener back up
I would never suggest killing Oracle processes from the command prompt unless you know which one.

Once a shutdown is in progress, I do not believe killing anything will 'stop' it.

The alert logs should show there was a shutdown command issued.  If there wasn't, then I would not suggest killing anything.

Increase db_recovery_file_dest_size, let Oracle archive a couple redo logs, then see if you can connect using RMAN.
Increase db_recovery_file_dest_size you mean change the pfile?
>>Increase db_recovery_file_dest_size you mean change the pfile?

Yes, the init parameter.  Are you using a pfile or spfile?
The value in init.ora is:
db_recovery_file_dest_size=2147483648

oracle@E0100daSimLab01> ls
initDPDB.ora    spfileDPDB.ora

I believe it is using the spfile.

Should I just edit it?
>>Should I just edit it?

NO.  spfile is a binary file.  You need to use ALTER SYSTEM commands.  If you edit it, you will corrupt it.

>>2147483648

If I did my math right, this is 2 gig?

Try:
alter system set db_recovery_file_dest_size=2304M;

or larger since you seem to have a LOT of disk space.
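
To double-check the arithmetic above, here is a minimal bash sketch that converts the byte value from init.ora into megabytes. `bytes_to_mb` is a hypothetical helper name, not an Oracle tool.

```shell
#!/bin/bash
# Convert a byte count (as stored in init.ora) to whole megabytes.
bytes_to_mb() {
  echo $(( $1 / 1024 / 1024 ))
}

bytes_to_mb 2147483648   # prints 2048, i.e. 2 GB
```

So the current limit is 2048M, and the suggested 2304M adds 256M on top of it.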
Here is what I get.

oracle@E0100b01> sqlplus /nolog

SQL*Plus: Release 10.2.0.1.0 - Production on Mon Jun 11 16:06:52 2012

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

SQL> connect / as sysdba
ERROR:
ORA-09925: Unable to create audit trail file
SVR4 Error: 28: No space left on device
Additional information: 9925
ORA-09925: Unable to create audit trail file
SVR4 Error: 28: No space left on device
Additional information: 9925


SQL> alter system set db_recovery_file_dest_size=2504M;
SP2-0640: Not connected


SQL> connect system/manager
ERROR:
ORA-01090: shutdown in progress - connection is not permitted
Check out:
Ora-09817: Write To Audit File Failed. [ID 759486.1]

Post the results of:
echo $ORACLE_HOME
df -k
Deleting the audit files did not help...

sync did not help.

> echo $ORACLE_HOME
/opt/oracle/product/10.2.0/Db_1

> df -k
Filesystem            kbytes    used   avail capacity  Mounted on
/dev/md/dsk/d0       24788639 2262104 22278649    10%    /
/devices                   0       0       0     0%    /devices
ctfs                       0       0       0     0%    /system/contract
proc                       0       0       0     0%    /proc
mnttab                     0       0       0     0%    /etc/mnttab
swap                 14094072    1576 14092496     1%    /etc/svc/volatile
objfs                      0       0       0     0%    /system/object
sharefs                    0       0       0     0%    /etc/dfs/sharetab
/dev/md/dsk/d4       10327132 3872306 6351555    38%    /usr
/platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap1.so.1
                     24788639 2262104 22278649    10%    /platform/sun4u-us3/lib/libc_psr.so.1
/platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
                     24788639 2262104 22278649    10%    /platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
fd                         0       0       0     0%    /dev/fd
/dev/md/dsk/d3       10327132 3783274 6440587    38%    /var
swap                 14092944     448 14092496     1%    /tmp
swap                 14092552      56 14092496     1%    /var/run
swap                 14092496       0 14092496     0%    /dev/vx/dmp
swap                 14092496       0 14092496     0%    /dev/vx/rdmp
/dev/md/dsk/d5       8262869 4899828 3280413    60%    /export/home
/dev/vx/dsk/dpsfw1_dg/dpsfw1_vol
                     52386816 15783343 34316369    32%    /opt/
/dev/vx/dsk/oraclesfw1_dg/oraclesfw1_vol
                     52386816 3232610 46083373     7%    /opt/oracle/product
/dev/vx/dsk/dba_dg/dba_admin_vol
                     52385792 52385792       0   100%    /opt/oracle/admin
/dev/vx/dsk/dba_dg/dba_data_vol
                     104814608 76414593 26626035    75%    /opt/oracle/data

Last lines in alert log:
Tue Jun 12 08:43:06 2012
ARC0: Failed to archive thread 1 sequence 1ARC0: Failed to archive thread 1 sequence 128 (19504)
Tue Jun 12 08:43:16 2012
ARC0: Closing local archive destination LOG_ARCHIVE_DEST_1: '/opt/oracle/admin/DPDB/arch/arch_1_128_699740096.arc' (error 19502)
 (DPDB)
Tue Jun 12 08:43:16 2012
ARC1: Closing local archive destination LOG_ARCHIVE_DEST_1: '/opt/oracle/admin/DPDB/arch/arch_1_127_699740096.arc' (error 19502)
 (DPDB)
/opt/oracle/admin is 100% full again.

As soon as you free up any space by deleting audit files, the redo archiver is likely taking it back.
I was hoping that an audit file would be small enough that the archiver wouldn't immediately take the space.

How far back do the archives that are there go?  How often are they backed up?
That's the thing... I don't think the backup is running as it should...

The latest archive I have in the folder is 126, so it is at least 2 files behind... but files 127 and 128 are not showing in the arch folder... shouldn't they be there once they are created?

The main problem here is the inconsistency between du and df... because according to du I have space available in the partition, and therefore enough space for new arch files...

>>As soon as you free up any space by deleting audit files, the redo archiver is likely taking it back

I tried to free space but it's impossible... I moved around 40 arch files and the space stays 100% taken...

oracle@E0100daSimLab01> df -h
/dev/vx/dsk/dba_dg/dba_admin_vol

                        50G    50G     0K   100%    /opt/oracle/admin

oracle@E0100daSimLab01> cd /opt/oracle/admin
oracle@E0100daSimLab01> du -sh *
  36G   DPDB            <------where the arch files are
   0K   flash_recovery_area
   0K   lost+found

According to this I should have 14G free, but it's not freeing up and I don't know the reason...
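
One way to put a number on that du/df gap is sketched below. It is only a rough check, written for bash with standard df, du and awk. `df_du_gap_kb` is a hypothetical helper name, and it assumes the directory is the mount point itself, has no other filesystems mounted beneath it, and has no spaces in its path.

```shell
#!/bin/bash
# Print (KB used according to df) minus (KB visible to du) for a directory.
# A large positive number means the filesystem is holding space that no
# visible file under that directory accounts for.
df_du_gap_kb() {
  local dir=$1 used seen
  # With long device names df wraps the data line, so count fields from
  # the right: ... kbytes used avail capacity mountpoint
  used=$(df -k "$dir" | awk '{ u = $(NF-3) } END { print u }')
  seen=$(du -sk "$dir" 2>/dev/null | awk '{ print $1 }')
  echo $(( used - seen ))
}

# Usage: df_du_gap_kb /opt/oracle/admin
```

With the figures posted in this thread (df reporting the full 50G used, du seeing about 36G), the result would be roughly 14G expressed in KB.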
My apologies.  My Unix is really old.

du shows usage not free:
http://www.computerhope.com/unix/udu.htm

About du

Tells you how much space a file occupies.

Examples

du -s *.txt - Would report the size of each txt file in the current directory.
Well... maybe the issue is in the OS somehow...

Because I have 50G in the partition..

And using du I can see that I'm using only 36G...

If I move ALL the arch files... it does not free any space whatsoever...
>>And using du I can see that I'm using only 36G...

That is only the DPDB folder under /opt/oracle/admin.

Other files under the mount point have to be using the rest, since df shows 100% capacity.

post the results of: ls -al /opt/oracle/admin


>>If I move ALL the arch files... it does not free any space whatsoever...

After you remove some .arc files, check the alert log and see if you generated another archived redo log.  Since Oracle is hung waiting to archive redo logs, any free space will likely be grabbed up pretty quick.
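
A quick way to do that alert log check is sketched below. It is written for bash, and `recent_arch_events` is a hypothetical helper name. The alert log path is passed in as an argument because its location varies by installation and is not given in this thread.

```shell
#!/bin/bash
# Show the most recent archiver messages from an alert log.
#   $1 = path to the alert log
#   $2 = how many matching lines to show (default 10)
recent_arch_events() {
  # Matches lines that contain "ARC0:", "ARC1:", ... or "ARCH:"
  awk '/ARC[0-9]+:|ARCH:/' "$1" | tail -n "${2:-10}"
}

# Usage: recent_arch_events /path/to/alert_DPDB.log 20
```

If a new sequence number shows up right after space is freed, the archiver is what is taking the space back.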
The bigger question is where are you moving archive files to?

Some of the difference in free space versus used has to do with file system overhead.  There is also a reserved space that users cannot use.  root can use it.  It is a percentage of the file system.  It has been a long time, but I believe the traditional default is 10%, which goes back to the old days of small file systems.  On our database file systems we cut that down to 5% or less.
I'm not moving them to root... I'm moving them to a partition that is not related to Oracle...

But no matter where I move them, it does not clear ANY space

oracle@E0100daSimLab01> df -h
/dev/vx/dsk/dba_dg/dba_admin_vol

                        50G    50G     0K   100%    /opt/oracle/admin

oracle@E0100daSimLab01> cd /opt/oracle/admin
oracle@E0100daSimLab01> du -sh *
  36G   DPDB            <------where the arch files are
   0K   flash_recovery_area
   0K   lost+found
Re my earlier comments:

After you remove some .arc files, check the alert log and see if you generated another archived redo log.

post the results of: ls -al /opt/oracle/admin
tks