Solved

Files locked, cant open DB

Posted on 2004-08-04
15
2,142 Views
Last Modified: 2008-03-06

Seems that the Solaris OS or maybe even an NFS mount could be involved in now wanting to release locks on
the db files...strange. I can mount the db but cannot open the database due to these locks.
 
It's release 8.1.7 and here is the log:
 
***********************************alert log*******************************
Errors in file /u01/app/oracle/admin/ignp/bdump/ignp_dbw0_1371.trc:
ORA-01157: cannot identify/lock data file 1 - see DBWR trace file
ORA-01110: data file 1: '/u01/oradata/ignp/system.01.dbf'
ORA-27086: skgfglk: unable to lock file - already in use
SVR4 Error: 11: Resource temporarily unavailable
Additional information: 8
 
Tue Aug  3 22:01:00 2004
Errors in file /u01/app/oracle/admin/ignp/udump/ignp_ora_349.trc:
ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900], [], [], []
******************************************************************************
 
 
Also what is this file for in .....8.1.7/dbs/lk<oracle_sid>   I guess it is used for locking....says do not delete
I wonder if I deleted it when the system was down if it would clear anything....probably not if its the OS that won't
let go of the file. How do I tell what has the file locked?
Have not found anything useful on the internet.
Thanks for any ideas you may have.
Regards,
Mike
0
Comment
Question by:mmoore
  • 4
  • 3
  • 3
  • +4
15 Comments
 
LVL 7

Accepted Solution

by:
BobMc earned 42 total points
Comment Utility
Do you have all the relevant OS patches installed?

The ony time I have ever come across this type of problem is when a backup has been running in the background, or a backup process has stuck.

I think the lk<sid> file acts as a sort of semaphore, to prevent you from (attempting to) opening the database twice. If you shutdown cleanly, I would have thought it would disappear.

I think if you can guarantee that you dont have an instance running and/or background processes, shared memeory / semaphores etc, then you can delete it.

Theres always the alternate way - try a reboot (I think Im spending too long on Microsoft platforms!)

HTH
Bob
0
 
LVL 23

Assisted Solution

by:seazodiac
seazodiac earned 42 total points
Comment Utility
@mmoore:

Mostly likely because you want to start up an already up and running instance.

in solaris:

try

$ ps -efa | grep ora


to see if there are any existing Oracle processes running.

if they are, shutdown immediate and then restart. or do nothing..
0
 
LVL 29

Assisted Solution

by:MikeOM_DBA
MikeOM_DBA earned 42 total points
Comment Utility
Also:

Check that the $ORACLE_SID is correct and corresponds to the pfile (initSID.ora) you are using to start the database.

0
 
LVL 29

Expert Comment

by:MikeOM_DBA
Comment Utility

PS: Ths error happens when database is already up!
0
 
LVL 34

Assisted Solution

by:johnsone
johnsone earned 42 total points
Comment Utility
The lk file can be safely removed if and only if you verify that the database is down first.  Although I have not seen the issue in 8.1, if previous versions did not shut down cleanly this could happen.

I do not think the lk file is your problem, as the message would me more along the lines of cannot start already running database, not the message you are seeing.

The lsof tool is very useful in determining who has open files.  If you do not have it, I believe you can download from Sun.

You can also shut down the database cleanly and kill any oracle processes.  This may work, as there may have been an orphan process that still had a file open.
0
 
LVL 3

Author Comment

by:mmoore
Comment Utility
Thanks for the ideas. I will look into it and get back to y'all soon.
0
 
LVL 47

Assisted Solution

by:schwertner
schwertner earned 42 total points
Comment Utility
ORA-00600:  is a clear sign for Oracle bug.
Shutdown the OS and restart.
If you succeed to start the DB then make a full backup and full export of the DB.
Download the newest patch or even consider migration to 9.2.0.5.
Apply the patch.
Also do not allow installation of any other software on the server machine.
In Unix/Linux and even in Windows this is the straight way to corruptions
(new software - changes in the C libraries ---> new not working versions of functions, etc.)
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 3

Author Comment

by:mmoore
Comment Utility
yes, we have rebooted several times. At this point I'm sure no other process is holding the files in question, as we have
some commands that would show us that. Also no semaphores or shared memory is holding the files.
Oracle is just confused and not able to cleanup after itself.
 
Its oracle 8.1.7.0.0 on sun os 5.8
0
 
LVL 23

Expert Comment

by:seazodiac
Comment Utility
mmoore:

can you do this and tell us the output?

$ env | grep ORA

and show us the connection string when you are trying to start the database?
0
 
LVL 3

Author Comment

by:mmoore
Comment Utility
interesting.
 after a startup mount
 I can get results from select * from V$DATABASE and V$LOCK (no files are locked and only one row I think)
if I do a select on V$DATAFILE  the system just hangs.
Remember this is before the database is open.
 
It says that these table get info from the control file.....possible problem with control file?
 
I'm thinking of rebuilding the control file via the backup control file to trace reset logs or something to that effect.
Although I think it's fine.
0
 
LVL 23

Expert Comment

by:seazodiac
Comment Utility
no, controlfile is just fine, otherwise, you could not have mounted it.

can you read the portion of alert.log file germane to this mount?
0
 
LVL 47

Expert Comment

by:schwertner
Comment Utility
8.1.7.0.0  ????????????????

Patch it! Patch it to 8.1.7.4!
Remember: ORA-00600 is clear sign of bugs in the C code.
Before patching do carefully full cold backup to be able to reestablish the instance if something fails.
Also compare the Oracle and Unix compatability matrix - is 8.1.7 certified for this version of OS?
0
 
LVL 3

Author Comment

by:mmoore
Comment Utility
There were 4 files that could not be locked. While the system was down I copied those to a new name,
deleted the old ones, and renamed them back. Now I do NOT get the ORA-27086
but the system just hangs when trying to open the database. I do get an
"ORA-1109 signalled during: Alter database close normal" in the log.
To recap: I can start the instance and mount only. I cannot open the database.
While the system was down,
 I have made sure there were no oracle proccess running.
 no semaphores or shared memory segments
 deleted the lkOracleSid   file
  cound not find a sgadefOracleSid.dbf file.

At this point does any one know of a undocumented _ system parameter that
might tell the system to force the open database? I am willing to try anything.
thanks
0
 
LVL 47

Expert Comment

by:schwertner
Comment Utility
I found this:

When trying to startup a database with database files on a Network Appliance,
the following errors are received:
 
    ORA-01157: cannot identify/lock data file <xyz> - see DBWR trace file
    ORA-01110: data file <xyz>
 
Additionally the alert log has another error:
 
    ORA-27086 skgfglk: unable to lock file- already in use
 
 
The command ps -ef  | grep <sid_name> shows that there are no background  
processes for the instance.
 
 
Solution Description
--------------------
 
In this case, the System Administrator investigated the Network Appliance and  
was able to clear the locks on the files, and the database was able to be
started.
 
Network Appliance has a way to release false locks:
 
    As root on the netapp, from the prompt:
 
    rc_toggle_basic
    sm_mon -l <hostname>  
 
The hostname being referenced in the above command should be the machine name
where the Oracle instance is running from.
 
 
Explanation
-----------
 
Oracle 8.1.x requires locks on database files and with an existing lock on the  
file, the database could not be started. These locks may be retained by the  
Network Appliance due to an instance or host machine failure.  In these cases,  
the locks must be manually released from the Network Appliance by the System  
Administrator.
0
 
LVL 38

Assisted Solution

by:Jim P.
Jim P. earned 40 total points
Comment Utility
Ref Metalink #222385.1

1. Shutdown the machine and restart it, as it clears all shared memory segments
2. Use the following commands to remove the shared memory segments and
   semaphores owned by the Oracle user.

   At the unix command prompt do the following:
   
   a) ipcs -m
      This shows the memory segments owned by the oracle user. If there are some
      segments owned by Oracle and the instance is down, then record the segment
      id number and carry on with point b

   b) ipcrm -m &lt;segment id number&gt;
      This removes the segment from memory so Oracle no longer has a hold on
      that piece of memory

   c) ipcs -s
      This is like the ipcs -m command, but this is for semaphores. Like the
      memory segments, semaphores has segment ids also. Record the ones owned
      by Oracle so they can be removed
   
   d) ipcrm -s &lt;segment id number&gt;

If on the otherhand there are more than one Oracle instances on the machine,
then it is far better to shut those active instances down before progressing
with the points documented above.

Also Ref metalink #Note:1013221.6
 
                   
Bookmark      Fixed font       Go to End      

Doc ID:       Note:1013221.6      
Subject:       RECOVERING FROM A LOST DATAFILE IN A ROLLBACK TABLESPACE      
Type:       TROUBLESHOOTING      
Status:       PUBLISHED      
Content Type:       TEXT/PLAIN      
Creation Date:       16-OCT-1995      
Last Revision Date:       20-APR-2004      

Problem Description:  


====================  

This is a recovery scenario in which a datafile in a rollback segment
tablespace has been lost or damaged to a point that Oracle cannot
recognize it anymore. Trying to startup the database will result in
ORA-1157, ORA-1110, and possibly an operating system level error such as
ORA-7360. Trying to shut down the database in normal or immediate mode
will result in ORA-1116, ORA-1110, and possibly an operating system
level error such as ORA-7368.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
WARNING -- PLEASE NOTE  
-------------------------  

The steps outlined are only to be used with the assistance of Oracle Worldwide  
Support.  Included (in step 6) is a parameter  _corrupted_rollback_segments  
that will require that you rebuild your database. Please explore all other  
options before using this parameter.  

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
Solution Description  
====================  

This recovery situation requires extra caution.  Please call  
Oracle Customer Support if you have any questions or need any  
assistance.  

The main issue in solving this problem is trying to make sure that the  
active transactions in the rollback segments do not get lost.  

Solution Explanation  

====================  


The approach to be followed depends on the specific scenario in which  
the loss of the rollback datafile is detected:  

I. THE DATABASE IS DOWN  
-----------------------  

Attempting to startup the database will result in ORA-1157 and ORA-1110.  
The solution here depends on whether the database was cleanly shut down  
or not.  

I.A. THE DATABASE WAS CLEANLY SHUT DOWN  
---------------------------------------  

If you are ABSOLUTELY POSITIVE that the database was cleanly shutdown,  
i.e., it was closed with either shutdown NORMAL or IMMEDIATE, then  
the simplest solution is to offline drop the missing datafile, open the  
database in restricted mode, and then drop and recreate the rollback  

tablespace to which the file belonged.  DO NOT follow this procedure  

if the database was shut down ABORT or if it crashed.  

The steps are:  

1. Make sure the database was last cleanly shut down.  

Check the alert.log file for this instance.  Go to the bottom of  

the file and make sure the last time you shut the database down  

you got the messages:  

"alter database  dismount  

 Completed: alter database  dismount"  

This also includes the case of a clean shutdown followed by a  

failed attempt to startup the database.  In that case, Oracle will  

issue error messages and shut itself down abort.  For the purposes  

of this solution, though, this counts as a clean shutdown.  

If that is not the case, i.e., if the last time YOU shut the database  

down it was in abort mode, or the database crashed itself, it is  

NOT safe to proceed.  You should follow the instructions for  

case I.B below.  

2. Remove all the rollback segments in the tablespace to which the lost  

   datafile belongs from the ROLLBACK_SEGMENTS parameter in the init.ora  

   file for this instance.  If you are not sure about which rollbacks are  

   in that tablespace, simply comment out the whole ROLLBACK_SEGMENTS entry.  

3. Mount the database in restricted mode.  

STARTUP RESTRICT MOUNT  

4. Offline drop the lost datafile.  

ALTER DATABASE DATAFILE '<full_path_file_name>' OFFLINE DROP;  

5. Open the database.  

ALTER DATABASE OPEN  

If you receive the message "Statement processed," move on to Step 7.  

If instead you get ORA-604, ORA-376, and ORA-1110, go to Step 6.  

6. Since opening the database failed, shut the database down and edit the      

  init.ora file for this instance.  Comment out the ROLLBACK_SEGMENTS          

 parameter and add the following line:  

_corrupted_rollback_segments = ( <rollback1>,...., <rollbackN> )  

   i.e., the above list should contain all the rollbacks originally listed in  

   the ROLLBACK_SEGMENTS parameter.    

WARNING: Use this parameter ONLY IN THIS SPECIFIC SCENARIO or as  

instructed by Oracle Customer Support.  

   Then startup the database in restricted mode:  

STARTUP RESTRICT  

7. Drop the rollback tablespace to which the datafile belonged.  

DROP TABLESPACE <tablespace_name> INCLUDING CONTENTS;  

8. Recreate the rollback tablespace with all its rollback segments.  

   Remember to bring the rollbacks online after you create them.  

9. Make the database available to all users.  

ALTER SYSTEM DISABLE RESTRICTED SESSION;  

10. Reinclude the rollbacks you just recreated in the ROLLBACK_SEGMENTS  

   parameter in the init.ora file for this instance.  If you had commented  

   out the whole ROLLBACK_SEGMENTS entry, simply uncomment it now.  If you had  

    to go through Step 6, REMOVE THE _CORRUPTED_ROLLBACK_SEGMENTS PARAMETER    

   NOW.  

11. Rebuild the database.

 

I.B. THE DATABASE WAS NOT CLEANLY SHUT DOWN  

-------------------------------------------  

This is the situation where the database was last shut down abort or crashed.  

In this case, it is almost certain that the rollback segments that had  

extents in the lost datafile still contain active transactions.  Therefore,  

the file cannot be offlined or dropped.  You must restore the lost datafile  

from a backup and apply media recovery to it.  If the database is in  

NOARCHIVELOG mode, you will only succeed in recovering the datafile if the  

redo to be applied is within the range of your online logs.  If a  

backup of the datafile is not available, please contact Oracle Customer  

Support.  

These are the steps:  

1. Restore the lost file from a backup.  

2. Mount the database.  

3. Issue the following query:  

SELECT FILE#, NAME, STATUS FROM V$DATAFILE;  

If the status of the file you just restored is "OFFLINE," you must  

online it before proceeding:  

ALTER DATABASE DATAFILE '<full_path_file_name>' ONLINE;  

4. Issue the following query:  

SELECT V1.GROUP#, MEMBER, SEQUENCE#, FIRST_CHANGE#  

FROM V$LOG V1, V$LOGFILE V2  

WHERE V1.GROUP# = V2.GROUP# ;  

This will list all your online redolog files and their respective  

sequence and first change numbers.  

5. If the database is in NOARCHIVELOG mode, issue the query:  

SELECT FILE#, CHANGE# FROM V$RECOVER_FILE;  

If the CHANGE# is GREATER than the minimum FIRST_CHANGE# of your  

logs, the datafile can be recovered.  Just keep in mind that all the  

logs to be applied will be online logs, and move on to step 6.  

If the CHANGE# is LESSER than the minimum FIRST_CHANGE# of your logs,  

the file cannot be recovered.  Your options at this point include  

restoring a full backup if one is available or forcing the database  

to open in an inconsistent state to get a full export out of it.  

For further details and to assist you in your decision, please  

contact Oracle Customer Support.  

6. Recover the datafile:  

RECOVER DATAFILE '<full_path_file_name>'  

7. Confirm each of the  logs that you are prompted for until you  

receive the message "Media recovery complete".  If you are prompted for a  

non-existing archived log, Oracle probably needs one or more of the  

online logs to proceed with the recovery.  Compare the sequence number  

referenced in the ORA-280 message with the sequence numbers of your online  

logs.  Then enter the full path name of one of the members of the redo group  

whose sequence number matches the one you are being asked for.  Keep entering  

online logs as requested until you receive the message "Media recovery  

complete".  

8. Open the database.  

 

II. THE DATABASE IS UP  

----------------------  

If you have detected the loss of the rollback datafile and the database  

is still up and running, DO NOT SHUT IT DOWN.  In most cases, it is  

simpler to solve this problem with the database up than with it down.  

Two approaches are possible in this scenario:  

A) The first one involves offlining the lost datafile, restoring it from  

   backup, and then applying media recovery to it to make it consistent  

   with the rest of the database.  This method can only be used if  

   the database is in ARCHIVELOG mode.  

B) The other approach involves offlining all the rollback segments in the  

   tablespace to which the lost datafile belongs, dropping the tablespace,  

   and then recreating it.  You may have to kill sessions that have  

   transactions in the rollbacks involved to force the rollbacks to go  

   offline.  

In general, approach II.A is simpler to apply.  It will also be faster  

if the datafile and the necessary archived logs can be quickly restored  

from backup.  However, more user transactions will error out and be rolled  

back than with approach II.B.  Because of read-consistency, queries against  

certain tables may fail with approach II.A, since the rollback extents from  

which Oracle would retrieve the data may be in the offlined datafile.  

 

APPROACH II.A: RESTORING THE DATAFILE FROM BACKUP  

-------------------------------------------------  

As mentioned before, this approach can only be followed if the database is  

in ARCHIVELOG mode.  Here are the steps:  

1. Offline the lost datafile.  

ALTER DATABASE DATAFILE '<full_path_file_name>' OFFLINE;  

        NOTE: Depending on the current amount of database activity,  

        you may have to create additional rollback segments in a different  

        tablespace to keep the database going while you take care of the  

        problem.  

2. Restore the datafile from a backup.  

3. Issue the following query:  

SELECT V1.GROUP#, MEMBER, SEQUENCE#  

FROM V$LOG V1, V$LOGFILE V2  

WHERE V1.GROUP# = V2.GROUP# ;  

This will list all your online redolog files and their respective  

sequence numbers.  

4. Recover the datafile:  

RECOVER DATAFILE '<full_path_file_name>'  

5. Confirm each of the  logs that you are prompted for until you  

receive the message "Media recovery complete".  If you are prompted for a  

non-existing archived log, Oracle probably needs one or more of the  

online logs to proceed with the recovery.  Compare the sequence number  

referenced in the ORA-280 message with the sequence numbers of your online  

logs.  Then enter the full path name of one of the members of the redo group  

whose sequence number matches the one you are being asked for.  Keep entering  

online logs as requested until you receive the message "Media recovery  

complete".  

6. Bring the datafile back online.  

ALTER DATABASE DATAFILE '<full_path_file_name>' ONLINE;  

 

APPROACH II.B: RECREATING THE ROLLBACK TABLESPACE  

-------------------------------------------------  

This approach can be used regardless of the archival mode of the database.  

The steps are:  

1. Try to offline all the rollback segments in the tablespace to which  

   the lost datafile belongs.  

ALTER ROLLBACK SEGMENT <rollback_segment> OFFLINE;  

Repeat this statement for all rollbacks in the tablespace.  

NOTE: Depending on the current amount of database activity,  

you may have to create additional rollback segments in a different  

tablespace to keep the database going while you take care of the  

problem.  

2. Check the status of the rollbacks.  

They must all be offline before they can be dropped.  

Issue the query:  

SELECT SEGMENT_NAME, STATUS FROM DBA_ROLLBACK_SEGS  

WHERE TABLESPACE_NAME = '<TABLESPACE_NAME>';  

3. Drop all offlined rollback segments.  

For each rollback returned by the query in step 2 with status  

"OFFLINE," issue the statement:  

DROP ROLLBACK SEGMENT <rollback_segment>;  

4. Handle the rollbacks that remain online.  

Repeat the query in step 2.  

If any of the rollbacks you tried to offline still has an "ONLINE"  

status, it means there are still active transactions in it.  You  

may confirm that by issuing the query:  

SELECT SEGMENT_NAME, XACTS ACTIVE_TX, V.STATUS  

FROM V$ROLLSTAT V, DBA_ROLLBACK_SEGS  

WHERE TABLESPACE_NAME = '<TABLESPACE_NAME>' AND SEGMENT_ID = USN;  

If the above query returns no rows, it means all the rollbacks in  

the affected tablespace are already offline.  Repeat the query in  

step 2 to retrieve the names of the rollbacks that just became  

offline and then drop them as described in step 3.  

If the above query returns one or more rows, they should show  

status "PENDING OFFLINE".  Next, check the ACTIVE_TX  

column for each rollback.  If it has a value of 0, it implies  

there are no pending transactions left in the rollback, and it  

should go offline shortly.  Repeat the query in step 2 a few  

more times until it shows the rollback being offline and then  

drop it as described in step 3. Move on to step 6.  

If any of the "pending offline" rollbacks has a value of 1 or  

greater in the ACTIVE_TX column, move on to step 5.  

5. Force rollbacks with active transactions to go offline.  

At this point, the only way to move forward is to have the  

"pending offline" rollbacks released.  The active transactions  

in these rollbacks must either be committed or rolled back.  

The following query shows which users have transactions assigned  

to which rollbacks:  

SELECT S.SID, S.SERIAL#, S.USERNAME, R.NAME "ROLLBACK"  

FROM V$SESSION S, V$TRANSACTION T, V$ROLLNAME R  

WHERE R.NAME IN ('<PENDING_ROLLBACK_1>', ... , '<PENDING_ROLLBACK_N>')  

AND S.TADDR = T.ADDR AND T.XIDUSN = R.USN;  

You may directly contact the users with transactions in the  

"pending offline" rollbacks and ask them to commit (preferably)  

or rollback immediately.  If that is not feasible, you can force  

that to happen by killing their sessions.  For each of the entries  

returned by the above query, issue the statement:  

ALTER SYSTEM KILL SESSION '<SID>, <SERIAL#>';  

where <SID> and <SERIAL#> are those returned by the previous  

query.  After the sessions are killed, it may take a few minutes  

before Oracle finishes rolling back and doing cleanup work.  Go back  

to step 2 and repeat the query in there periodically until all  

rollbacks in the affected tablespace are offline and ready to be  

dropped.  

6. Drop the rollback tablespace.  

DROP TABLESPACE <tablespace_name> INCLUDING CONTENTS;  

If this statement fails, please contact Oracle Customer Support.  

Otherwise, proceed to step 7.  

7. Recreate the rollback tablespace.  

8. Recreate the rollback segments in the tablespace and bring them online.
.       
      Copyright (c) 1995,2000 Oracle Corporation. All Rights Reserved. Legal Notices and Terms of Use.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Truncate is a DDL Command where as Delete is a DML Command. Both will delete data from table, but what is the difference between these below statements truncate table <table_name> ?? delete from <table_name> ?? The first command cannot be …
Note: this article covers simple compression. Oracle introduced in version 11g release 2 a new feature called Advanced Compression which is not covered here. General principle of Oracle compression Oracle compression is a way of reducing the d…
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now