Partition was full, now is not, but df still thinks it is full

Mark Geerlings
We had an Oracle process accidentally fill up one of the partitions on our Red Hat AS4 server with a single, large file. No, this is not the / partition. It is an ext3 data partition that we created for part of the Oracle database that we have on this server. In Oracle, I moved this data to another partition, then dropped the tablespace (the file that the O/S sees) that was on this partition. I then did a "umount -l" and remounted it, but a "df" command still shows this partition as 100% full, even though "ls -lf" doesn't show the large file anymore. A "du -x" command correctly shows it as empty. Can I convince Linux (and/or the ext3 file system on this partition) that this partition really isn't 100% full anymore (short of a reboot)? I won't have a scheduled opportunity for a server reboot until the weekend of August 8.
Top Expert 2009
Commented:
Have a look at this mailing list thread; it might give you some idea of what's going on:

http://osdir.com/ml/file-systems.ext3.user/2003-04/msg00007.html


What I am thinking is that you have deleted that DAT file, but some process may still be holding it open, so df thinks the file (or its blocks) is still on that partition.

What about running lsof +d /Thatpartition?

I am just trying to make sense of it here.
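
The idea of a deleted file still being held open can be checked directly on Linux. Below is a minimal sketch that assumes a Linux /proc filesystem; the function name list_deleted_open is made up for illustration. It walks every process's open descriptors and prints the ones whose target ends in "(deleted)". Run as an ordinary user it only sees that user's own processes, so inspecting Oracle's descriptors would need root or the oracle account.

```shell
#!/bin/sh
# List files that were deleted but are still held open by a running process.
# Usage: list_deleted_open [path-prefix]
# With no argument, every deleted-but-open file visible to the caller is listed.
list_deleted_open() {
    prefix=${1:-/}
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for processes we are not allowed to inspect; skip them
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$prefix"*" (deleted)")
                pid=${fd#/proc/}
                pid=${pid%%/*}
                # columns: PID, descriptor path, original file name
                printf '%s\t%s\t%s\n' "$pid" "$fd" "$target"
                ;;
        esac
    done
}
```

For example, list_deleted_open /u03 would restrict the listing to files that lived under a mount point called /u03 (a placeholder name). Where lsof is installed, lsof +L1 reports the same class of files (open files with a link count below one).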
Top Expert 2007
Commented:
Is there any chance that you have a process that has files open on that partition? If so, stop it.

You may also try to run fsck on that filesystem.
Mark Geerlings, Database Administrator

Author

Commented:
When I did "lsof" before I did "umount -l", I was amazed by the output because it still showed the large file that I had deleted, with "(deleted)" after the filename! After I did "umount -l", remounted this partition, and tried "lsof" again, the large file no longer showed up. So "du" is correct and "ls" is correct, but "df" still thinks this partition is 100% full. That's my issue. How can I convince the "df" command that the partition is no longer full?
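
The disagreement between "df" and "du" can be put into numbers. This is a rough sketch, not a diagnostic tool: the helper names df_used_kb, du_used_kb and report_gap are invented here, and the argument is assumed to be a mount point. "df" reads the filesystem's own block counters, while "du" adds up only the files it can reach by name, so blocks belonging to a deleted-but-open file show up in the first number and not in the second.

```shell
#!/bin/sh
# Compare what df and du report for the same mount point.

# Used space according to the filesystem's block counters (KB).
df_used_kb() { df -Pk "$1" | awk 'NR == 2 { print $3 }'; }

# Used space according to the files reachable by name, same filesystem only (KB).
du_used_kb() { du -skx "$1" 2>/dev/null | awk '{ print $1 }'; }

report_gap() {
    dfk=$(df_used_kb "$1")
    duk=$(du_used_kb "$1")
    echo "df used: ${dfk} KB, du used: ${duk} KB, unaccounted: $((dfk - duk)) KB"
}
```

A small gap is normal because of filesystem metadata and the journal. A gap close to the size of the deleted file points at blocks that are still allocated to it. Pointing report_gap at a directory that is not a mount point gives a meaningless result, since "df" always reports on the whole filesystem.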
Top Expert 2007

Commented:
run fsck
Mark Geerlings, Database Administrator

Author

Commented:
I'm not a master of Linux, I'm an Oracle DBA running Oracle on a Linux server that we don't have a System Administrator for, so I have to act like a Linux SysAdmin, even though I have no UNIX and only limited Linux experience.

Can I run fsck on a mounted partition, or must I "umount" it again first?
Top Expert 2007

Commented:
It's better if it is unmounted. What is contained in that file system?
Mark Geerlings, Database Administrator

Author

Commented:
There is only one small (14M) file left in that file system.  It is one of the three copies of the "controlfile" for the Oracle database that is running on the server.  But, since there are two other copies of the "controlfile" on other partitions, I can "umount" this partition, if necessary without crashing the Oracle database.  It will continue to run using the other copies of the "controlfile".

I just read the link that fosiul0 recommended. That does seem to describe the problem I have. I do find this to be a very annoying "feature" of UNIX/Linux systems. I cannot stop the process that used this file, because that is the Oracle database process. I won't have a (scheduled) opportunity to do that for almost three weeks.

Without stopping this process or rebooting the server, is there any way to get "df" to recognize that the space is now available?
Top Expert 2009

Commented:
OK, what about this tool?
bbed
From this site: http://www.linuxstreet.net/news/E/9272/Disassembling-the-Oracle-Data-Block-on-Linux.html

Again, I have never used Oracle in my life, so please read the article before doing anything.
Top Expert 2009

Commented:
You can try what @omarfarid said; it might help you.

Have a look at this one:
http://docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWaadm/SYSADV1/p167.html

Mark Geerlings, Database Administrator

Author

Commented:
I don't think the bbed tool will help me. This is not an Oracle problem (even though Oracle caused the problem). Now, it is purely an O/S and/or ext3 problem: getting "df" to accept the fact that the formerly-used space is now available.
Top Expert 2009

Commented:
OK, read this excerpt from Understanding the Linux Kernel, 3rd Edition:



18.6.6. Releasing a Data Block

When a process deletes a file or truncates it to 0 length, all its data blocks must be reclaimed. This is done by ext2_truncate( ), which receives the address of the file's inode object as its parameter. The function essentially scans the disk inode's i_block array to locate all data blocks and all blocks used for the indirect addressing. These blocks are then released by repeatedly invoking ext2_free_blocks( ).

The ext2_free_blocks( ) function releases a group of one or more adjacent data blocks. Besides its use by ext2_truncate( ), the function is invoked mainly when discarding the preallocated blocks of a file (see the earlier section "Allocating a Data Block"). Its parameters are:


inode

    The address of the inode object that describes the file

block

    The logical block number of the first block to be released

count

    The number of adjacent blocks to be released

The function performs the following actions for each block to be released:

   1. Gets the block bitmap of the block group that includes the block to be released
   2. Clears the bit in the block bitmap that corresponds to the block to be released and marks the buffer that contains the bitmap as dirty.
   3. Increases the bg_free_blocks_count field in the block group descriptor and marks the corresponding buffer as dirty.
   4. Increases the s_free_blocks_count field of the disk superblock, marks the corresponding buffer as dirty, and sets the s_dirt flag of the superblock object.
   5. If the filesystem has been mounted with the MS_SYNCHRONOUS flag set, it invokes sync_dirty_buffer( ) and waits until the write operation on the bitmap's buffer terminates.


So that means it can be done.

Maybe fsck will do that, but if it does not work, then we have to find out which command causes ext2_truncate( ) to run.
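
One user-space operation that leads to the kernel's truncate path is truncating the file itself, and on Linux a deleted-but-open file can still be reached through /proc/<pid>/fd/<n>. The demo below is only a sketch under that assumption. It works on a throwaway temp file held open by the shell itself (descriptor 9 is an arbitrary choice) and the function name is invented. Whether this is safe against a file that a live Oracle process holds open is not something this thread establishes, so treat it as an illustration of the mechanism rather than a recommended fix.

```shell
#!/bin/sh
# Show that truncating a deleted-but-open file through /proc releases its data.
demo_truncate_deleted() {
    tmp=$(mktemp) || return 1
    exec 9<>"$tmp"                    # hold the file open read-write on fd 9
    head -c 1048576 /dev/zero >&9     # write 1 MiB into it
    rm -f "$tmp"                      # remove the name; the blocks stay allocated
    echo "before: $(wc -c < /proc/self/fd/9)"
    : > /proc/self/fd/9               # reopen via /proc with truncation
    echo "after: $(wc -c < /proc/self/fd/9)"
    exec 9>&-                         # close the descriptor
}

demo_truncate_deleted
```

For a file held by another process, the same redirection would be written as : > /proc/<pid>/fd/<n>, with <pid> and <n> taken from lsof output and sufficient privileges. The process keeps its descriptor, but the file behind it is now empty.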

Mark Geerlings, Database Administrator

Author

Commented:
It looks like "fsck" did what I wanted.  Here are the significant lines from the "fsck" output:

/dev/emcpowerg1: recovering journal
Clearing orphaned inode 12 (uid=503, gid=503, mode=0100640, size=4948238336)
/dev/emcpowerg1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/emcpowerg1: ********** WARNING: Filesystem still has errors **********

The last line concerns me a bit.  I have no idea what those "errors" are, or how to find them or fix them.  But "df" now does show the partition (correctly) as only 2% full.
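
One way to tell whether errors remain is to look at the exit status of fsck, which is a sum of bit flags documented in the fsck and e2fsck manual pages. The helper below (describe_fsck_status is an invented name) simply decodes such a status; it does not run fsck.

```shell
#!/bin/sh
# Decode an fsck/e2fsck exit status, which is a sum of bit flags.
describe_fsck_status() {
    status=$1
    [ "$status" -eq 0 ] && { echo "no errors"; return 0; }
    [ $((status & 1)) -ne 0 ] && echo "errors were corrected"
    [ $((status & 2)) -ne 0 ] && echo "system should be rebooted"
    [ $((status & 4)) -ne 0 ] && echo "errors left uncorrected"
    [ $((status & 8)) -ne 0 ] && echo "operational error"
    [ $((status & 16)) -ne 0 ] && echo "usage or syntax error"
    [ $((status & 32)) -ne 0 ] && echo "canceled by user request"
    [ $((status & 128)) -ne 0 ] && echo "shared library error"
    return 0
}
```

A possible use, assuming the partition is unmounted and the device name is the one in the output above, is a forced second pass followed by decoding: e2fsck -f /dev/emcpowerg1; describe_fsck_status $?. A status with the 4 bit set would mean errors are still present.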

Top Expert 2009
Commented:
Mark,

I've seen this happen with a mounted filesystem, but if a process has the file open, I've not seen a normal "umount" command succeed. It should say "filesystem busy". If you indeed umounted it, then there is no way a file can still be held open.

I'm thinking that if you can umount the partition, then you can reformat it with a fresh filesystem and remount it.

You will have to do something with the copy of the controlfile, though. I assume you have 2-3 other copies of it, so you know what to do there. The following step will format the whole partition.

Assume partition is /dev/sdc1 mounted on /u03

umount /u03
df (verify it is not mounted)
mkfs.ext3 /dev/sdc1
mount /u03

Top Expert 2009

Commented:
Do you have any messages in /var/log/messages regarding this?

Top Expert 2009

Commented:
Sorry, I did not notice that you did a umount -l.

That is why I was confused as to how you could umount it with an open file. So a plain umount (without -l) would possibly still fail with that error. Did you try a plain umount?
Mark Geerlings, Database Administrator

Author

Commented:
I see these two lines in /var/log/messages for this partition:

Jul 20 10:48:58 zs110-oradb3 kernel: post_create:  setxattr failed, rc=28 (dev=emcpowerg1 ino=11)
Jul 20 10:49:14 zs110-oradb3 kernel: post_create:  setxattr failed, rc=28 (dev=emcpowerg1 ino=11)
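
On Linux, errno 28 is ENOSPC ("No space left on device"), so rc=28 in these two lines would be consistent with the partition being full when they were logged. That reading is an inference from the errno value, not something confirmed in the thread. A small filter (annotate_rc is an invented name, and only errno 28 is translated) can make such lines easier to read:

```shell
#!/bin/sh
# Annotate kernel log lines that carry an "rc=<errno>" field.
# Only errno 28 is translated here; other values are passed through as numbers.
annotate_rc() {
    awk '
        match($0, /rc=[0-9]+/) {
            rc = substr($0, RSTART + 3, RLENGTH - 3)
            name = (rc == 28) ? "ENOSPC: no space left on device" : "errno " rc
            print $0 "  <- " name
            next
        }
        { print }
    '
}
```

For example: grep emcpowerg1 /var/log/messages | annotate_rc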

I think the suggestion to reformat this partition may be my safest option.  No, I didn't do a simple "umount" because I did get the "filesystem busy" error due to the Oracle controlfile that was still there.  So, I did "umount -l".

I'll plan to copy that controlfile elsewhere, then reformat the partition, make a new filesystem, mount it and copy the controlfile back in, unless someone tells me "no" or gives me a better idea in the next few minutes while I heat up my lunch.
Top Expert 2009

Commented:
I wanted to tell you to reformat the partition before, but I thought you would not like that!

There is a command:

fuser -km /partition

Ref:
http://www.cyberciti.biz/tips/how-do-i-forcefully-unmount-a-disk-partition.html


But you don't want to kill those processes, I guess!


Top Expert 2009

Commented:
Same as your problem!
http://www.experts-exchange.com/OS/Linux/Q_21660292.html

fsck -f fixed the problem, but he had to reboot the system.

You can wait until August 8 and reboot the system, or you can reformat the partition.

Top Expert 2009

Commented:
Mark, just curious what your exact Linux distribution and kernel versions are. I see you are using EMC / Powerpath but are using ext3 filesystems. I am a Linux advocate, but I cannot ignore it when I see a problem like this, and I like to take a mental note of it.

I did have an issue a couple of years ago doing some maintenance on Enterprise Linux 4, but I was resizing some partitions, which is touchy anyway. Since moving to EL 5 I have not seen a single problem. However, I would guess that before today you'd not had a problem of this type either, eh? :)
Mark Geerlings, Database Administrator

Author

Commented:
Here are the Linux version and kernel details:

# cat /etc/issue
Red Hat Enterprise Linux AS release 4 (Nahant Update 3)
Kernel \r on an \m

 # uname -r
2.6.9-34.0.2.ELhugemem

Yes, we have an EMC SAN attached to this server that we use for the Oracle datafiles.  Are you aware of a problem with using the EMC Powerpath software and ext3 filesystems and/or Linux?  We've been using this combination for about four years.

Our newer system (that just went into production this month) is based on EL 5, but we still have parts of the business running on the EL 4 server.
Top Expert 2009

Commented:
Mark, I was not aware of any problems until now, but I think it is purely a kernel or ext3 issue, nothing to do with EMC. I am glad to hear it is an EL 4 system that this happened on; it gives me renewed confidence in EL 5. I am running 5.0 and 5.3 in production and have not had the same issue with resizing filesystems as I did in EL 4. But it was easy to test and reproduce in my case, where in your case it probably is not.

I am hoping that btrfs (Oracle's filesystem initiative for Linux) turns out to be what I expect it to be.


Mark Geerlings, Database Administrator

Author

Commented:
I really appreciated the prompt explanation from fosiul0 and the prompt suggestion to run "fsck" from omarfarid.  And I appreciated the confirmation from mrjoltcola that the other suggestions were on the right track.

I built a new filesystem on this partition, then made a mistake that caused our Oracle database to crash.  After I mounted the newly-created file system, I copied back in the Oracle controlfile (which was now out-of-sync) that had been on this partition.  That crashed the database instance.  I should have copied in one of the other copies of the controlfile, then renamed it to what the database expected to find on this partition.
