bkreynolds48

root file system full

I have a Sun T2000 server. The root file system filled up and
I can't find what file filled it up.  I have tried du.
Is there a way to find large files?
rockiroads

Have you cleared files in /tmp, /usr/tmp and any other temp folders you know of?
bkreynolds48

ASKER

Is it OK to remove all the files in the /tmp dir?
some samples here to help find large files http://www.computing.net/answers/unix/root-filesystem-is-95-full/6539.html


ls -F . | grep -v '|' | grep -v "=" | grep -v "@" | while read name; do du -sk $name; done | sort -nr | head -20

df -k | awk '{print $7}' | sed 's?/? ?g' | sort -k 1 | grep -v 'Mounted' | awk '{print "/"$1"/d"}' | sort -u | tail +2 > fs.sed
ls -F . | sed -f fs.sed | sed '/!/d;/=/d;/@/d' | while read name; do du -sk $name; done | sort -nr | head
rm fs.sed

find /usr -type f -print | xargs ls -l | sort -r -n -k 5,5 | head -20


In the find command, to search from root you can use find / instead of find /usr.
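On this server a plain find / would also walk every other mounted filesystem, so the biggest files it reports may not be on the full root slice at all. A minimal sketch of the same idea kept to one filesystem follows. The function name largest_files and its defaults are illustrative and not from the thread, and it assumes a find that accepts -xdev and -exec ... {} + (check find(1) on your release).

```shell
# List the N largest regular files under a directory without crossing
# into other mounted filesystems. Sizes are in kilobytes.
largest_files() {
    dir=${1:-/}
    count=${2:-20}
    # -xdev keeps find on the filesystem that holds "$dir";
    # du -k prints "size<TAB>path" for each file it is handed.
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null |
        sort -rn |
        head -n "$count"
}

# Example: the 20 largest files that live on the root filesystem itself
#   largest_files / 20
```

Because du reports allocated blocks rather than the length shown by ls -l, sparse files rank by the space they really occupy.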
Should be okay. As a safety precaution you could just delete files older than a certain date if you want.

something like this I think

find /tmp -mtime +3 -print

That's older than 3 days. I think that syntax is right. I don't have Unix available anymore.
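A hedged sketch of the "list first, delete later" approach suggested above. The helper name old_files is made up for illustration. Note that -mtime +3 matches files whose age, counted in whole 24-hour periods, is greater than 3, so in practice files at least four days old.

```shell
# Print regular files under a directory that were last modified more
# than N days ago. Nothing is deleted by this function.
old_files() {
    dir=${1:-/tmp}
    days=${2:-3}
    # -xdev: stay on the filesystem holding "$dir"
    find "$dir" -xdev -type f -mtime +"$days" -print
}

# Review the list first:
#   old_files /tmp 3
# Only if the list looks safe, remove the same set of files:
#   find /tmp -xdev -type f -mtime +3 -exec rm -f {} +
```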
Do you have any background jobs (cron) set up that produce logging? Perhaps they need to be cleaned up.
There was an rsync job that I think is the problem. I unmounted the file system where the job was supposed to go because I am removing that mount point. I forgot about the rsync job, so I am not sure where it tried to put the data.
I have deleted a bunch of files with no disk space returned.
A lot of the tmp files might have been too small to make a difference.

Did you try to find the largest files? find is probably better in this case.
I should only be looking at files on the root "/" partition because that is full, right?
Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d10         18G    18G    87M   100%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                    15G   1.4M    15G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/platform/SUNW,Sun-Fire-T200/lib/libc_psr/libc_psr_hwcap1.so.1
                        18G    18G    87M   100%    /platform/sun4v/lib/libc_psr.so.1
/platform/SUNW,Sun-Fire-T200/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
                        18G    18G    87M   100%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/md/dsk/d40         18G   7.4G    11G    41%    /var
swap                    15G   312K    15G     1%    /tmp
swap                    15G    56K    15G     1%    /var/run

Yes. It's not very big, huh. Do an ls -l on root and see what is large.
Don't remove anything yet, as you may end up breaking boot.
If you run   du -sh /*
it should list the big dirs.
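One caveat with du -sh /* on a box like this: du descends into every mounted filesystem, so the totals for directories that are mount points describe other disks, not the root slice. A rough sketch that asks du to stay on one filesystem is below. The name biggest_dirs is illustrative. It assumes a du that takes -x for "do not cross filesystems" (GNU du does); the Solaris du(1) man page documents -d for the same purpose, so the flag is left as a parameter.

```shell
# Show the largest directories under "$1", counting only space on the
# filesystem that holds "$1". Sizes are cumulative, in kilobytes.
biggest_dirs() {
    dir=${1:-/}
    count=${2:-20}
    onefs=${3:--x}    # assumption: pass -d here where du lacks -x
    du "$onefs" -k "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example:
#   biggest_dirs / 20        # GNU-style du
#   biggest_dirs / 20 -d     # du that uses -d instead
```

The first line of output is always the starting directory itself, since its total includes everything beneath it.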
ls -l
total 1080
drwxr-xr-x   2 root     root         512 Aug 13  2008 a
drwxr-xr-x   7 oracle   dba          512 Apr 14 17:23 backup
lrwxrwxrwx   1 root     root           9 Jul 14  2008 bin -> ./usr/bin
drwxr-xr-x   3 root     sys          512 Feb 22  2010 boot
drwxr-xr-x   2 root     root         512 Jul 21  2008 cdrom
drwxr-xr-x  18 root     sys         5120 Jul 22 19:59 dev
drwxr-xr-x   2 root     sys          512 Jul 22 19:56 devices
drwxr-xr-x  82 root     sys         4608 Sep 10 13:15 etc
drwxr-xr-x   3 root     sys          512 Jul 14  2008 export
-rw-r--r--   1 root     root           6 Sep 18 16:20 fs.sed
dr-xr-xr-x   1 root     root           1 Jul 22 19:57 home
drwxr-xr-x  15 root     sys          512 Jul 16  2008 kernel
drwxr-xr-x   7 root     bin         5632 Feb 22  2010 lib
drwxr-xr-x   3 root     root        1536 Sep  8 17:25 log
drwx------   2 root     root        8192 Jul 14  2008 lost+found
drwxr-xr-x   2 root     sys          512 Jul 14  2008 mnt
dr-xr-xr-x   1 root     root           1 Jul 22 19:57 net
drwxr-xr-x  17 root     root         512 Sep 18 14:13 opt
drwxr-xr-x   4 oracle   dba          512 Jul 24  2008 oradata1
drwxr-xr-x   4 oracle   dba          512 Jul 28  2008 oradata2
drwxr-xr-x   4 oracle   dba          512 Jul 28  2008 oradata3
drwxr-xr-x   4 oracle   dba          512 Jul 28  2008 oradata4
drwxr-xr-x   4 oracle   dba          512 Jul 28  2008 oradata5
drwxr-xr-x   4 oracle   dba          512 Jan 26  2009 oradata6
drwxr-xr-x   4 oracle   dba          512 Apr 12 14:56 oradata7
drwxr-xr-x   4 oracle   dba          512 Jul 28  2008 oradata8
drwxr-xr-x   6 root     sys         1024 Feb 22  2010 platform
dr-xr-xr-x  54 root     root      480032 Sep 18 16:30 proc
drwxr-xr-x   7 oracle   100         4096 Sep 18 13:42 prodbackup
drwxr-xr-x   2 root     sys         1024 Feb 22  2010 sbin
drwxrwxrwx   3 root     root        4096 May 28  2007 smo
drwxr-xr-x   4 root     root         512 Jul 14  2008 system
drwxrwxrwt   8 root     sys         1340 Sep 18 16:30 tmp
drwxr-xr-x  42 root     sys         1024 Feb 22  2010 usr
drwxr-xr-x  45 root     sys         1024 Jul 23 13:02 var
-rw-r--r--   1 root     root        1461 Aug 13  2008 vfstab
drwxr-xr-x   2 root     root         512 Jul 14  2008 vol

The ones bolded are mounted from the array not the server
What's in /proc?
total 1046
dr-x--x--x   5 root     root         832 Jul 22 19:56 0
dr-x--x--x   5 root     root         832 Jul 22 19:56 1
dr-x--x--x   5 root     root         832 Jul 22 19:56 2
dr-x--x--x   5 root     root         832 Jul 22 19:56 3
dr-x--x--x   5 root     root         832 Jul 22 19:56 7
dr-x--x--x   5 root     root         832 Jul 22 19:56 9
dr-x--x--x   5 root     root         832 Jul 22 19:57 162
dr-x--x--x   5 root     root         832 Jul 22 19:57 177
dr-x--x--x   5 daemon   daemon       832 Jul 22 19:57 178
dr-x--x--x   5 root     root         832 Jul 22 19:57 179
dr-x--x--x   5 root     root         832 Jul 22 19:57 232
dr-x--x--x   5 root     root         832 Jul 22 19:57 408
dr-x--x--x   5 daemon   daemon       832 Jul 22 19:57 515
dr-x--x--x   5 nagios   nagios       832 Jul 22 19:57 528
dr-x--x--x   5 daemon   daemon       832 Jul 22 19:57 631
dr-x--x--x   5 daemon   daemon       832 Jul 22 19:57 705
dr-x--x--x   5 root     root         832 Jul 22 19:57 717
dr-x--x--x   5 root     root         832 Jul 22 19:57 719
dr-x--x--x   5 root     root         832 Jul 22 19:57 723
dr-x--x--x   5 root     root         832 Jul 22 19:57 724
dr-x--x--x   5 root     root         832 Jul 22 19:57 823
dr-x--x--x   5 root     root         832 Jul 22 19:57 967
dr-x--x--x   5 root     root         832 Jul 22 19:57 968
dr-x--x--x   5 root     root         832 Jul 22 19:57 1040
dr-x--x--x   5 root     root         832 Jul 22 19:57 1106
dr-x--x--x   5 root     other        832 Jul 22 19:57 1129
dr-x--x--x   5 root     root         832 Jul 22 19:57 1132
dr-x--x--x   5 root     root         832 Jul 22 19:57 1241
dr-x--x--x   5 root     root         832 Jul 22 19:57 1294
dr-x--x--x   5 root     root         832 Jul 22 19:57 1316
dr-x--x--x   5 root     root         832 Jul 22 19:57 1317
dr-x--x--x   5 root     root         832 Jul 22 19:57 1320
dr-x--x--x   5 smmsp    smmsp        832 Jul 22 19:57 1322
dr-x--x--x   5 root     root         832 Jul 22 19:57 1324
dr-x--x--x   5 root     root         832 Jul 22 19:57 1332
dr-x--x--x   5 lp       lp           832 Jul 22 19:57 1333
dr-x--x--x   5 root     root         832 Jul 22 19:57 1373
dr-x--x--x   5 root     root         832 Jul 22 19:57 1375
dr-x--x--x   5 root     root         832 Jul 22 19:58 2903
dr-x--x--x   5 root     root         832 Jul 22 19:58 2912
dr-x--x--x   5 root     sys          832 Jul 22 19:58 2986
dr-x--x--x   5 root     root         832 Jul 22 19:59 3096
dr-x--x--x   5 root     root         832 Jul 23 13:05 6689
dr-x--x--x   5 root     root         832 Jul 23 15:37 9015
dr-x--x--x   5 root     root         832 Sep 18 11:00 22328
dr-x--x--x   5 root     root         832 Sep 18 11:00 22330
dr-x--x--x   5 root     root         832 Sep 18 13:51 25483
dr-x--x--x   5 root     root         832 Sep 18 15:34 27084
dr-x--x--x   5 root     root         832 Sep 18 15:38 27158
dr-x--x--x   5 root     root         832 Sep 18 15:38 27159
dr-x--x--x   5 root     root         832 Sep 18 15:38 27172
drwxr-xr-x  47 root     root        1536 Sep 18 16:32 ..
dr-x--x--x   5 root     root         832 Sep 18 16:35 2630
dr-xr-xr-x  54 root     root      480032 Sep 18 16:35 .
Look at /proc; its size is 480032.
What's in there?
Are your Oracle filesystems off root? They could be a problem.
I believe if they are, and the tablespaces are set to auto extend, it could mean trouble.
Oracle has its own mount point on different disks, and auto extend is never turned on.
Look at /proc; its size is 480032.
What's in there?
That is gone now
Do you think the system needs to be rebooted?  I can do that later, or in the morning.
I don't think proc is your problem.

du -sh <dir>

on anything not on its own mount point.
du -sh was done earlier and the results are shown, but I can't see anything plainly obvious bar /proc.
I am wondering if the rsync filled up the root slice and then failed, and whatever was there was removed. Would the disk space not be returned because it went over the allotted space, thus needing the reboot I asked about?
Depends on the failure. Whatever it was, it could have left remnants of something floating around.
Where is the DU? I see the ls -l but not the du?

It depends. I don't remember off the top of my head, but some programs leave partial files when transferring.
I did a du -sh on all root-mounted directories and found none over a few MB.
Well, it is not proc; that is not really a physical fs.

I think you should continue with looking at individual directories that don't have their own mount point. Originally you said " I unmount the file system where the job was supposed to go because I am removing that mount point"

If you think it is the rsync job then what was this path? Check there.
The path for the rsync job would have been /backup not /
If /backup was not there, would rsync have tried to write to /?
I believe it will create backup. Can you post the rsync job?


http://www.ibm.com/developerworks/aix/library/au-filesync/index.html?ca=dgr-lnxw07UNIX-File-Sync&S_TACT=105AGX59&S_CMP=grsitelnxw07

Listing 2. Copying multiple directories to a backup directory
$ mkdir backup
$ rsync dira backup
$ rsync dirb backup

Listing 2 creates a directory, backup/dira, containing a copy of the original dira. It also creates a directory, backup/dirb, containing a copy of the original dirb. The following does something different: $ rsync dira backup/dira. The first time you use it, the script will do what you expect. But the second time you use the option, rsync will create the destination directory within the specified destination directory, creating the directory backup/dira/dira. Not only does this not create the structure you want, it also doubles up the contents (one of which will never be synchronized).
From what I gather, rsync does not create the target directory; you would have to create it,
assuming this is the same rsync: http://sial.org/howto/rsync/

# Create the target backup directory on the server.

rsync will not create the target directory ($HOME/backup/client) on the server; the target directory must be manually created.

So I guess that could be an issue. I'm not sure what would happen if that is the case.
I guess it might depend on rsync version and options.


If it doesn't create the backup dir then the rsync would probably fail and wouldn't be filling up the fs.
Check for open file handles using lsof if you can. There may be a process writing to a file that it has yet to close, which is causing your issues.
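For anyone who has not used lsof before, here is a small sketch of the check being suggested. It assumes lsof is installed, which is not a given on a stock Solaris system. Per the lsof man page, +L1 selects open files whose link count is below 1 (that is, deleted but still held open), and the +aL1 form followed by a filesystem limits that to one filesystem. The wrapper name deleted_open_files is made up.

```shell
# List files on a filesystem that have been deleted but are still held
# open by some process, so their space has not been released yet.
deleted_open_files() {
    fs=${1:-/}
    if ! command -v lsof >/dev/null 2>&1; then
        echo "lsof not found; install it or inspect processes another way" >&2
        return 127
    fi
    # +L1: link count below 1; the "a" ANDs that with the filesystem argument
    lsof +aL1 "$fs"
}

# Example (run as root so every process is visible):
#   deleted_open_files /
```

If this prints anything, the COMMAND and PID columns name the process to stop or restart in order to get the space back.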

When you posted the output from df, did you remove all the mount points from the array?  If not, then that's your problem, as your local filesystem would have filled.

If that's not the case, your most likely candidates for filling up are the /var and /home directories.
@tintin

I was wondering the same thing, because the Oracle mount points are not listed and the OP stated that Oracle did have its own mount points.
/var has its own mount point
/home has nothing in it
/export/home has its own mount point
=======================================
These file systems are on the / partition
=======================================

Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d10         18G    18G    87M   100%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                    13G   1.4M    13G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/platform/SUNW,Sun-Fire-T200/lib/libc_psr/libc_psr_hwcap1.so.1
                        18G    18G    87M   100%    /platform/sun4v/lib/libc_psr.so.1
/platform/SUNW,Sun-Fire-T200/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
                        18G    18G    87M   100%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/md/dsk/d40         18G   4.7G    13G    27%    /var
swap                    13G   312K    13G     1%    /tmp
swap                    13G    56K    13G     1%    /var/run
===============================================================
I don't know how to use lsof  
Here is the complete df -h
===============================
Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d10         18G    18G    87M   100%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                    13G   1.4M    13G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/platform/SUNW,Sun-Fire-T200/lib/libc_psr/libc_psr_hwcap1.so.1
                        18G    18G    87M   100%    /platform/sun4v/lib/libc_psr.so.1
/platform/SUNW,Sun-Fire-T200/lib/sparcv9/libc_psr/libc_psr_hwcap1.so.1
                        18G    18G    87M   100%    /platform/sun4v/lib/sparcv9/libc_psr.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/md/dsk/d40         18G   4.7G    13G    27%    /var
swap                    13G   312K    13G     1%    /tmp
swap                    13G    56K    13G     1%    /var/run
/dev/dsk/c0t2d0s0       34G    25G   8.7G    75%    /oradata1
/dev/dsk/c0t3d0s0       34G    23G   9.8G    71%    /oradata5
/dev/dsk/c0t2d0s1       34G    19G    15G    57%    /oradata2
/dev/dsk/c0t3d0s1       34G    23G   9.8G    71%    /oradata6
/dev/dsk/c0t3d0s3       34G    14G    20G    42%    /oradata7
/dev/dsk/c0t3d0s4       34G    21G    13G    62%    /oradata8
/dev/dsk/c0t2d0s3       34G    14G    19G    43%    /oradata3
/dev/dsk/c0t2d0s4       34G    23G   9.8G    71%    /oradata4
/dev/md/dsk/d30         21G   8.4G    12G    41%    /export/home
netapp:/vol/snap_backup/qt_smo
                        40G   1.3G    39G     4%    /smo
netapp:/vol/prodbackup
                       250G   133G   117G    54%    /prodbackup
/dev/dsk/c2t5d0s6      295G    27G   265G    10%    /backup
 
ASKER CERTIFIED SOLUTION
jgiordano
That's it - I umounted /backup, then did an ls on /backup, and sure enough all the rsync stuff is there.
Thanks
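Since the culprit was rsync writing into the bare /backup directory while nothing was mounted there, a guard in the cron job could help avoid a repeat. This is only a sketch: is_mount_point is a made-up helper, the rsync source path is a placeholder because the real job was never posted, and it assumes a df that accepts the POSIX -P flag (on Solaris that may mean /usr/xpg4/bin/df).

```shell
# Succeed only if "$1" is itself the mount point of a filesystem.
# Pass the path exactly as it appears in the mount table (no trailing slash).
is_mount_point() {
    dir=$1
    # df -P prints one line per filesystem; the last field is the mount
    # point of the filesystem that contains "$dir".
    mounted_on=$(df -P "$dir" 2>/dev/null | awk 'NR == 2 { print $NF }')
    [ "$mounted_on" = "$dir" ]
}

# Hypothetical cron wrapper: skip the copy when /backup is not mounted.
#   if is_mount_point /backup; then
#       rsync -a /some/source/ /backup/
#   else
#       echo "/backup is not mounted; skipping rsync" >&2
#   fi
```

With a check like this, an unmounted target makes the job fail loudly instead of quietly filling the root slice.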
/tmp is swap-backed (tmpfs), so removing stuff from there obviously will not help. /var is its own (not full) filesystem, so removing stuff from there will also not help.

That leaves files on the root file system.  The du command will show you what files are filling up your system. If it does not, the problem could be invisible files.

There are two ways that a file can be invisible. One is for a process to open a file and delete it while it is open. After
it is deleted, the space will remain in use until the process closes the file or exits, at which time all of the space used
by the file is freed up. If you know what process has such a file open, then killing the process will free up the space.
Otherwise rebooting the system will kill all the processes and free up the space.

The second way a file can be invisible is for it to be linked in a directory, but the directory itself can be a mount point
for another file system. Some versions of the OS will refuse to mount a file system on a non-empty directory,
but this is not fool-proof; it can still happen.  

You said that you had a bunch of filesystems that were not mounted at one point. In particular, you mentioned
one that rsync wrote to. These would be the most likely places to look for these hidden files. For instance,
if rsync created a large file under a mountpoint while the file system was unmounted, and then the filesystem was
mounted, then the large file would not be visible.

So, the thing to do is unmount those file systems and look in the mount point directories. They should all be empty.
While the filesystems are unmounted, use du to try to isolate where the space is going. One place to check is
/opt.
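The check described above could be scripted roughly like this. The function name leftover_under_mountpoint is illustrative, and it is only meaningful while the filesystem that normally sits on that directory is unmounted.

```shell
# Report anything stored in a directory that should be an empty mount
# point. Returns 1 if leftovers are found, 0 if the directory is empty.
leftover_under_mountpoint() {
    mp=$1
    # ls -A lists every entry, dotfiles included, except . and ..
    entries=$(ls -A "$mp" 2>/dev/null | wc -l | tr -d ' ')
    if [ "$entries" -gt 0 ]; then
        echo "$mp holds $entries entries while unmounted; space used (KB):"
        du -sk "$mp"
        return 1
    fi
    echo "$mp is empty"
}

# Example, run as root while nothing is using the filesystem:
#   umount /backup
#   leftover_under_mountpoint /backup
#   mount /backup     # assumes /backup has an entry in the mount table
```

Anything it reports is occupying the parent filesystem (here the root slice) and can be moved or removed before remounting.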

You can ignore /proc. First, as can be seen in the df output, it is not part of the root file system. Second, the
space reported is not real and is simply the result of the address space of all the current processes appearing
as pseudo-files.