demienx
asked on
Negative value for available disk space
Hello,
One of our FreeBSD 4.10 machines is reporting a negative value for available disk space on /.
I've been told that this may be caused by some scripts not properly closing open file handles.
Is there any way I can monitor the files currently open for writing, or a better way to find the cause of this weird report?
Thanks in advance!
Note that FreeBSD keeps a certain amount of space reserved for the root user (I believe it's 5% or 10%), so you might see a filesystem/partition go above 100% utilization... that means you're really, absolutely, completely full on that partition/fs. Do a df -h and see which filesystems are full. You may want to free up some space if you can, and consider doing a "make distclean" in /usr/ports if you haven't in a while.
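If you want to confirm what the reserve actually is on a given filesystem, it is stored in the UFS superblock as "minfree"; something like this should show it (the device name here is just a placeholder, substitute your real root partition):
dumpfs /dev/ad0s1a | grep minfree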
ASKER
Actually the / partition was 102% usage, but after reboot and fsck'ed, it turned into 12% usage again...
That's what is quite weird
ASKER CERTIFIED SOLUTION
That is not weird - you deleted an open file, and a process kept it open and continued to write to it.
lsof -n | grep VCHR | grep -v /
will show offender process.
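If lsof isn't installed (it's a port, not part of the base system), fstat from the base system can show much the same thing. A rough equivalent, assuming the offending file lives on the root filesystem and that the size (SZ|DV) is the 8th column of fstat's output:
fstat -f / | sort -rn -k8 | head
That lists the largest files held open on / (fstat reports inode numbers rather than paths); a huge SZ value belonging to a long-running daemon is the usual suspect.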
ASKER
Sorry for the late reply.
I've been waiting for the situation to reproduce. It's happening again now, and I tried to see whether there was a deleted open file as gheist suggested, but it seems that's not the cause.
Once it starts, in about 2 hours it fills the 73G drive from its normal usage of 12%.
Any ideas?
ASKER
Another interesting thing is that running du or durep on / with the -x option reports 7.1 G of usage.
I just wonder how df determines the disk usage, and how du or durep do it.
Maybe there's some port that can determine and list the usage the same way df does?
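For what it's worth: df reads the used/free block counters kept in the filesystem's superblock, while du (and durep) walk the directory tree and add up the files they can actually see, so space held by unlinked-but-still-open files (or hidden underneath a mount point) shows up in df but not in du. A quick, rough way to compare the two views and see where the visible space is going (the sort/head pipeline is just one way to slice it):
df -k /
du -xk / | sort -rn | head
If df reports far more used space than du -x can account for, the difference is almost certainly held by open-but-deleted files or buried under a mount point.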
What does mount -v tell you about the mount point in question?
We need to know more about what your partition structure looks like and which partition is filling. In the initial question you said it was your root partition filling, but then in a later comment you said an entire 73G drive is filling.
What are your partitions, how big are they, what filesystems do they use, and which ones are filling?
In PARTICULAR, if it's your / partition that's filling up - do you have a separate partition for /tmp?
ASKER
# mount -v
/dev/amrd0s1a on / (ufs, local, soft-updates, writes: sync 15794 async 2149728, reads: sync 53777 async 13404)
/dev/amrd1s1a on /www (ufs, local, soft-updates, writes: sync 63 async 160024, reads: sync 2018448 async 2185387)
procfs on /proc (procfs, local)
# df
Filesystem 1K-blocks Used Avail Capacity Mounted on
/dev/amrd0s1a 70469488 65424488 -592556 101% /
/dev/amrd1s1a 567892762 278256984 244204358 53% /www
procfs 4 4 0 100% /proc
# df -i
Filesystem 1K-blocks Used Avail Capacity iused ifree %iused Mounted on
/dev/amrd0s1a 70469488 65425472 -593540 101% 225884 4195234 5% /
/dev/amrd1s1a 567892762 278258144 244203198 53% 1429259 140886515 1% /www
procfs 4 4 0 100% 247 3869 6% /proc
The drive that gets filled is the one mounted on /; the web server files are on a RAID 0 of 2x300GB SCSI drives.
ASKER
I was looking at the Apache server error log and noticed quite a number of lines like:
[Fri Jul 22 07:12:19 2005] [notice] child pid 73132 exit signal Segmentation fault (11)
I don't know if this may be the cause of the drive being completely full right now, but I remember having such error messages on another box some time ago, and in the end the culprit was the RAM chips...
Any thoughts?
Have you tried booting to single-user mode and fsck'ing the root filesystem?
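Roughly, and this is just a sketch (adjust to your console/loader setup): reboot, type "boot -s" at the loader prompt to come up single-user, and then, with / still mounted read-only:
fsck -y /
mount -u /
The mount -u remounts / read-write afterwards, in case you want to poke around before going back to multi-user.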
ASKER
Hello,
Thanks to all for the valuable tips. I reckon the partitioning scheme for that box was not the best, as demonstrated by the problems this filling issue has caused.
I have temporarily disabled the cronolog piped log rotation for all sites and left only the error log running, to determine whether that was the cause; after some days, it appears that this log rotation was indeed the cause.
I wonder why it was causing issues on this server, as the same log rotation method is used on the rest of the servers without any problems (and it worked correctly for months before it started to behave like this).
I also don't know why the command gheist suggested, "lsof -n | grep VCHR | grep -v /", was not showing any files at all during the issue... Any ideas?
Thanks All,
DemienX.
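For reference, the piped rotation being discussed is the usual cronolog setup in httpd.conf, something along these lines (the cronolog path and log location are placeholders):
CustomLog "|/usr/local/sbin/cronolog /var/log/httpd/%Y%m%d-access.log" combined
Apache keeps one long-running cronolog child per such pipe, and if those logs live somewhere under / (e.g. /var/log) rather than on /www, runaway logging lands on the root filesystem, which fits the pattern of / filling while /www stays flat.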
Because your disk is fscking full ..... with data.
The problem I mentioned happens when multiple admins do not coordinate what they do or unskilled users delete their open files.
Have you tried doing du -hxd 1 / when you have the problem? That would tell you for sure where the majority of the dreck was congealing.
Note: it is entirely possible to migrate this install to a new and better-partitioned drive without losing anything. All you have to do is set up proper partitions, give the drive a normal MBR, and then cp -Rpv from the directories on the old drive to each of the partitions on the new one, and presto, you're good to go.
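On FreeBSD 4.x that would look roughly like the following. This is only a sketch: ad1 is a placeholder for the new disk, and you would normally edit the label to carve out the separate /, /var, /tmp and /usr partitions you want before creating the filesystems:
fdisk -BI ad1                    # standard MBR with a single FreeBSD slice
disklabel -B -w -r ad1s1 auto    # write a default label plus boot blocks
disklabel -e ad1s1               # edit the label to add the partitions you want
newfs /dev/ad1s1a                # newfs each partition you created
mount /dev/ad1s1a /mnt
cp -Rpv /etc /usr /var /mnt/     # copy each tree across (repeat per target partition)
dump/restore is the more faithful way to copy if you need to preserve hard links and file flags, but cp -Rpv works for a simple layout.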
I haven't used it myself, but to quote their description:
"Lsof is a Unix-specific diagnostic tool. Its name stands for LiSt Open Files, and it does just that. It lists information about any files that are open by processes currently running on the system. It can also list communications open by each process."
Sounds like what you want.