Solved

How to handle "Disk Full" errors cleanly

Posted on 1997-07-14
4
1,004 Views
Last Modified: 2013-12-21
I am writing an application to run on a number of different
flavours of UNIX (Solaris, SunOS, HPUX, DEC Ultrix & OSF/1,
IRIX) which writes to disk. I would like to be able to trap
disk full errors, prompt the user to make more space and
then resume operation as if nothing had happened.

I am envisaging a solution at the fwrite/fclose/fputc level
to work with both buffered and unbuffered I/O, over NFS.
Although far from trivial, it is the mechanics of the disk
writing that I am interested in and not the prompting. Any
solution which also works with NT would be a real bonus.
0
Comment
Question by:braveheart
  • 2
4 Comments
 

Expert Comment

by:AndrewW
ID: 2006531
It's been a while since I've done any C programming, but this
might work

meta code. . .

if(!fwrite)
{
      string = strerror(errno);
      if(strcmp(string, "No space left on device"))
      {
            print warning
      }
}

I don't know what the string that perror will return exactly
but under IRIX the err number for ENOSPC is 28.  You can
also check the value of errno.  Pull a manpage on
strerror and errno for more details, that should get you started.


basically errno is set anytime a syscall fails, such as writing
to disk.  you can look in /usr/include/sys/errno.h to see what
kind of errors are set.
0
 
LVL 3

Author Comment

by:braveheart
ID: 2006532
The problem is rather more difficult than that. Unfortunately the method usggested by AndrewW is far from foolproof.

For a start it would obviously be easier to test errno itself than to convert it to a string and then compare the string.

Worse still, errno is defined by ANSI to only include EDOM and ERANGE - the rest are POSIX definitions. However, POSIX seems to define ENOSPC as used by link, mkdir, mkfifo, open, rename and write but not fwrite. I have myself observed that under certain circumstances errno is set and under others it is not. The precise behaviour changes from machine to machine.

The only sure way appears to be checking the number of bytes written by the call but that only tells you that there was a failure and not why it failed.

Then there is the problem of writes being buffered up until the close, but it is not possible to recover from a close so we must flush first. Then there are the effects of NFS buffering to take into account...

The code we have developed is already quite complex and, so far, only partially works on HPUX. Ideally I would like someone to either post some tried and tested code or point me to some.
0
 
LVL 1

Accepted Solution

by:
wex earned 200 total points
ID: 2006533
I think you're hosed.  There's no way to tell a priori that in fact the problem is disk full.  In particular, the OS can't distinguish that condition from (for example) a situation in which the NFS-mounted partition onto which you're writing went away.  The OS also can't always tell if the disk is truly full -- there are a huge number of ways to implement so-called "soft" partitions which report full at some level(s) but in fact allow writes to (temporarily) overflow into other partitions.

I believe that you're correct in saying that the only way to know if a write went through is to compare the size of what was written vs what you think should've been written, but "graceful" failure in all these cases is impossible.
0
 
LVL 3

Author Comment

by:braveheart
ID: 2006534
Come on guys. I am increasing the value to try and get an answer
(if I can work out how).

Perhaps an alternative approach such as checking the amount of
available disk space would be more appropriate. This could be
performed occasionally when there is plenty of space but, as the
disk fills up, could be performed more and more frequently until
a check is performed for every write.

Does anyone have any portable code that can check for available
disk space?
0

Featured Post

Live: Real-Time Solutions, Start Here

Receive instant 1:1 support from technology experts, using our real-time conversation and whiteboard interface. Your first 5 minutes are always free.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Java performance on Solaris - Managing CPUs There are various resource controls in operating system which directly/indirectly influence the performance of application. one of the most important resource controls is "CPU".   In a multithreaded…
FreeBSD on EC2 FreeBSD (https://www.freebsd.org) is a robust Unix-like operating system that has been around for many years. FreeBSD is available on Amazon EC2 through Amazon Machine Images (AMIs) provided by FreeBSD developer and security office…
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
In a previous video, we went over how to export a DynamoDB table into Amazon S3.  In this video, we show how to load the export from S3 into a DynamoDB table.

785 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question