I'm writing a program, some kind of database. While reading the manual for fclose(3) I found that it calls fflush(3) to flush the FILE* buffers to disk (actually to the OS buffer, but that doesn't matter right now; we can always call fsync(2)).

Because I'm writing a DB, it is obvious that I want to prevent data loss. If there is no disk space and the fflush(3) inside fclose(3) fails, we will lose our data, because

using FILE* after an error in fclose() will cause undefined behavior

So I thought about calling fflush(3) explicitly before fclose(3), warning the user about low disk space, and retrying fflush(3) after a while.

I read the C standard and thought this was a good idea. In practice, though, after a failed fflush the second call always returns 0 (no error) but actually does nothing. fsync didn't help me either (I thought the data might still be saved in RAM).

How can I prevent data loss in such a situation? Maybe there are some rules of thumb.

Here is my test code:

#include <stdio.h>
int main()
{
    FILE *a = fopen("/tmp/1", "wb");
    if ( !a ) {
        perror("fopen");
        return 1;
    }

    if ( fwrite("test", 1, 4, a) != 4 )
        perror("fwrite");  // always OK, cause data is buffered


    while( fflush(a) )  // ...second call will always return 0!
    {
        perror("fflush");  // if there is no disk space, I will get this perror, but ...
    }


    if ( fclose(a) )  // always OK, because it only calls close(2)
        perror("fclose"); 

    return 0;
}
A: 

You could fseek(3) to the end of the file (assuming you'd know the length) before doing anything. That way you'd eliminate the possibility of failure due to insufficient disk space.

Steve Emmerson
Many filesystems allow sparse files so seeking does not actually allocate storage.
R Samuel Klatchko
lseek() doesn't actually allocate any space. You would instead need to use fallocate() on file systems that support it (see the sketch after this thread).
pixelbeat
fallocate() is REALLY slow on Cygwin, and so is Microsoft's _chsize_s. I'm preallocating disk space by writing 8M blocks with fwrite; it's 3 times faster.
f0b0s
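
Following the fallocate() suggestion above, here is a minimal sketch using posix_fallocate(3), the portable variant; the file name and size are made up, and error handling is kept to the bare minimum:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/1", O_WRONLY | O_CREAT, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    /* Ask the kernel to actually reserve 1 MiB of disk blocks.
       Unlike lseek(), this fails right away (with ENOSPC) if the
       space is not available. */
    int err = posix_fallocate(fd, 0, 1024 * 1024);
    if (err != 0)
        fprintf(stderr, "posix_fallocate: %s\n", strerror(err));

    close(fd);
    return 0;
}
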
+1  A: 

fflush will only flush the C library's internal buffers to the OS, so an fflush alone won't guarantee that there won't be data loss.

Calling fflush repeatedly (without intermediate fwrites) won't help, as you have already flushed the data to the OS once. The second fflush call will return SUCCESS, as there is nothing left to flush to the OS. If fflush fails because the hard disk is full, you have already lost some data.

To flush the data to the disk, you need to use fsync.
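
As a rough sketch, flushing the stdio buffer and then syncing the underlying descriptor would look like this; note that both calls can still fail on a full disk:

#include <stdio.h>
#include <unistd.h>   /* fsync, fileno */

/* Flush stdio buffers to the kernel, then ask the kernel to
   push its buffers to the disk. Returns 0 on success. */
int flush_to_disk(FILE *fp)
{
    if (fflush(fp) != 0) {        /* user-space buffer -> kernel */
        perror("fflush");
        return -1;
    }
    if (fsync(fileno(fp)) != 0) { /* kernel buffers -> disk */
        perror("fsync");
        return -1;
    }
    return 0;
}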

If the hard disk is full, you are out of luck. The only way to prevent data loss is then to keep your process alive (and the data in memory, either in user-space or kernel file buffers) until space becomes available on the disk to fsync to. If the power goes out in the meantime, you will lose the data.

In short, there is no way you can guarantee no data loss if your hard disk is full.

Moron
I made a test: I fflushed to a full disk, then freed some space and called fsync(); the test failed and I lost the data.
f0b0s
I meant there is no way you can guarantee _no data loss_ if your hard disk is full. fsync will only work if there is space to flush to. It is like trying to add more water to an already full glass. If fflush fails, you have lost data right there, so whether fsync fails or not is a different issue.
Moron
+2  A: 

You could preallocate some reasonable amount of disk space. Write, flush, and fsync some binary zeros (or whatever) and then seek back to where you were. Rinse and repeat when necessary. And remember to truncate if necessary.

A bit of a pain but it should work.
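
A minimal sketch of that approach might look like the following; the chunk size and total amount are arbitrary, and error handling is abbreviated:

#include <stdio.h>
#include <string.h>
#include <unistd.h>   /* fsync, fileno */

/* Reserve `bytes` of real disk space starting at the current
   position of `fp` by writing zeros, then seek back to where
   we started. Returns 0 on success, -1 on failure. */
int preallocate(FILE *fp, long bytes)
{
    char zeros[8192];
    memset(zeros, 0, sizeof zeros);

    long pos = ftell(fp);               /* remember where we were */
    if (pos == -1L)
        return -1;

    long written = 0;
    while (written < bytes) {
        size_t chunk = sizeof zeros;
        if (bytes - written < (long)chunk)
            chunk = (size_t)(bytes - written);
        if (fwrite(zeros, 1, chunk, fp) != chunk)
            return -1;                  /* out of space, most likely */
        written += chunk;
    }

    if (fflush(fp) != 0 || fsync(fileno(fp)) != 0)
        return -1;                      /* make sure the blocks really exist */

    return fseek(fp, pos, SEEK_SET);    /* go back to where we were */
}

As the answer notes, you would also want to truncate the file back to its real length once you are done.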

Duck
+3  A: 

The reason the subsequent fflush() operations succeed is that there is no (new) data to write to disk. The first fflush() failed; that is tragic but history. The subsequent fflush() has nothing to do, so it succeeds.

If you are writing to a database, you have to be careful about each write - not just dealing with problems at the end. Depending on how critical your data is, you may need to go through all sorts of gyrations to deal with problems - there are reasons why DBMS are complex, and failed writes are one of them.

One way of dealing with the problem is to pre-allocate the space for the data. As others have noted, classic Unix file systems allow for sparse files (files where there are empty blocks with no disk space allocated for them), so you actually have to write some data onto each page that you need allocated. Then you only have to worry about 'disk full' problems when you extend the space - and you know when you do that and you can deal with that failure carefully.

On Unix-based systems, there are a variety of system calls and 'open' flags that can help you synchronize your data on disk; these include 'O_DSYNC' and related values. However, if you are extending a file, they can still fail with 'out of space', even with the fancy synchronizing options. When you do run into that failure, you have to wait for space to become available (perhaps because you asked the user to tell you when it is), and then try the write again.
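
As a rough illustration (the path is an example, and the retry policy is deliberately simplistic), opening with O_DSYNC and retrying a write that fails with ENOSPC might look like this:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* O_DSYNC asks that each write() reach the device before returning. */
    int fd = open("/tmp/1", O_WRONLY | O_CREAT | O_DSYNC, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    const char data[] = "test";
    ssize_t n;
    while ((n = write(fd, data, sizeof data - 1)) == -1) {
        if (errno == ENOSPC) {
            /* Wait for the user to free some space, then try again. */
            fprintf(stderr, "disk full, press Enter to retry\n");
            getchar();
            continue;
        }
        perror("write");
        break;
    }
    /* A real program would also handle short writes (n < length). */

    close(fd);
    return 0;
}
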

Jonathan Leffler