Dear all,

we are receiving about 10,000 messages per hour. We store them as individual files in hourly directories on an ext3 filesystem. The file name includes a sequence number. We use rsync to mirror these files to another location every 20 seconds (via a SAN, but that doesn't matter).

Sometimes an rsync run picks up files n-3, n-2, n-1, n+1, and then the next rsync run continues with n, n+2, n+3, n+4, and so on.

Is it possible that, when one process creates files in a certain sequence within a directory, another process using readdir() sees the files appear in a different sequence?

Kind regards, Sebastian

+1  A: 

I suppose your question can be restated as:

If process A creates file d/x and then creates file d/y, is it possible for process B to perform a concurrent readdir() on directory d and see an entry d/y, but not see an entry d/x?

The answer is yes. The ordering guarantees for readdir() are very weak indeed.

If you want to enforce an ordering, you will need to explicitly fsync() a file descriptor for the directory d itself after creating each file.
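As a rough illustration of that sequence, here is a minimal sketch in Python, assuming a POSIX system where a directory can be opened read-only and passed to fsync(). The helper name `create_and_sync` is made up for this example, and the code only shows the order of calls; whether it changes what a concurrent readdir() observes depends on the filesystem.

```python
import os


def create_and_sync(directory, name, data):
    """Create directory/name, then fsync the file and the directory.

    Hypothetical helper for illustration; not part of any library.
    """
    path = os.path.join(directory, name)

    # Write the file contents and flush them to stable storage.
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())

    # Open the directory itself and fsync it, so the new directory
    # entry is committed before the next file is created.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)

    return path
```

The writer would call this once per message, in sequence order, for example `create_and_sync("d", "x", b"...")` followed by `create_and_sync("d", "y", b"...")`.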

caf
@caf: Thanks for the answer. That's weak indeed, and I'm still inclined to call it a bug. Do you have any pointers to Linux documentation or source code? Btw, we have now disabled the ext3 dir_index option, and whereas in the last month the issue occurred a few times nearly every day, it has now not happened for three days in a row.
Wangnick
@caf: I'm still wondering whether and how fsync could prevent this from happening.
Wangnick
@Wangnick: The relevant documentation is the POSIX spec for `readdir()`, which simply says: *If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified.* Note that it gives no ordering guarantees. If process A creates `d/x`, then calls `fsync(d)`, then creates `d/y`, then calls `fsync(d)`, and so on, then you should get the externally visible ordering you desire.
caf