In APUE section 8.3 (the fork function), in the discussion of file sharing between parent and child processes,
the book says: "It is important that the parent and the child share the same file offset."

And in section 8.9, Race Conditions, there is an example: both parent and child write to
a file that is opened before calling fork. The program contains a race condition,
because the output depends on the order in which the processes are run by the kernel and for how long each process runs.

But in my test code, the two outputs overlap.

[Langzi@Freedom apue]$ cat race.out
this is a long long outputhis is a long long output from parent

It seems the parent and child have separate file offsets instead of sharing the same offset.

Is there an error in my code, or did I misunderstand what sharing the offset means?
Any advice and help will be appreciated.

Here is my code:

#include "apue.h"
#include <fcntl.h>

void charatatime(int fd, const char *str);

int main(void)
{
    pid_t pid;
    int fd;

    if ((fd = open("race.out", O_WRONLY | O_CREAT | O_TRUNC,
                   S_IRUSR | S_IWUSR)) < 0)
        err_sys("open error");

    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid == 0)
        charatatime(fd, "this is a long long output from child\n");
    else
        charatatime(fd, "this is a long long output from parent\n");

    exit(0);
}

void charatatime(int fd, const char *str)
{
    // try to make the two processes switch as often as possible
    // to demonstrate the race condition.
    // set the synchronous-write flag on fd (set_fl is the APUE helper)
    set_fl(fd, O_SYNC);
    while (*str) {
        if (write(fd, str++, 1) != 1)
            err_sys("write error");
        // make sure the data is written to disk
        fdatasync(fd);
    }
}
A: 

If I recall correctly from my OS class, forking does give the child its own offset (though it starts at the same position as the parent's); it just keeps the same open file table. Most of what I'm reading seems to state otherwise, though.

Nali4Freedom
The manual page for open says the file offset and the file status flags are stored in the 'open file description'. Is this open file description the same as the open file table you mentioned above?
OnTheEasiestWay
I believe it is. In UNIX, each process keeps its own file descriptor table, and the kernel has its own table of open files (and things like a count of how many links there are to that file). So I'm guessing it is up to the implementation whether the OS stores the offset in the process's open file table or in the kernel. Perhaps the original UNIX standard had it like APUE says, while more modern implementations give each process its own offset.
Nali4Freedom
No, in modern implementations the offset stays in the kernel file table.
Omnifarious
Modern implementations, the original implementation, and every correct implementation. A lot of unix *depends* on global fds. It's libc-level `FILE *`s that are local to a process, but they just proxy for file descriptors.
hobbs
+1  A: 

Well, I was wrong.

So, they are sharing an offset, but something else weird is going on. If they weren't sharing an offset you would get output that looked like this:

this is a long long output from chredt

because each would start writing at its own offset 0 and advance a character at a time. They wouldn't start conflicting about what to write to the file until they got to the last word of the sentence, which would end up interleaved.

So, they are sharing an offset.

But the weird thing is, it doesn't seem like the offset is getting atomically updated, because neither process's output appears in full. It's as if some parts of one are overwriting some parts of the other, even though they also advance the offset, so that doesn't always happen.

If the offset weren't being shared, you would end up with exactly as many bytes in the file as the longest of the two strings.

If the offsets are shared and updated atomically, you end up with exactly as many bytes in the file as both strings put together.

But you end up with a number of bytes in the file that's somewhere in between, and that implies the offsets are shared and not updated atomically, and that's just plain weird. But that apparently is what happens. How bizarre.

  1. process A reads offset into A.offset
  2. process B reads offset into B.offset
  3. process A writes byte at A.offset
  4. process A sets offset = A.offset + 1
  5. process B writes byte at B.offset
  6. process A reads offset into A.offset
  7. process B sets offset = B.offset + 1
  8. process A writes byte at A.offset
  9. process A sets offset = A.offset + 1
  10. process B reads offset into B.offset
  11. process B writes byte at B.offset
  12. process B sets offset = B.offset + 1

That's approximately what the sequence of events must be. How very strange.
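
The twelve steps above can be replayed deterministically in a single process. This is only a minimal sketch: `struct shared_file`, `struct proc` and the three helper functions are hypothetical stand-ins for the kernel's file table entry and for each process's private copy of the offset, and the bytes written ('a', 'A' for process A and 'b', 'B' for process B) are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the shared open-file entry (not a real kernel type). */
struct shared_file {
    char   data[16];
    size_t offset;   /* the single offset both processes share */
    size_t size;     /* number of bytes in the file */
};

/* A process's private snapshot of the offset ("A.offset" / "B.offset"). */
struct proc { size_t offset; };

static void read_offset(struct proc *p, const struct shared_file *f)
{
    p->offset = f->offset;
}

static void write_byte(const struct proc *p, struct shared_file *f, char c)
{
    assert(p->offset < sizeof f->data);
    f->data[p->offset] = c;
    if (p->offset + 1 > f->size)
        f->size = p->offset + 1;
}

static void set_offset(const struct proc *p, struct shared_file *f)
{
    f->offset = p->offset + 1;
}

/* Replays steps 1-12 and returns the resulting file size. */
static size_t replay_interleaving(struct shared_file *f)
{
    struct proc A = {0}, B = {0};

    memset(f, 0, sizeof *f);
    read_offset(&A, f);       /*  1: A.offset = 0 */
    read_offset(&B, f);       /*  2: B.offset = 0 */
    write_byte(&A, f, 'a');   /*  3: 'a' lands at position 0 */
    set_offset(&A, f);        /*  4: shared offset = 1 */
    write_byte(&B, f, 'b');   /*  5: 'b' overwrites 'a' at position 0 */
    read_offset(&A, f);       /*  6: A.offset = 1 */
    set_offset(&B, f);        /*  7: shared offset = 1 again (lost update) */
    write_byte(&A, f, 'A');   /*  8: 'A' lands at position 1 */
    set_offset(&A, f);        /*  9: shared offset = 2 */
    read_offset(&B, f);       /* 10: B.offset = 2 */
    write_byte(&B, f, 'B');   /* 11: 'B' lands at position 2 */
    set_offset(&B, f);        /* 12: shared offset = 3 */
    return f->size;
}
```

Four bytes are written but the file ends up with three: more than either process wrote alone (two), fewer than both together (four), which is the "somewhere in between" size described above.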

The pread and pwrite system calls exist so two processes can update a file at a particular position without racing over whose value of the global offset wins.
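
As a rough sketch of that idea, here each process writes one byte at a time with pwrite(2) into its own region of the file, so the shared offset is never consulted. The helper names `write_at` and `write_both_regions`, and the choice to give the child the region starting at 0 and the parent the region right after it, are assumptions made for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write str one byte at a time starting at position base.
 * pwrite() takes the position as an argument and does not
 * change the shared file offset. Returns 0 on success. */
static int write_at(int fd, const char *str, off_t base)
{
    assert(str != NULL);
    for (off_t i = 0; str[i] != '\0'; i++)
        if (pwrite(fd, &str[i], 1, base + i) != 1)
            return -1;
    return 0;
}

/* Child fills [0, strlen(child_msg)), parent fills the region after it. */
static int write_both_regions(const char *path, const char *child_msg,
                              const char *parent_msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)
        _exit(write_at(fd, child_msg, 0) == 0 ? 0 : 1);

    int rc = write_at(fd, parent_msg, (off_t)strlen(child_msg));
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        rc = -1;
    close(fd);
    return rc;
}
```

Because the regions do not overlap, the file contents should be the same however the kernel schedules the two processes. The cost is that each writer has to know its positions in advance.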

Omnifarious
Thanks for your answer; it seems clear to me now. The offset must be shared, since the file is larger than the longer of the two strings that the processes write. I also tried opening the file with the O_APPEND flag, and the result is as expected. The difference is that O_APPEND makes the following two steps a single atomic operation: 1. set the offset to the current file size, even if the file size has changed; 2. write to the file. Since the two processes no longer race, the result is correct.
OnTheEasiestWay
But with the O_APPEND flag, the result should be correct for any two processes that write to the same file, whether or not they are parent and child. If the two processes each open the file separately, the result is also correct. The O_APPEND flag is like the pread and pwrite system calls: all of them are atomic operations. Since the race condition exists otherwise, we have to use some form of signaling, as APUE says.
OnTheEasiestWay
O_APPEND is more like atomically doing an lseek(fd, 0, SEEK_END) and a write every time you write to the file.
Omnifarious
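
A minimal sketch of the O_APPEND experiment described in these comments, written without `apue.h`: the file is opened with O_APPEND before the fork, and both processes still write one character per write(2) call. The helper names `write_char_at_a_time` and `append_from_both` are made up for this example, and it assumes a regular file on a local filesystem.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* One write(2) call per character, as in the question. */
static int write_char_at_a_time(int fd, const char *str)
{
    assert(str != NULL);
    for (; *str != '\0'; str++)
        if (write(fd, str, 1) != 1)
            return -1;
    return 0;
}

/* Returns the final file size, or -1 on error. */
static off_t append_from_both(const char *path, const char *child_msg,
                              const char *parent_msg)
{
    /* O_APPEND: every write first moves to the current end of file. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND,
                  S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)
        _exit(write_char_at_a_time(fd, child_msg) == 0 ? 0 : 1);

    int ok = write_char_at_a_time(fd, parent_msg) == 0;
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        ok = 0;

    struct stat st;
    if (fstat(fd, &st) < 0)
        ok = 0;
    close(fd);
    return ok ? st.st_size : -1;
}
```

With the two messages from the question, no byte should be overwritten, so the file should hold all 77 bytes. The characters of the two messages can still be interleaved, because the order of the individual writes remains up to the scheduler.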
A: 

Well, I adjusted the code to compile on vanilla GCC/glibc, and here's an example output:

thhis isias a l long oulout futput frd
 parent

And I think that supports the idea that the file position is shared and it is subject to a race, which is why it's so weird. Notice that the data I showed has 47 characters. That's more than the 38 or 39 characters of either single message, and less than the 77 characters of both messages together -- the only way I can see that happening is if the processes sometimes race to update the file position -- they each write a character, they each try to increment the position, but because of the race only one increment happens and some characters get overwritten.

Supporting evidence: man 2 lseek on my system says clearly

Note that file descriptors created by dup(2) or fork(2) share the current file position pointer, so seeking on such files may be subject to race conditions.

hobbs
Thanks for your answer. It makes me understand the issue clearly.
OnTheEasiestWay
A: 

Parent and child share the same file table entry in the kernel, which includes the offset. It is impossible, then, for the parent and child to have different offsets without one or both of the processes closing and re-opening the file. So, any write by the parent uses this offset and modifies (increments) the offset. Then any write by the child uses the new offset, and modifies it. Writing a single character at a time aggravates this situation.
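
One way to see the shared offset directly, as a sketch: let only the child write, then ask for the current position in the parent, which has written nothing itself. The function name `offset_after_child_write` is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the offset the parent observes after the child has written
 * child_msg through the inherited descriptor, or -1 on error. */
static off_t offset_after_child_write(const char *path, const char *child_msg)
{
    assert(child_msg != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Only the child writes. */
        size_t len = strlen(child_msg);
        _exit(write(fd, child_msg, len) == (ssize_t)len ? 0 : 1);
    }

    int status;
    off_t pos = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status) &&
        WEXITSTATUS(status) == 0)
        pos = lseek(fd, 0, SEEK_CUR);   /* query the position, do not move it */
    close(fd);
    return pos;
}
```

If the offset is shared as described, the parent sees a position equal to the number of bytes the child wrote, not 0.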

From my write(2) man page: "The adjustment of the file offset and the write operation are performed as an atomic step."

So, from that, you can be guaranteed that no write from one (parent or child) will write over top of the other's. You can also note that if you were to write(2) your whole sentence at once (in one call to write(2)), it is guaranteed that the sentence will be written together, in one piece.

In practice, many systems write log files this way. Many related processes (children of the same parent) will have a file descriptor that was opened by the parent. As long as each of them writes a whole line at a time (with one call to write(2)), the log file will read as you would want it to. Writing a character at a time will not have the same guarantees. Use of output buffering (with, say, stdio) will similarly remove the guarantees.
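
A minimal sketch of that logging pattern, assuming the kernel really does perform the write and the offset update as one atomic step, as the quoted man page says (the output shown elsewhere in this thread suggests not every system did). The names `log_line` and `log_from_both` are made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write one whole line with a single write(2) call. Returns 0 on success. */
static int log_line(int fd, const char *line)
{
    assert(line != NULL);
    size_t len = strlen(line);
    return write(fd, line, len) == (ssize_t)len ? 0 : -1;
}

/* Parent opens the log, forks, and both processes log one line each. */
static int log_from_both(const char *path, const char *child_line,
                         const char *parent_line)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)
        _exit(log_line(fd, child_line) == 0 ? 0 : 1);

    int rc = log_line(fd, parent_line);
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        rc = -1;
    close(fd);
    return rc;
}
```

Under that assumption each line stays in one piece; only the order of the two lines depends on scheduling.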

Rob F
I think the following is true only for a file opened with the O_APPEND flag: "From my write(2) man page: 'The adjustment of the file offset and the write operation are performed as an atomic step.'" As the output shows, there is no guarantee that the write is an atomic operation.
OnTheEasiestWay
What I quoted from write(2) is true whether O_APPEND is true or not, as long as the file is opened for writing. The write itself, and the adjustment of the file offset, are done atomically, so if the write was occurring at the end of the file, the offset is adjusted to the new end of file, and all in one operation. O_APPEND is only necessary if the file is opened independently by more than one process, and in this case causes 2 adjustments of the file offset on each write -- one before the write itself, and one after, and all atomically.
Rob F
The original question described an open(2) before a fork(2), in which case the file offset is shared by parent and child (as well as any further children of either). Once any particular process in the family closes the file, it will no longer pass it on to its children, though that will not affect its status in any of the other processes.
Rob F
Also in the original question, characters were being written one at a time -- that is, for each character written there was a separate call to write(2). One cannot say anything about the atomicity of the write(2) call when only a single character is being written. Had the entire buffer/string been written in a single call to write(2) in each process, then the atomicity of the write(2) call would have been evident.
Rob F
From what you said: a) the file is opened for writing, b) the file offset is shared by parent and child, c) the write itself and the adjustment of the file offset are done atomically. Then why does the final result overlap? Maybe some words are missing from your latest comment about each character having its own write call, so I can't follow your meaning. Is writing a single character with one write call an atomic operation? Or is only writing an entire buffer/string with one write call atomic?
OnTheEasiestWay
A: 

Use pwrite, since write can end up in a race condition when the same file is shared by multiple processes. write does not leave the file position at 0 after it completes; for example, you may end up in the middle of the file, so the shared file offset points to that location. If the other process then wants to do something, it will not work as intended, because the file descriptor is shared across fork.

Try it and give me feedback.

Manish