I have a very simple piece of code that writes a small amount of data to a file at a regular interval. Once my program has created the file and appended some data, if I open the file in vim (or any other editor, for that matter) and edit it, my process can no longer update the file. I do not see any errors returned from the write syscall. I tried tracing the system calls and did not observe anything odd, even while the file was not being updated.

Since each process gets its own file table entry with its own current offset, I was expecting an output file with data from the two non-cooperating processes interleaved (and possibly garbled). What I observe instead is that my program cannot update the file at all once any other editor writes to it.

A couple of other interesting observations:

1) When I `cat` something to the output file, my program continues to update it with no problem.

2) When multiple instances of my own program write to the same file, everything is fine again.

I understand that there's mandatory locking to prevent multiple writers, but I am trying to understand what is happening underneath. Also, this kind of scenario works fine for some loggers (the system log, Apache logs, etc.).

Any ideas to explain this behavior? Also, any hints on how I can debug this further?

My code is pretty simple:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char** argv)
    {
        const char* buf;
        if (argc < 2)
            buf = "test->";
        else
            buf = argv[1];

        int fd;
        if ((fd = open("test.log", O_CREAT|O_WRONLY|O_APPEND, 0644)) == -1) {
            perror("Cannot open test.log");
            exit(1);
        }

        int num_bytes = strlen(buf), num_bytes_written = -1;

        while (1) {
            if ((num_bytes_written = write(fd, buf, num_bytes)) == -1) {
                perror("Could not write to fd");
            }
            fsync(fd);
            sleep(5);
        }
    }
A: 

Your vim editor works on a cached version of your file. It modifies this cache while your other programs append to the original file. When you save with vim, you overwrite the original file with the updated cached copy and lose all the appended log data.

mouviciel
vim opens a .swp file and works on it, overwrites the original file, etc. -- agreed. But what I am trying to understand is why any writes performed by my process do not end up in the file even after vim (or any other editor) is done writing and has exited. (Remember, this process is in an infinite loop, writing some data at regular intervals.)
nooblrnr
+1  A: 

When the vim(1) editor exits, it is likely replacing the original file with the edited version. Your process is holding the original file open, but that file no longer exists in the sense that its directory entry has been replaced, so no process that doesn't already have the file open can access it. Your process is now appending to a file that cannot be accessed by any other process. Once your process closes the file, it will be gone for good (unless you run a partition recovery program).

Steve Emmerson
`set backupcopy=yes` will make Vim overwrite the existing file instead of writing a new file and renaming it to the original name.
Chris Johnsen
@steve: Doesn't unlink only decrement the link count, with the actual file removed only when the link count reaches zero? What is not clear is the result of writes on file descriptors that are still open even though the file has been unlinked. Like you said, maybe all the data is dropped on the floor the moment a file is marked for unlinking? I guess the write syscall doesn't check the link count (probably for performance) and so cannot report an error. Can anybody confirm this reasoning?
nooblrnr
@chris: good to know this. But my question is not really about vim; it could have been any editor. I saw the same behavior with TextMate on a Mac.
nooblrnr
@nooblrnr: TextMate probably has file handling similar to that of Vim: "unlink, then create", "create temp, then rename", or "rename, create", etc. Any of those variations would leave your logging process writing into the old file (possibly now unlinked) while the data that the editor wrote would only be in the new file (which just happens to have the same pathname as the original).
Chris Johnsen
@nooblrnr: See the "If one or more processes have the file open" wording in the POSIX descriptions of [unlink(2)](http://www.opengroup.org/onlinepubs/000095399/functions/unlink.html) and [rename(2)](http://www.opengroup.org/onlinepubs/000095399/functions/rename.html). The inode link count and the open count are not the same. The inode link count lives on disk. The open count lives in the kernel's memory (more or less). They both have to be zero before the data is actually removed. If the link count is 0 and the open count > 0, then the file is anonymous and "private" (or "shared" if the open count > 1).
Chris Johnsen
@nooblrnr Yes. A file is deleted when its hard-link count becomes zero. In the given scenario, however, there's no reason to suspect the hard-link count is anything other than zero.
Steve Emmerson
@chris: Thanks for the clear answer. So when the file becomes anonymous/private, its contents are deleted when the process holding the fd closes the file, right? I guess this is what Steve meant in his comment when he said "Once your process closes the file, it will be gone for good (unless you run a partition recovery program)".
nooblrnr
@Steve: Yes, you are right. I was incorrectly assuming that somehow having the open count > 0 would change the working of unlinking. Since mandatory locking has no effect on unlink, the only way for my process to cope is to do what the editors do, i.e., write to some temporary file and "merge" differences if there were any ("merge" can be trivial or extremely complicated depending on the data). Thanks for the response.
nooblrnr