I'm trying to implement a simple method to read new lines from a log file each time the method is called.

I've looked at the various suggestions both on Stack Overflow (e.g. here) and elsewhere for simulating "tail" functionality; most involve using readline() to read in new lines as they're appended to the file. It should be simple enough, but I can't get it to work properly on OS X 10.6.4 with the included Python 2.6.1.

To get to the heart of the problem, I tried the following:

  1. Open two terminal windows.

  2. In one, create a text file "test.log" with three lines:

    one
    two
    three
    
  3. In the other, start python and execute the following code:

    Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) 
    [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.stat('test.log')
    posix.stat_result(st_mode=33188, st_ino=23465217, st_dev=234881025L, st_nlink=1, st_uid=666, st_gid=20, st_size=14, st_atime=1281782739, st_mtime=1281782738, st_ctime=1281782738)
    >>> log = open('test.log')
    >>> log.tell()
    0
    >>> log.seek(0,2)
    >>> log.tell()
    14
    >>> 
    

    So we see with the tell() that seek(0,2) brought us to the end of the file as reported by os.stat(), byte 14.

  4. In the first shell, add another two lines to "test.log" so that it looks like this:

    one
    two
    three
    four
    five
    
  5. Go back to the second shell, and execute the following code:

    >>> os.stat('test.log')
    posix.stat_result(st_mode=33188, st_ino=23465260, st_dev=234881025L, st_nlink=1, st_uid=666, st_gid=20, st_size=24, st_atime=1281783089, st_mtime=1281783088, st_ctime=1281783088)
    >>> log.seek(0,2)
    >>> log.tell()
    14
    >>> 
    

Here we see from os.stat() that the file's size is now 24 bytes, but seeking to the end of the file somehow still lands on byte 14. I tried the same steps on Ubuntu with Python 2.5 and they work as I expect. I also tried Python 2.5 on my Mac, but got the same result as with 2.6.

I must be missing something fundamental here. Any ideas?

+3  A: 

How are you adding two more lines to the file?

Most text editors will go through operations a lot like this:

    fd = open(filename, read)
    file_data = read(fd)
    close(fd)
    /* you edit your file, and save it */
    unlink(filename)
    fd = open(filename, write, create)
    write(fd, file_data)

The file is different. (Check it with ls -li; the inode number will change for almost every text editor.)
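To see the effect from Python itself, here is a minimal sketch that imitates the unlink-and-recreate save described above. It assumes a POSIX system; the helper name `save_like_an_editor` and the temporary directory are made up for illustration, and real editors differ in the details.

```python
import os
import tempfile


def save_like_an_editor(path, new_data):
    """Hypothetical helper: replace `path` by unlinking it and creating a new file."""
    os.unlink(path)
    with open(path, "wb") as f:
        f.write(new_data)


# Set up a three-line log file in a scratch directory.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "test.log")
with open(path, "wb") as f:
    f.write(b"one\ntwo\nthree\n")

# Open the log and seek to its end, as in the question.
log = open(path, "rb")
log.seek(0, 2)
size_before = log.tell()
inode_before = os.stat(path).st_ino

# "Edit" the file the way many text editors save it.
save_like_an_editor(path, b"one\ntwo\nthree\nfour\nfive\n")

# The open handle still refers to the old, now nameless, file.
log.seek(0, 2)
size_after = log.tell()
inode_after = os.stat(path).st_ino
size_on_disk = os.stat(path).st_size

print("handle sees %d bytes, then %d bytes" % (size_before, size_after))
print("name now points at %d bytes" % size_on_disk)
print("inode changed: %s" % (inode_before != inode_after))
log.close()
```

The open handle keeps reporting the old size because it is still attached to the old inode, while the name test.log now belongs to a new one.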

If you append to the log file using your shell's >> redirection, it'll work exactly as it should:

    $ echo one >> test.log
    $ echo two >> test.log
    $ echo three >> test.log
    $ ls -li test.log
    671147 -rw-r--r-- 1 sarnold sarnold 14 2010-08-14 04:15 test.log
    $ echo four >> test.log
    $ ls -li test.log
    671147 -rw-r--r-- 1 sarnold sarnold 19 2010-08-14 04:15 test.log

    >>> log=open('test.log')
    >>> log.tell()
    0
    >>> log.seek(0,2)
    >>> log.tell()
    19

    $ echo five >> test.log
    $ echo six >> test.log

    >>> log.seek(0,2)
    >>> log.tell()
    28
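For the original goal (return the new lines each time a method is called), something along these lines should work as long as the file is only ever appended to in place. This is just a sketch: the class name `AppendedLineReader` is invented here, it assumes UTF-8 text with "\n" line endings, and it does not cope with the file being replaced.

```python
import os
import tempfile


class AppendedLineReader(object):
    """Hypothetical helper: returns the lines appended since the previous call."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._file.seek(0, 2)            # start at the current end of the file
        self._pos = self._file.tell()

    def read_new_lines(self):
        # Seek back to the last complete line we consumed before reading.
        self._file.seek(self._pos)
        lines = []
        for raw in iter(self._file.readline, b""):
            if not raw.endswith(b"\n"):
                break                    # incomplete line; pick it up next time
            lines.append(raw.decode("utf-8").rstrip("\n"))
            self._pos += len(raw)
        return lines

    def close(self):
        self._file.close()


# Usage: append with the equivalent of the shell's >> and read what is new.
path = os.path.join(tempfile.mkdtemp(), "test.log")
with open(path, "wb") as f:
    f.write(b"one\ntwo\nthree\n")

reader = AppendedLineReader(path)
first = reader.read_new_lines()          # nothing has been appended yet

with open(path, "ab") as f:
    f.write(b"four\nfive\n")
second = reader.read_new_lines()

print(first)
print(second)
reader.close()
```

Tracking the byte offset explicitly, rather than relying on where the last read stopped, means a half-written final line is simply re-read on the next call.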

Note that the tail(1) command has a -F command-line option to handle the case where the original file is replaced by a new file with the same name. (Great for watching log files that are rotated periodically.)

sarnold
I totally missed that - thanks a lot for the clarification!
Will Harris
+2  A: 

Short answer: no, seek() isn't broken; your assumptions are.

Your text editor is creating a new file with the same name, not modifying the old file in place. You can see in your two stat results that st_ino is different (23465217 before the edit, 23465260 after). If you were to do os.fstat(log.fileno()), you'd get the old size and the old st_ino.

If you want to check for this in your implementation of tail, periodically compare the st_ino of the stat and fstat results. If they differ, there's a new file with the same name.
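A rough sketch of that check, assuming a POSIX system. The function name `file_was_replaced` is made up, and comparing st_dev alongside st_ino is my own addition (inode numbers are only unique within one device), not something the check strictly requires.

```python
import os
import tempfile


def file_was_replaced(open_file, path):
    """Hypothetical helper: True if `path` now names a different file than `open_file`."""
    try:
        on_disk = os.stat(path)
    except OSError:
        return False  # the name is missing right now; check again later
    held = os.fstat(open_file.fileno())
    # st_dev is compared too as an extra precaution (an assumption of this sketch).
    return (on_disk.st_ino, on_disk.st_dev) != (held.st_ino, held.st_dev)


# Usage: an in-place append is not a replacement, an unlink-and-recreate is.
path = os.path.join(tempfile.mkdtemp(), "test.log")
with open(path, "wb") as f:
    f.write(b"one\ntwo\nthree\n")

log = open(path, "rb")
replaced_initially = file_was_replaced(log, path)

with open(path, "ab") as f:
    f.write(b"four\n")
replaced_after_append = file_was_replaced(log, path)

os.unlink(path)
with open(path, "wb") as f:
    f.write(b"one\ntwo\nthree\nfour\nfive\n")
replaced_after_rewrite = file_was_replaced(log, path)

if replaced_after_rewrite:
    # Switch over to the new file that now owns the name.
    log.close()
    log = open(path, "rb")
replaced_after_reopen = file_was_replaced(log, path)
log.close()

print(replaced_initially, replaced_after_append,
      replaced_after_rewrite, replaced_after_reopen)
```

When the check fires, a tail implementation would typically finish reading whatever is left in the old handle, then reopen the path and start from the beginning of the new file.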

Aaron Gallagher
Yeah, I missed that the file was being recreated. Thanks!
Will Harris