I'm using JPoller to detect changes to files in a specific directory, but it's missing files because they end up with a timestamp earlier than their actual creation time. Here's how I test:

import java.io.File;
import java.io.IOException;

// The class name is arbitrary; it only makes the test compile on its own.
public class TimestampTest
{
    public static void main(String[] files)
    {
        for (String file : files)
        {
            File f = new File(file);
            if (f.exists())
            {
                System.err.println(file + " exists");
                continue;
            }

            try
            {
                // record the current time; I would expect the last-modified
                // time on the file to be no earlier than this
                System.out.println("-----------------------------------------");
                long time = System.currentTimeMillis();

                // create the file
                System.out.println("Creating " + file + " at " + time);
                f.createNewFile();

                // let's see what the timestamp actually is (I've only seen it < time)
                System.out.println(file + " was last modified at: " + f.lastModified());

                // well, ok, what if I explicitly set it to time?
                f.setLastModified(time);
                System.out.println("Updated modified time on " + file + " to " + time + " with actual " + f.lastModified());
            }
            catch (IOException e)
            {
                System.err.println("Unable to create file");
            }
        }
    }
}

And here's what I get for output:

-----------------------------------------
Creating test.7 at 1272324597956
test.7 was last modified at: 1272324597000
Updated modified time on test.7 to 1272324597956 with actual 1272324597000
-----------------------------------------
Creating test.8 at 1272324597957
test.8 was last modified at: 1272324597000
Updated modified time on test.8 to 1272324597957 with actual 1272324597000
-----------------------------------------
Creating test.9 at 1272324597957
test.9 was last modified at: 1272324597000
Updated modified time on test.9 to 1272324597957 with actual 1272324597000

The result is a race condition:

  1. JPoller records time of last check as xyz...123
  2. File created at xyz...456
  3. File last-modified timestamp actually reads xyz...000
  4. JPoller looks for new/updated files with timestamp greater than xyz...123
  5. JPoller ignores newly added file because xyz...000 is less than xyz...123
  6. I pull my hair out for a while

I tried digging into the code but both lastModified() and createNewFile() eventually resolve to native calls so I'm left with little information.

For test.9, I lose 957 milliseconds. What kind of accuracy can I expect? Are my results going to vary by operating system or file system? Suggested workarounds?

NOTE: I'm currently running Linux with an XFS filesystem. I wrote a quick program in C and the stat system call shows st_mtime as truncate(xyz...000/1000).

UPDATE: I ran the same program I have above on Windows 7 with NTFS and it does maintain full millisecond accuracy. The MSDN link @mdma provided further notes that on FAT filesystems create time has a resolution of 10 ms, while write time has a resolution of 2 seconds and access time a resolution of 1 day. Thus, this is truly OS dependent.
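Since the behavior differs by platform, a small probe can show what a given machine actually reports. The sketch below is only an assumption-laden illustration: it assumes Java 7 or later (for `java.nio.file.Files.getLastModifiedTime`), and the class and method names are made up for this example. Whether the NIO call returns finer resolution than `File.lastModified()` depends on the JDK, the OS, and the filesystem, so treat the printed values as observations, not guarantees.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.concurrent.TimeUnit;

// Hypothetical probe class; not part of JPoller.
class TimestampProbe {

    // Returns { File.lastModified() in ms,
    //           Files.getLastModifiedTime() in ms,
    //           Files.getLastModifiedTime() in ns }.
    static long[] probe(Path p) throws IOException {
        FileTime nio = Files.getLastModifiedTime(p);
        return new long[] {
            p.toFile().lastModified(),
            nio.toMillis(),
            nio.to(TimeUnit.NANOSECONDS)
        };
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("probe", ".tmp");
        try {
            long clock = System.currentTimeMillis();
            long[] r = probe(p);
            System.out.println("clock after create       : " + clock);
            System.out.println("File.lastModified (ms)   : " + r[0]);
            System.out.println("Files.getLastModifiedTime: " + r[1] + " ms, " + r[2] + " ns");
        } finally {
            Files.delete(p);
        }
    }
}
```

If the first value always ends in 000 while the others do not, the truncation is happening in the older API and not in the filesystem.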

+2  A: 

The last-modified timestamp is apparently stored with one-second resolution, not in milliseconds. That might be a file system limitation. I'd suggest comparing at second granularity rather than in milliseconds.
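A minimal sketch of that comparison, assuming you can get at the check yourself. The class and the `modifiedSince` helper are hypothetical and not part of JPoller; the poll time in `main` is invented, while the file timestamp is the truncated value from the question's output.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper; JPoller's real comparison may look different.
class CoarseTimestampCheck {

    // Compares both values at whole-second granularity.
    // Uses >= so that a file touched in the same second as the last check
    // is not skipped, at the cost of possibly reporting it twice.
    static boolean modifiedSince(long fileModifiedMillis, long lastCheckMillis) {
        long fileSeconds = TimeUnit.MILLISECONDS.toSeconds(fileModifiedMillis);
        long checkSeconds = TimeUnit.MILLISECONDS.toSeconds(lastCheckMillis);
        return fileSeconds >= checkSeconds;
    }

    public static void main(String[] args) {
        long lastCheck = 1272324597123L; // assumed poll time, for illustration only
        long fileTime = 1272324597000L;  // truncated mtime from the question's output

        System.out.println("millisecond compare: " + (fileTime > lastCheck));
        System.out.println("second compare:      " + modifiedSince(fileTime, lastCheck));
    }
}
```

The trade-off is that anything modified within the same second as the previous check can be picked up again on the next poll, so the caller may need to tolerate or filter duplicates.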

BalusC
Linux doesn't have a creation timestamp per se; `st_mtime` is the last time the inode was modified.
Gabe
@Gabe: I used the wrong wording, fixed.
BalusC
It's not a filesystem limitation; it's probably a limitation of the API. XFS has supported millisecond timestamps for years, but the Java library function probably doesn't know how to extract it.
Gabe
I misspoke; `st_mtime` is actually when the data of the file changed. `st_ctime` is when the inode changed.
Gabe
@Gabe - that's interesting. `stat` (man 1) gives fractional seconds on `st_ctime`, but not on `st_mtime`. Do you know what system call would allow me to retrieve the exact value?
Kaleb Pederson
+1  A: 

Filesystems do not store time precisely, and often not at millisecond resolution; e.g., FAT has a 2-second resolution for last-write time, and NTFS can delay updating the last access time by up to an hour (details on MSDN). Although not in your case, in general there is also the problem of synchronizing clocks if the file is created on another computer.

It seems this might be an issue for the JPoller folks, since that is where the time handling logic lives. Until it's fixed, you could work around it by manually setting the last-modified time of each file you write to 4 seconds after the actual time. The +4 is an arbitrary value; it just needs to be larger than the timestamp resolution of the filesystem you are working on. When the timestamps are stored, they will be rounded down, but by less than the amount you added. Not pretty, but it will work!
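Roughly, the workaround could look like the sketch below. The class name, the method name, and the margin constant are made up for illustration, and it assumes you control the code that writes the files.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical writer-side workaround; not part of JPoller.
class ForwardDatedWrite {

    // Arbitrary margin; assumed to exceed the filesystem's timestamp resolution.
    static final long MARGIN_MILLIS = 4000L;

    // Creates the file and pushes its last-modified time into the near future.
    // Returns the wall-clock time taken just before creation.
    static long createForwardDated(File f) throws IOException {
        long now = System.currentTimeMillis();
        if (!f.createNewFile()) {
            throw new IOException(f + " already exists");
        }
        if (!f.setLastModified(now + MARGIN_MILLIS)) {
            throw new IOException("could not set last-modified time on " + f);
        }
        return now;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("forward-dated", ".tmp");
        f.delete(); // createForwardDated expects the file not to exist yet
        long writtenAt = createForwardDated(f);
        System.out.println("written at " + writtenAt + ", reported mtime " + f.lastModified());
        f.delete();
    }
}
```

Keep in mind that the files will look as if they were modified slightly in the future, so a poller that compares against its last check time may report the same file on more than one pass.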

mdma
Nice link. It looks like NTFS will give me better resolution (down to 100 nanoseconds). Sadly, that implies this is definitely OS dependent and that implementing a cross-platform fix won't be terribly simple.
Kaleb Pederson