Operating systems read more from disk than a program actually requests, because the program is likely to need nearby information in the future. In my application, when I fetch an item from disk, I would like to show an interval of information around the element. There's a trade-off between how much information I request and show, and speed. However, since the OS already reads more than I requested, accessing those bytes that are already in memory is essentially free. What API can I use to find out what's in the OS caches?

Alternatively, I could use memory mapped files. In that case, the problem reduces to finding out whether a page is swapped to disk or not. Can this be done in any common OS?

EDIT: Related paper http://www.azulsystems.com/events/mspc_2008/2008_MSPC.pdf

+1  A: 

It certainly can't be done on Windows. On Windows the read-ahead behaviour is up to the OS. Even if it could tell you how much it had read ahead, that wouldn't do you any good: as soon as you'd found out, the in-memory pages used for caching could have been reclaimed for some other use.

The same thing goes for determining whether a page is resident or not. As soon as you've found out, the answer might change because some other thread needs the memory for something else.

If you really wanted to do this kind of thing on Windows, you could turn off buffering and manage the buffers yourself. This is the fastest IO path, but it is also the most complex - you have to be very careful, and often the OS can still do it better.

Stewart
Your point about the race condition between determining if a page is resident and actually reading it is good. However, you are at the mercy of page evictions even when not doing anything fishy like I want to do. A plain fread call can return and the memory page disappear before I use it.
Bruno Martinez
True - although that is less likely. Unless you're under serious memory pressure, the OS is unlikely to page out recently used private pages, because it has to write them to disk. Cache pages, or memory mapped file pages, are already on the disk, so they are much less costly for the OS to lose. Either way it isn't the win it appears to be - you should just read as much as you need and hope it is as free as possible.
Stewart
+2  A: 

You're starting out from a wrong presumption. At least on Linux, the OS will try to figure out the program's access patterns. If you read a file sequentially, the kernel will prefetch sequentially. If you jump around the file a lot, the kernel will probably be confused at first, but then it will stop prefetching.

So if you actually are accessing your file sequentially, you know what's probably prefetched: the next data block. If you are randomly seeking, probably nothing else in the vicinity is prefetched.

Try to approach this a different way. Before calling read() to get the information you need, call posix_fadvise() (with POSIX_FADV_WILLNEED) to let the OS know what you want it to start loading.
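A minimal sketch of that idea, assuming a POSIX system that provides posix_fadvise() and pread(). The helper name prefetch_then_read is made up for illustration, and the hint is purely advisory, so the kernel is free to ignore it:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: tell the kernel we will need [offset, offset + len),
 * then read that range. Returns the number of bytes read, or -1 on error. */
ssize_t prefetch_then_read(int fd, off_t offset, void *buf, size_t len)
{
    assert(buf != NULL);

    /* posix_fadvise() returns an error number directly instead of setting
     * errno. The hint is only advice, so a failure here is not fatal. */
    (void)posix_fadvise(fd, offset, (off_t)len, POSIX_FADV_WILLNEED);

    /* In a real program, other work would go here so the kernel has time
     * to start loading the range before we ask for it. */

    return pread(fd, buf, len, offset);
}
```

To show an interval around an element, you would pass a range that covers the neighbours you want to display rather than just the element itself.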

I'm also curious to know what kind of application you're using that can run correctly by only operating on data that happens to be in the file cache by chance. I feel like we could find a good way to address your need if you posted a little more info.

Karmastan
I'm sorry to disappoint, but I have no concrete use for this. It occurred to me while reading www.cs.sunysb.edu/~bender/pub/BenderHu-TODS07.pdf and other papers on cache obliviousness.
Bruno Martinez
+1  A: 

What API can I use to find out what's in the OS caches?

There's certainly no standard way to do this on any POSIX system, and I'm not aware of any non-standard way specific to Linux. The only thing you can know (almost) for sure is that the file system will have read in a multiple of the page size, usually 4 kB. So, if your reads are small, you can know with high probability (although not for sure) that the rest of the surrounding page is in memory.

You could, I suppose, do tricksy things like timing how long it took a read system call to complete. If it's fast, say hundreds of microseconds or less, it was probably a cache hit. Once it gets up to a millisecond or so, it was probably a cache miss. Of course, this doesn't actually help you very much, and it's very fragile.
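For what it's worth, the timing trick might look roughly like this. It assumes clock_gettime() with CLOCK_MONOTONIC is available, both function names are invented, and the threshold is just the rough figure from above, not a measured constant:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: time a single pread() call.
 * Returns the elapsed time in microseconds, or -1.0 on error. */
double timed_pread_us(int fd, void *buf, size_t len, off_t offset)
{
    struct timespec t0, t1;
    assert(buf != NULL);

    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    ssize_t n = pread(fd, buf, len, offset);
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0 || n < 0)
        return -1.0;

    return (t1.tv_sec - t0.tv_sec) * 1e6 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
}

/* Crude guess: anything faster than threshold_us is treated as a cache hit.
 * A threshold around 1000 us matches the rough numbers in the text. */
int probably_cache_hit(double elapsed_us, double threshold_us)
{
    return elapsed_us >= 0.0 && elapsed_us < threshold_us;
}
```

Note that by the time you have the measurement, the read has already happened, which is part of why this doesn't buy you much.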

Please note that once the file system has copied the data to user buffers, it is free to immediately discard the buffers holding the data from disk. It probably doesn't do this right away, but you can't tell for sure.

Finally, I second @Karmastan's suggestion: explain the broader end you're trying to achieve. There's likely a way to do it, but the one you've suggested isn't it.

Dale Hagglund
+4  A: 

You can indeed use your second method, at least on Linux. mmap() the file, then use the mincore() function to determine which pages are resident. From the man page:

int mincore(void *addr, size_t length, unsigned char *vec);

mincore() returns a vector that indicates whether pages of the calling process's virtual memory are resident in core (RAM), and so will not cause a disk access (page fault) if referenced. The kernel returns residency information about the pages starting at the address addr, and continuing for length bytes.

There's of course a race condition here - mincore() can tell you that a page is resident, but it might then be swapped out just before you access it. C'est la vie.
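A rough sketch of the whole sequence on Linux, assuming glibc (where mincore() is exposed under _DEFAULT_SOURCE). The function name count_resident_pages is made up for this example, and the result is only a snapshot because of the race mentioned above:

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* needed for the mincore() declaration on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file and count how many of its pages are
 * resident right now. Stores the total page count in *total.
 * Returns the resident count, or -1 on error. */
long count_resident_pages(const char *path, long *total)
{
    assert(path != NULL && total != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;

    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (map == MAP_FAILED)
        return -1;

    /* mincore() fills one byte per page; the low bit means "resident". */
    unsigned char *vec = malloc(npages);
    long resident = -1;
    if (vec != NULL && mincore(map, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;
    }

    free(vec);
    munmap(map, len);
    *total = (long)npages;
    return resident;
}
```

For the use case in the question, you would inspect the individual entries of vec around the page holding your item instead of just counting them.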

caf