It is possible for an operating system to determine whether a page of memory is in DRAM or in swap; for example, it can simply try to access the page, and if a page fault occurs, the page wasn't in DRAM.

However, is the same thing possible with CPU cache?

Is there any efficient way to tell whether a given memory location has been loaded into a cache line, or to know when that happens?

+3  A: 

In general, I don't think this is possible. It works for DRAM and the pagefile because those are OS-managed resources; the cache is managed by the CPU itself.

The OS could run a tight timing loop around a memory read and try to see whether it completes fast enough to have been in the cache or whether it had to go out to main memory, but this would be very error prone.

On multi-core/multi-processor systems, cache coherency protocols are used between processors to determine when they need to invalidate each other's caches. I suppose you could have a custom device that snoops this protocol and that the OS could query.

What are you trying to do? If you want to force something into the cache, current x86 processors support prefetching memory into the cache in a non-blocking way; for instance, with Visual C++ you could use _mm_prefetch to fetch a line into the cache.
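As a rough illustration of the prefetch idea, here is a minimal C sketch. The function name sum_with_prefetch and the prefetch distance of 16 elements are my own assumptions for the example, not anything tuned or measured; _mm_prefetch and _MM_HINT_T0 come from xmmintrin.h on x86 compilers. A prefetch is only a hint, so the CPU is free to ignore it.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* _mm_prefetch, _MM_HINT_T0 (x86 SSE) */

/* Sum an array while hinting the CPU to start fetching data that will be
 * needed a little later. The distance below is an illustrative guess
 * (16 ints = 64 bytes, a common cache line size), not a tuned value. */
long sum_with_prefetch(const int *data, size_t n)
{
    const size_t ahead = 16;  /* hypothetical prefetch distance, in elements */
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + ahead < n) {
            /* Non-blocking hint: request the line holding data[i + ahead]. */
            _mm_prefetch((const char *)&data[i + ahead], _MM_HINT_T0);
        }
        total += data[i];
    }
    return total;
}
```

The result is the same as a plain summing loop; only the timing may differ, and whether it helps at all depends on the access pattern and on the hardware prefetcher.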

EDIT: I haven't done this myself, so use at your own risk. To determine cache misses for profiling, you may be able to use some architecture-specific registers. Appendix A of http://download.intel.com/design/processor/manuals/253669.pdf gives "Performance Tuning Events". These can't be used to determine whether an individual address is in the cache or when it is loaded into the cache, but they can be used for overall stats. I believe this is what VTune (a phenomenal profiler for this level) uses.
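For overall stats from user space, one possible route is sketched below. It assumes Linux and its perf_event_open system call, which is not what you would use inside your own kernel (there you would program the counter registers directly). The names open_cache_miss_counter, count_cache_misses, struct span and touch_span are hypothetical helpers made up for this example. The call can fail if the machine or its security settings don't expose hardware counters, so the sketch reports failure instead of assuming a count.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open a hardware cache-miss counter for the calling thread.
 * Returns a file descriptor, or -1 if counters are unavailable. */
int open_cache_miss_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_CACHE_MISSES;
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-space activity only */
    attr.exclude_hv = 1;
    /* pid = 0 (this thread), cpu = -1 (any CPU), no group, no flags */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Run work(arg) and store the number of cache misses counted meanwhile.
 * Returns 0 on success, -1 if the counter could not be opened or read. */
int count_cache_misses(void (*work)(void *), void *arg, uint64_t *misses)
{
    int fd = open_cache_miss_counter();
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work(arg);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    ssize_t got = read(fd, misses, sizeof *misses);
    close(fd);
    return got == (ssize_t)sizeof *misses ? 0 : -1;
}

/* Hypothetical workload: touch one byte per 64-byte stride of a buffer. */
struct span { volatile char *base; size_t len; };

void touch_span(void *arg)
{
    struct span *s = arg;
    for (size_t i = 0; i < s->len; i += 64)
        s->base[i]++;
}
```

The number you get back is an aggregate for everything the thread did while the counter was enabled, so it says nothing about any one address.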

Michael
Thanks. I'm interested since I'm toying around with writing kernels. I'm interested in profiling cache line misses on the actual hardware. I didn't realise how very detrimental they are on modern CPUs until I saw Herb Sutter's slides: http://is.gd/oWwp
There are ways to profile this in hardware; VTune does it.
Michael
A lot of modern CPUs have performance counters which can provide all kinds of information, including cache-related statistics.
sigjuice
Now that sounds interesting. Do you have any references?
Wow, there's a wealth of them, very interesting: http://msdn.microsoft.com/en-us/library/bb385772.aspx
@mike.amy - I added a link to the Intel IA-32 manual that should detail performance events.
Michael
@mike.amy - If you're writing your own kernel, you get to grovel these events yourself. Sounds like fun :)
Michael
ah, I love it. ;)
I guess if these counters get updated rapidly enough, and are easily accessible, the code could tell by the counter incrementing during a load.
@mike.amy - As Alnitak mentions in another answer, you can't assume that a cache miss is due to the data you're reading. It could be to bring in the code that checks for the cache miss, it could be the processor reading PTEs to fill the TLB on a TLB miss, etc., etc.
Michael
Looking through that Intel manual was fascinating. Turns out there are counters for all sorts of cache misses, pretty much everything you could ask for on modern x86 architecture (look in Appendix A). Alnitak's point is good, but I'll accept this answer because it led me to the right place. Thanks.
+5  A: 

If you try to determine this yourself then the very act of running your program could invalidate the relevant cache lines, hence rendering your measurements useless.

This is one of those cases that mirrors the scientific principle that you cannot measure something without affecting that which you are measuring.

Alnitak
Well, it's very easy to verify that a location *is* in the cache. Just read from it, and voila, it'll be in the cache. ;) The trick is in testing whether something is *not* in the cache. :)
jalf
If you ensure that your code is running from uncacheable memory, the mere act of running a program wouldn't affect the cache.
Nathan Fellman
Good points. So basically I'd have to account for everything else that might happen in order to know whether or not a cache miss loaded the memory in question.
A: 

On x86, I don't know of a way to tell whether an address IS in cache, but here is how to tell whether an address WAS in cache (the timed load itself brings the line in):

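rdtsc                  ; read timestamp counter into edx:eax
mov esi, eax           ; save the low 32 bits of the start timestamp
mov ebx, [address]     ; the load being timed (brackets: read memory, not the address)
lfence                 ; wait for the load to complete before reading the counter again
rdtsc                  ; read timestamp counter again
sub eax, esi           ; calculate the timestamp difference
cmp eax, threshold     ; if difference < threshold, then it was in cache
jb was_in_cache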

The threshold has to be determined from documentation or empirically.

Some machines have cache hit/miss counters, which would serve equally well.
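The same timing idea can be sketched in C, assuming GCC or Clang on x86 with x86intrin.h (which provides __rdtsc, __rdtscp, _mm_clflush, _mm_lfence and _mm_mfence). The function names time_load, min_latency and was_cached are made up for this example. Taking the minimum over many trials is one simple way to find a threshold empirically, since single measurements are noisy.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, __rdtscp, _mm_clflush, fences (GCC/Clang, x86) */

/* Time one load from p, in timestamp-counter ticks. */
uint64_t time_load(const volatile char *p)
{
    unsigned aux;
    _mm_mfence();                   /* let earlier memory operations finish */
    _mm_lfence();
    uint64_t start = __rdtsc();
    _mm_lfence();                   /* keep the load from starting early */
    char sink = *p;                 /* the load being timed */
    uint64_t end = __rdtscp(&aux);  /* waits for the preceding load */
    _mm_lfence();
    (void)sink;
    return end - start;
}

/* Smallest latency seen over `trials` loads. If flush_first is nonzero,
 * the line is evicted with clflush before each load, so every timed load
 * has to go out to memory. */
uint64_t min_latency(const volatile char *p, int flush_first, int trials)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < trials; i++) {
        if (flush_first) {
            _mm_clflush((const void *)p);
            _mm_mfence();
        }
        uint64_t t = time_load(p);
        if (t < best)
            best = t;
    }
    return best;
}

/* Nonzero if a single timed load came in under the threshold. Note that
 * the line is in the cache afterwards either way. */
int was_cached(const volatile char *p, uint64_t threshold)
{
    return time_load(p) < threshold;
}
```

To pick a threshold, you could compare min_latency(p, 0, 1000) against min_latency(p, 1, 1000) on the target machine and choose a value between the two. Interrupts, TLB misses and frequency scaling all add noise, so treat any single was_cached result with suspicion.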