I know this is a platform-specific question; however, I would like to do some run-time analysis of an application to detect cache misses and hits. I know of cachegrind, a tool for valgrind, and of vtune, and that a slew of other profiling utilities exist. However, I am interested in implementing my own version of cache-miss detection. I know cachegrind acts as a cache simulator. Without hacking apart the kernel, how can I detect a cache miss programmatically? I have a feeling this is nearly impossible for a user-land application, but I had to ask anyway.
How can I detect a cache miss programmatically [without cache simulation]?
Caches are managed by hardware, not the kernel. Their parameters (number of cache levels, size, eviction policy, write-back/write-through, etc.) are all specific to the processor implementation. As a programmer you're "not supposed to know they exist". Thus, measuring cache-miss performance without cache simulation is impossible.
On the other hand, VM pages (a much coarser "cache", in the sense that a cache holds chunks of memory) are managed by the OS. I imagine there would be ways to gather statistics about page faults by hacking at the kernel or even by creating a nifty user application. Page-fault statistics may not be of much use to you (especially since they're affected by other running processes), but an application using large amounts of RAM might (a teeny, tiny bit) have page-fault or page-access patterns similar to its CPU cache access patterns. However, I'm not so sure about the details.
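As a rough illustration of the "nifty user application" idea, here is a minimal sketch that reads the process's own page-fault counters before and after a piece of work. It assumes a Unix-like system where Python's standard `resource` module is available; the helper names (`page_fault_counts`, `faults_during`, `touch_pages`) and the workload are made up for this example. Note that it counts page faults only and says nothing directly about CPU cache misses.

```python
import resource


def page_fault_counts():
    """Return (minor, major) page-fault counts for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_minflt: faults serviced without disk I/O
    # ru_majflt: faults that required disk I/O
    return usage.ru_minflt, usage.ru_majflt


def faults_during(func, *args):
    """Run func(*args) and return (result, minor_delta, major_delta)."""
    minor_before, major_before = page_fault_counts()
    result = func(*args)
    minor_after, major_after = page_fault_counts()
    return result, minor_after - minor_before, major_after - major_before


def touch_pages(size_bytes):
    """Hypothetical workload: allocate a buffer and write one byte per page."""
    page_size = resource.getpagesize()
    buf = bytearray(size_bytes)
    for offset in range(0, size_bytes, page_size):
        buf[offset] = 1
    return len(buf)


if __name__ == "__main__":
    _, minor, major = faults_during(touch_pages, 64 * 1024 * 1024)
    print(f"minor faults: {minor}, major faults: {major}")
```

The counts depend on the OS, the allocator, and whatever else is running, so they are best compared between runs of the same workload on the same machine rather than read as absolute figures.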