Is there a way in C++ to quickly invalidate a processor's L2 cache, other than iterating through a large dummy array?
I'm going to assume this is for performance testing and you want to eliminate cache effects between runs.
In that case, what you'd need to know to do this efficiently is:
- The allocation size of the L2 cache (that is, its cache line size)
- How many allocations (cache lines) there are in the L2 cache
Then it's basically a matter of touching memory locations allocation_size bytes apart, over a range at least as large as the cache, until you've flushed the cache entirely.
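The strided walk described above can be sketched roughly as follows. This is only an illustration: `evict_by_touching` is a made-up helper name, the cache size and line size are values the caller has to supply (1 MiB and 64 bytes are common but are assumptions here), and walking twice the cache size is a guessed safety margin, since real caches are set-associative and not strictly LRU, so nothing guarantees every old line is gone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): walks a scratch buffer
// larger than the cache, touching one byte per cache line.
// cache_bytes and line_bytes are assumptions supplied by the caller.
// Returns the number of lines touched.
inline std::size_t evict_by_touching(std::size_t cache_bytes, std::size_t line_bytes) {
    assert(line_bytes > 0 && cache_bytes >= line_bytes);

    // Assumed margin: twice the cache size, because replacement is rarely pure LRU.
    std::vector<unsigned char> scratch(2 * cache_bytes);

    // volatile keeps the compiler from optimizing the accesses away.
    volatile unsigned char* p = scratch.data();

    std::size_t touched = 0;
    for (std::size_t off = 0; off < scratch.size(); off += line_bytes) {
        p[off] = static_cast<unsigned char>(p[off] + 1);  // read and write one byte per line
        ++touched;
    }
    return touched;
}
```

You would call something like `evict_by_touching(1 << 20, 64)` between timed runs, with the numbers replaced by whatever your CPU actually reports.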
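Context switching also often evicts your data from the cache, because whatever runs in between brings in its own working set. It might be faster to sleep
for a millisecond: if the OS swaps you out and back in, your data will likely have been pushed out of the cache, though nothing guarantees it.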
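You want to use a memory fence. In VC++:

    #include <emmintrin.h>  // declares _mm_mfence (SSE2)

    void SThreadUtil::synchronizeCache()
    {
        // Full fence: orders all earlier loads and stores before any later ones.
        _mm_mfence();
    }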
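Sorry, a fence only orders memory accesses and does not flush anything. For flushing it should be _mm_clflush, which evicts the cache line containing a given address.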
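Since _mm_clflush only handles one cache line per call, flushing a whole buffer means looping over it. Here is a minimal sketch, assuming an x86 target with SSE2 and 64-byte cache lines; `flush_range` and `kAssumedLineBytes` are names invented for this example, and the real line size should be checked on your hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // _mm_clflush, _mm_mfence (SSE2, x86 only)

// Assumption: 64-byte cache lines. Query the real value on your target.
constexpr std::size_t kAssumedLineBytes = 64;

// Hypothetical helper: flushes every cache line overlapping [start, start + bytes).
// Returns the number of lines flushed.
inline std::size_t flush_range(const void* start, std::size_t bytes) {
    assert(start != nullptr);
    if (bytes == 0) return 0;

    const auto addr = reinterpret_cast<std::uintptr_t>(start);
    // Round down to the start of the first line so partial lines are covered too.
    const std::uintptr_t first = addr & ~static_cast<std::uintptr_t>(kAssumedLineBytes - 1);
    const std::uintptr_t last = addr + bytes;

    std::size_t flushed = 0;
    for (std::uintptr_t line = first; line < last; line += kAssumedLineBytes) {
        _mm_clflush(reinterpret_cast<const void*>(line));
        ++flushed;
    }
    _mm_mfence();  // wait for the flushes to complete before continuing
    return flushed;
}
```

Note that this only flushes memory you can name by address, so it clears your own buffers from the cache rather than the cache as a whole. Dirty lines are written back before being evicted, so the buffer contents are unchanged.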
Another possible answer is to make sure a completely different section of memory is used each time the thing that is sensitive to cache contents is run. Then none of those bits of memory will be in cache and, effectively, the cache will be flushed.
Of course, this might not be a very easily managed solution for the instruction cache, only for the data cache, and even then handling the stack will be tricky.
And I can't think of a way for this approach to work reliably for the TLB either, if you care about that.
Another problem with this approach is that dirty cache lines from an earlier run may still be in the cache, and a later run of the code will incur the cost of writing them back to main memory.
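For the data side, the "different memory each run" idea could look something like the sketch below. `FreshRegions` is a hypothetical class made up for illustration. It carves one large, deliberately uninitialized allocation into slices and never hands out the same slice twice. It does nothing for the stack, the instruction cache, or the TLB, and the first touch of each slice will also pay for page faults, which you may need to account for in your timings.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: gives each run a slice of memory that no earlier run used.
class FreshRegions {
public:
    FreshRegions(std::size_t region_bytes, std::size_t runs)
        : region_bytes_(region_bytes),
          total_bytes_(region_bytes * runs),
          // new[] without parentheses leaves the bytes uninitialized, so
          // constructing the arena does not pull it into the cache.
          arena_(new unsigned char[region_bytes * runs]) {
        assert(region_bytes > 0 && runs > 0);
    }

    // Returns the next unused slice, or nullptr once all slices are handed out.
    unsigned char* next() {
        if (next_offset_ + region_bytes_ > total_bytes_) return nullptr;
        unsigned char* p = arena_.get() + next_offset_;
        next_offset_ += region_bytes_;
        return p;
    }

private:
    std::size_t region_bytes_;
    std::size_t total_bytes_;
    std::unique_ptr<unsigned char[]> arena_;
    std::size_t next_offset_ = 0;
};
```

Each benchmark iteration would call `next()` and use the returned pointer as its working buffer, stopping when it gets `nullptr`.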