Hi!

Is there a way in C++ to determine the CPU's cache size? I have an algorithm that processes a lot of data, and I'd like to break this data down into chunks that fit into the cache. Is this possible? Can you give me any other hints on programming with cache size in mind (especially with regard to multithreaded/multicore data processing)?

Thanks!

+2  A: 

Read the CPUID of the CPU (x86) and then determine the cache size from a look-up table. The table has to be filled with the cache sizes that the CPU manufacturer publishes in its programming manuals.

Tobias Langner
Hey, that sounds interesting! Are there any such precomposed tables available online?
Mat
http://www.x86-guide.com/en/index.html might have such tables. However, the problem with this is: what do you do with an unidentified CPU, and do you want to have to update the program every time a new CPU comes out?
Robert S. Barnes
Doesn't this solution break down though if your program is used on a CPU released after your program is released?
Billy ONeal
+11  A: 

This is a copy of my answer to another question, but here goes:

Here's a link to a really good paper on caches/memory optimization by Christer Ericson (of God of War I/II/III fame).

It's a couple of years old but it's still very relevant.

Andreas Brinck
+1 Paper looks good
Tom
Link to the other question?
dmckee
+6  A: 

C++ itself doesn't "care" about CPU caches, so there's no support for querying cache sizes built into the language. If you are developing for Windows, there's the GetLogicalProcessorInformation() function, which can be used to query information about the CPU caches.
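
A minimal sketch of how that query might look. The helper name windows_cache_size is made up for illustration, and the non-Windows branch simply returns 0 so the snippet still compiles elsewhere. It returns the first data or unified cache reported at the requested level, which is a simplification: on machines with several different cores the entries may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: size in bytes of the first data/unified cache reported
// at `level` (1, 2 or 3). Returns 0 if nothing is reported or if this is not
// a Windows build.
std::size_t windows_cache_size(int level) {
#ifdef _WIN32
    DWORD length = 0;
    // First call with a null buffer only asks for the required buffer size.
    GetLogicalProcessorInformation(nullptr, &length);
    if (GetLastError() != ERROR_INSUFFICIENT_BUFFER) return 0;

    std::vector<SYSTEM_LOGICAL_PROCESSOR_INFORMATION> info(
        length / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION));
    if (info.empty()) return 0;
    if (!GetLogicalProcessorInformation(info.data(), &length)) return 0;

    for (const auto& entry : info) {
        if (entry.Relationship == RelationCache &&
            entry.Cache.Level == level &&
            entry.Cache.Type != CacheInstruction) {
            return entry.Cache.Size;  // size in bytes
        }
    }
    return 0;
#else
    (void)level;  // unused outside Windows
    return 0;
#endif
}
```

Calling windows_cache_size(2) would then give an L2 size to base a chunk size on; a result of 0 means "unknown", and the caller should fall back to some conservative default.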

kusma
A: 

Raymond Chen's excellent blog touched on the subject recently.

lhenrygr
+2  A: 

Depending on what you're trying to do, you might also leave it to some library. Since you mention multicore processing, you might want to have a look at Intel Threading Building Blocks.

TBB includes cache-aware memory allocators. More specifically, check cache_aligned_allocator (it is in the reference documentation; I couldn't find a direct link).

Steph
A: 

The cache will usually do the right thing. The only real worry for a normal programmer is false sharing, and you can't take care of that at runtime because it requires compiler directives.
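
To illustrate the compile-time side of this, here is a small sketch that uses alignas to give each per-thread counter its own cache line. The 64-byte line size is an assumption (common on x86, not guaranteed), and the names PaddedCounter and PerThreadCounters are invented for the example. No threads are started here; the idea is that worker i would only ever write to slots[i].

```cpp
#include <cassert>
#include <cstddef>

// Assumption: 64-byte cache lines. Adjust for the target CPU.
constexpr std::size_t kAssumedLineSize = 64;

// alignas pads and aligns the struct so that two counters never share a line.
struct alignas(kAssumedLineSize) PaddedCounter {
    long value = 0;
};

// One slot per worker thread; thread i is meant to touch only slots[i].
struct PerThreadCounters {
    static constexpr std::size_t kMaxThreads = 8;  // hypothetical limit
    PaddedCounter slots[kMaxThreads];

    // Combine the per-thread results once all workers have finished.
    long total() const {
        long sum = 0;
        for (const PaddedCounter& c : slots) sum += c.value;
        return sum;
    }
};
```

Without the alignas, eight long counters would sit in a single 64-byte line, and every write from one core would invalidate that line for all the others.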

Charles Eli Cheese
A: 

According to "What every programmer should know about memory", by Ulrich Drepper you can do the following on Linux:

Once we have a formula for the memory requirement we can compare it with the cache size. As mentioned before, the cache might be shared with multiple other cores. Currently {There definitely will sometime soon be a better way!} the only way to get correct information without hardcoding knowledge is through the /sys filesystem. In Table 5.2 we have seen the what the kernel publishes about the hardware. A program has to find the directory:

/sys/devices/system/cpu/cpu*/cache

This is listed in Section 6: What Programmers Can Do.
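
As a rough sketch of reading that directory: on the Linux systems I know of, each cpuN/cache directory contains index0, index1, ... subdirectories with small text files such as size (for example "32K"), level and type. That layout is not spelled out in the quoted passage, and the function names below are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Parse a sysfs size string such as "32K" or "8M" into bytes.
// Returns 0 if the string does not start with a digit.
std::size_t parse_sysfs_size(const std::string& text) {
    std::size_t value = 0;
    std::size_t i = 0;
    while (i < text.size() && text[i] >= '0' && text[i] <= '9') {
        value = value * 10 + static_cast<std::size_t>(text[i] - '0');
        ++i;
    }
    if (i == 0) return 0;
    if (i < text.size()) {
        if (text[i] == 'K') value *= 1024;
        else if (text[i] == 'M') value *= 1024 * 1024;
    }
    return value;
}

// Read the size of one cache index of one CPU, e.g. cpu 0, index 0.
// Returns 0 if the file cannot be read (non-Linux system, container, ...).
std::size_t sysfs_cache_size(int cpu, int index) {
    std::ifstream in("/sys/devices/system/cpu/cpu" + std::to_string(cpu) +
                     "/cache/index" + std::to_string(index) + "/size");
    std::string text;
    if (!(in >> text)) return 0;
    return parse_sysfs_size(text);
}
```

Which index corresponds to which cache varies between machines, so a real program should also read the level and type files next to size instead of assuming that index0 is the L1 data cache.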

He also describes a short test right under Figure 6.5 which can be used to determine L1D cache size if you can't get it from the OS.

There is one more thing I ran across in his paper: sysconf(_SC_LEVEL2_CACHE_SIZE) is a library call on Linux (a glibc extension rather than an actual system call) which is supposed to return the L2 cache size, although it doesn't seem to be well documented.
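
A hedged sketch of that call, wrapped so that it still compiles where the constant is missing. The wrapper name l2_cache_size_bytes is my own, and treating non-positive results as "unknown" is an assumption about how to handle systems that report 0 or -1.

```cpp
#include <cassert>
#include <unistd.h>

// Returns the L2 cache size in bytes as reported by sysconf, or 0 if the
// value is unknown or the constant is not available in this C library.
long l2_cache_size_bytes() {
#ifdef _SC_LEVEL2_CACHE_SIZE
    long size = sysconf(_SC_LEVEL2_CACHE_SIZE);
    return size > 0 ? size : 0;
#else
    return 0;
#endif
}
```

glibc also defines similar constants for other levels, for example _SC_LEVEL1_DCACHE_SIZE and _SC_LEVEL1_DCACHE_LINESIZE, which can be queried the same way.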

Robert S. Barnes
A: 

Preallocate a large array. Then access each element sequentially and record the time for each access. Ideally, there will be a jump in access time when a cache miss occurs. From that you can estimate your L1 cache size. It might not work, but it's worth trying.
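
One way this could be sketched: instead of timing single accesses (which is too noisy), time many accesses over buffers of growing size and look for the size at which the average cost jumps. The function names, the 64-byte stride and the access count are all arbitrary choices for illustration, and hardware prefetching can easily hide the jump for a regular pattern like this one.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Average nanoseconds per access when repeatedly walking a buffer of `bytes`
// bytes, touching one byte every `stride` bytes.
double ns_per_access(std::size_t bytes, std::size_t stride, std::size_t accesses) {
    if (bytes == 0 || accesses == 0) return 0.0;
    std::vector<unsigned char> buffer(bytes, 1);
    // volatile keeps the compiler from optimizing the accesses away.
    volatile unsigned char* data = buffer.data();
    std::size_t pos = 0;
    auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < accesses; ++i) {
        data[pos] = static_cast<unsigned char>(data[pos] + 1);
        pos = (pos + stride) % bytes;  // wrap around inside the buffer
    }
    auto end = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = end - start;
    return elapsed.count() / static_cast<double>(accesses);
}

// Measure working sets of 1 KiB, 2 KiB, 4 KiB, ... up to max_bytes.
// Returns (working set size in bytes, average ns per access) pairs.
std::vector<std::pair<std::size_t, double>> sweep_working_sets(std::size_t max_bytes) {
    std::vector<std::pair<std::size_t, double>> results;
    for (std::size_t bytes = 1024; bytes <= max_bytes; bytes *= 2) {
        results.emplace_back(bytes, ns_per_access(bytes, 64, 1u << 20));
    }
    return results;
}
```

A caller would run something like sweep_working_sets(64 * 1024 * 1024) and pick the last size before the biggest increase in time per access. The numbers depend heavily on the machine, so treat the result as an estimate at best.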

ben
A: 

Interestingly enough, I wrote a program to do this a while ago (in C, though I'm sure it will be easy to incorporate into C++ code).

http://github.com/wowus/CacheLineDetection/blob/master/Cache%20Line%20Detection/cache.c

The get_cache_line function is the interesting one: it returns the location right before the biggest spike in the timing data of array accesses. It guessed correctly on my machine! If nothing else, it can help you make your own.

It's based off of this article, which originally piqued my interest: http://igoro.com/archive/gallery-of-processor-cache-effects/

Clark Gaebel