Just want to document the answer to this specific question... a similar question (with potential answers was asked here)
All platforms welcome, please specify the platform for your answer.
Just want to document the answer to this specific question... a similar question (with potential answers was asked here)
All platforms welcome, please specify the platform for your answer.
On x86, you can use the CPUID instruction with function 2 to determine various properties of the cache and the TLB. Parsing the output of function 2 is somewhat complicated, so I'll refer you to section 3.1.3 of the Intel Processor Identification and the CPUID Instruction (PDF).
To get this data from C/C++ code, you'll need to use inline assembly, compiler intrinsics, or call an external assembly function to perform the CPUID instruction.
You can also try to do it programmatically by measuring some timing. Obviously, it won't always be as precise as cpuid and the likes, but it is more portable. ATLAS does it at its configuration stage, you may want to look at it:
On Linux (with a reasonably recent kernel), you can get this information out of /sys:
/sys/devices/system/cpu/cpu0/cache/
This directory has a subdirectory for each level of cache. Each of those directories contains the following files:
coherency_line_size
level
number_of_sets
physical_line_partition
shared_cpu_list
shared_cpu_map
size
type
ways_of_associativity
This gives you more information about the cache then you'd ever hope to know, including the cacheline size as well as what CPUs share this cache. This is very useful if you are doing multithreaded programming with shared data (you'll get better results if the threads sharing data are also sharing a cache).
from http://blogs.msdn.com/oldnewthing/archive/2009/12/08/9933836.aspx
The GetLogicalProcessorInformation function will give you characteristics of the logical processors in use by the system. You can walk the SYSTEM_LOGICAL_PROCESSOR_INFORMATION returned by the function looking for entries of type RelationCache. Each such entry contains a ProcessorMask which tells you which processor(s) the entry applies to, and in the CACHE_DESCRIPTOR, it tells you what type of cache is being described and how big the cache line is for that cache.
On Linux look at sysconf(3).
sysconf (_SC_LEVEL1_DCACHE_LINESIZE)
You can also get it from the command line using getconf:
$ getconf LEVEL1_DCACHE_LINESIZE
64
I have been working on some cache line stuff and needed to write a cross-platform function. I posted it to my new blog here, or you can just use the source below. Feel free to do whatever you want with it.
#ifndef GET_CACHE_LINE_SIZE_H_INCLUDED
#define GET_CACHE_LINE_SIZE_H_INCLUDED
// Author: Nick Strupat
// Date: October 29, 2010
// Returns the cache line size (in bytes) of the processor, or 0 on failure
#include <stddef.h>
size_t cache_line_size();
#if defined(__APPLE__)
#include <sys/sysctl.h>
size_t cache_line_size() {
size_t line_size = 0;
size_t sizeof_line_size = sizeof(line_size);
sysctlbyname("hw.cachelinesize", &line_size, &sizeof_line_size, 0, 0);
return line_size;
}
#elif defined(_WIN32)
#include <stdlib.h>
#include <windows.h>
size_t cache_line_size() {
size_t line_size = 0;
DWORD buffer_size = 0;
DWORD i = 0;
SYSTEM_LOGICAL_PROCESSOR_INFORMATION * buffer = 0;
GetLogicalProcessorInformation(0, &buffer_size);
buffer = (SYSTEM_LOGICAL_PROCESSOR_INFORMATION *)malloc(buffer_size);
GetLogicalProcessorInformation(&buffer[0], &buffer_size);
for (i = 0; i != buffer_size / sizeof(SYSTEM_LOGICAL_PROCESSOR_INFORMATION); ++i) {
if (buffer[i].Relationship == RelationCache && buffer[i].Cache.Level == 1) {
line_size = buffer[i].Cache.LineSize;
break;
}
}
free(buffer);
return line_size;
}
#elif defined(linux)
#include <stdio.h>
size_t cache_line_size() {
FILE * p = 0;
p = fopen("/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size", "r");
unsigned int i = 0;
if (p) {
fscanf(p, "%d", &i);
fclose(p);
}
return i;
}
#else
#error Unrecognized platform
#endif
#endif