views:

38

answers:

2

I am writing a multi-threaded program in C where one core periodically grabs an item from the head of a linked list while other cores append items to the back of the list (using CAS magic for thread safety, someone else provided that for me). It appears that my program will run faster if the core taking an item from the head of the list simply initiates a prefetch for the next item, which is sure to be in another core's cache.

Currently I am targeting an AMD Opteron 6168, compiling with gcc on Debian Linux. I've tried to find documentation for this, but I am in unfamiliar waters. All I can find is using -O3 to enable compiler-inserted prefetching (I think for loops) and some mentions of the AMD prefetch instruction names like PREFETCHW.

I do not know how to find the reference for what I'm after, or how to insert a statement like that into C, maybe as a block of assembly?

+2  A: 

Check the Intel architecture docs.

In VC, you should be able to do something like this

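__asm
{
  mov eax, POINTER_NAME   ; load the pointer value (inline asm is 32-bit x86 only in VC)
  prefetch [eax]          ; the instruction takes a memory operand
}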

In GCC -

asm("prefetch %0", POINTER_NAME); //May have syntax slightly off

I've looked this over before.

Paul Nathan
The GCC version probably needs to be `asm volatile("prefetch (%0)" : : "r" (POINTER_NAME));` This will load the pointer value into an available register and then issue the instruction with that register as its address operand. The assembly string is an instruction template and the %0 is replaced with the 0th argument; the parentheses around it make it a memory reference, which is what the prefetch instructions expect. I'm not 100% on this, though.
nategoose
@nate: yeah, I'm not swearing by my gcc inline syntax. It regularly infuriates me and sends me back to the manuals to look things up - gimme VS syntax any day.
Paul Nathan
@Paul: The GCC syntax allows for the optimization passes to treat the inlined assembly much more like the code that it generates itself, but it is not at all intuitive.
nategoose
Your answers led me to a snippet of Linux kernel code that demonstrates the syntax I'm looking for, thanks for your help:

static inline void prefetch(void *x)
{
    asm volatile("prefetcht0 %0" :: "m" (*(unsigned long *)x));
}
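For comparison, here is a minimal sketch of the two operand styles discussed in this thread, side by side. The helper names (prefetch_mem, prefetch_reg, prefetch_demo) are made up for illustration, and the non-x86 fallback to __builtin_prefetch is an assumption added so the snippet compiles elsewhere. The "m" constraint lets GCC write out the memory operand itself, while the "r" constraint only supplies a register, so the template has to dereference it with parentheses.

```c
#include <assert.h>
#include <stddef.h>

/* "m" constraint: GCC substitutes a complete memory operand for %0. */
static inline void prefetch_mem(const void *x)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("prefetcht0 %0" : : "m" (*(const char *)x));
#else
    __builtin_prefetch(x);  /* assumption: generic fallback off x86 */
#endif
}

/* "r" constraint: %0 becomes a register name, so (%0) is needed
   to turn it into an address in AT&T syntax. */
static inline void prefetch_reg(const void *x)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("prefetcht0 (%0)" : : "r" (x));
#else
    __builtin_prefetch(x);  /* assumption: generic fallback off x86 */
#endif
}

/* A prefetch is only a hint to the cache hierarchy: the data it
   touches is never modified, so the stored value must survive. */
static int prefetch_demo(void)
{
    int data[16] = { 0 };
    data[5] = 42;
    prefetch_mem(&data[5]);
    prefetch_reg(&data[5]);
    assert(data[5] == 42);
    return data[5];
}
```

Both helpers should end up as a single prefetcht0 instruction on x86; which one reads better is mostly a matter of taste.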
dim fish
+2  A: 

gcc comes with some builtin functions for that. You can do

__builtin_prefetch(&yourData);
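
As a rough sketch of how this could look for the list in the question: the node layout and the names pop_head and demo_sum below are hypothetical, and the CAS-based append side is left out entirely. Note that __builtin_prefetch takes the address to be prefetched, so when you already hold a pointer to the next node you pass that pointer itself. The two optional arguments must be compile-time constants: rw (0 for read, 1 for write) and locality (0 to 3, where 3 asks to keep the line in all cache levels).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
struct node {
    struct node *next;
    int value;
};

/* Unlink the head item and hint the CPU to start fetching the
   node that will be taken next. Single consumer assumed. */
static struct node *pop_head(struct node **head)
{
    struct node *item = *head;
    if (item == NULL)
        return NULL;
    *head = item->next;
    if (item->next != NULL)
        __builtin_prefetch(item->next, 0, 3);  /* read, high locality */
    return item;
}

/* Single-threaded demo: drains a three-node list and sums the values. */
static int demo_sum(void)
{
    struct node c = { NULL, 3 };
    struct node b = { &c, 2 };
    struct node a = { &b, 1 };
    struct node *head = &a;
    struct node *n;
    int sum = 0;

    while ((n = pop_head(&head)) != NULL)
        sum += n->value;
    return sum;
}
```

If the consumer is going to modify the next node, passing 1 for rw may be the better hint; whether that actually maps to PREFETCHW depends on the target flags and GCC version, so it is worth checking the generated assembly.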
nos
wow, cool, this is even nicer!
dim fish
Oh hey, that's cool!
Paul Nathan