I wrote a multi-threaded app to benchmark the speed of running LOCK CMPXCHG (x86 ASM).

On my machine (a dual-core Core 2), with 2 threads running and accessing the same variable, I can perform about 40M ops/second.

Then I gave each thread a unique variable to operate on. Obviously this means there's no lock contention between the threads, so I expected a speedup. However, the speed didn't change. Why?

+8  A: 

If you have 2 threads simultaneously accessing data that's on the same cache line, you get false sharing: every write by one core invalidates the other core's copy of that line, so the line keeps bouncing between the two caches even though the threads never touch the same variable.

Allocate the unique variables in different blocks of memory (at least 128 bytes apart, say) to rule this out as the issue you're having.
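One way to get that separation, as a rough sketch rather than the asker's actual code: wrap each variable in an over-aligned struct so that neighbouring elements of an array land 128 bytes apart. The names `kSeparation`, `PaddedCounter` and `cas_increment` are made up for illustration, and the 128-byte figure is simply the "say" value from above, not a measured cache-line size. Each thread would call `cas_increment` on its own element.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumption: 128 bytes, taken from the "at least 128 bytes apart" advice,
// not queried from the hardware.
constexpr std::size_t kSeparation = 128;

// Hypothetical wrapper: alignas rounds both the alignment and the size of
// the struct up to kSeparation, so adjacent array elements never share a
// 128-byte block.
struct alignas(kSeparation) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kSeparation,
              "each counter should occupy its own 128-byte block");

// Increment the counter n times with a compare-and-swap retry loop.
// On x86 compilers typically lower compare_exchange to LOCK CMPXCHG.
inline void cas_increment(PaddedCounter& counter, long n) {
    for (long i = 0; i < n; ++i) {
        long expected = counter.value.load(std::memory_order_relaxed);
        // On failure, expected is refreshed with the current value; retry.
        while (!counter.value.compare_exchange_weak(expected, expected + 1)) {
        }
    }
}
```

With `PaddedCounter counters[2];`, thread 0 would work on `counters[0]` and thread 1 on `counters[1]`. Dropping the `alignas` puts the two counters 8 bytes apart on a typical 64-bit build, which is the layout that can trigger false sharing.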

DDJ has a nice article describing the horrible effects of false sharing: http://www.drdobbs.com/go-parallel/article/showArticle.jhtml?articleID=217500206

Here's Wikipedia's entry on it: http://en.wikipedia.org/wiki/False_sharing

Gabe
That explains it! Thanks :)
IanC
I tested it and it works at about double the speed now, as expected. Great!
IanC
