I'm writing a library that makes extensive use of a thread-local variable. Can you point me to some benchmarks that compare the performance of the different ways to get thread-local variables in C++:
- C++0x thread_local variables
- compiler extensions (GCC __thread, ...)
- boost::thread_specific_ptr
- pthread (pthread_getspecific / pthread_setspecific)
- Windows (TlsGetValue / TlsSetValue)
- ...
Does C++0x thread_local perform much better on the compilers that provide it?
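
For context, here is a minimal sketch of the kind of micro-benchmark I have in mind, not a definitive implementation. It assumes GCC or Clang (for `__thread` and the inline-asm compiler barrier), and the names `counter_std`, `counter_gcc`, `clobber_memory`, `time_ns`, `bench_thread_local` and `bench_gcc_thread` are made up for illustration. It only measures single-threaded access cost for two of the mechanisms listed above.

```cpp
#include <chrono>
#include <cstdint>

// Two of the mechanisms from the list (hypothetical names for this sketch).
thread_local std::uint64_t counter_std = 0;  // C++0x/C++11 keyword
__thread std::uint64_t counter_gcc = 0;      // GCC/Clang extension (assumption)

// Compiler barrier (GCC/Clang syntax) so each iteration really loads and
// stores the thread-local instead of keeping it in a register.
inline void clobber_memory() { asm volatile("" ::: "memory"); }

// Call `bump` `iterations` times and return the elapsed time in nanoseconds.
template <typename Bump>
std::int64_t time_ns(std::uint64_t iterations, Bump bump) {
    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) {
        bump();
        clobber_memory();
    }
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}

// One timing function per mechanism; each increments its own counter.
std::int64_t bench_thread_local(std::uint64_t n) {
    return time_ns(n, [] { ++counter_std; });
}

std::int64_t bench_gcc_thread(std::uint64_t n) {
    return time_ns(n, [] { ++counter_gcc; });
}
```

Calling `bench_thread_local(n)` and `bench_gcc_thread(n)` from `main` with a large `n` and printing both results would give a rough comparison. The numbers will depend heavily on the compiler, the optimization flags, and whether the variable lives in the executable or in a shared library.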