I'm not quite sure if it's safe to spin on a volatile variable in user-mode threads, to implement a light-weight spin_lock, I looked at the tbb source code, tbb_machine.h:170,
//! Spin WHILE the value of the variable is equal to a given value
/** T and U should be comparable types. */
template<typename T, typename U>
void spin_wait_while_eq( const volatile T& location, U value ) {
atomic_backoff backoff;
while( location==value ) backoff.pause();
}
And there is no fences in atomic_backoff class as I can see. While from other user-mode spin_lock implementation, most of them use CAS (Compare and Swap).