views:

294

answers:

1

Hi, This is doing my head in.

I'm trying to implement some "lock-free" code and am using CAS (gcc __sync_val_compare_and_swap) to do he heavy lifting.

My problem can be shown with the following code.

volatile bool lock;
void *locktest( void *arg )
{
    for ( int i = 0 ; i < 100000 ; ++i )
    {
        // acquire a lock
        while( __sync_val_compare_and_swap( &lock, false, true ) == true )
        {
            // Spin while we don't acquire
        }

        // make sure we have the lock
        assert( lock == true );

        // release the lock
        assert( __sync_val_compare_and_swap( &lock, true, false ) == true );
    }
}

Ok, if I run the above code in 10 concurrent threads, all is well.

However, if I change the code to read

        // acquire a lock
        while( __sync_val_compare_and_swap( &lock, lock, true ) == true )

Notice I've changed "false" to "lock".

All hell breaks loose and the assertion

        // make sure we have the lock
        assert( lock == true );

Fires. Can anyone explain why this makes a difference ?

Thx Mark.

+2  A: 

It looks to me like __sync_val_compare_and_swap will always return the old value of the variable, even if no swap took place. In this case, suppose another thread holds the lock just before you try to acquire it - then lock is true, and you're calling __sync_val_compare_and_swap(&lock, true, true);. Just before the actual atomic compare-and-swap (but after the function arguments are determined), the other thread releases the lock - lock becomes false. The compare_and_swap then will return false, but will not have performed the swap operation, because the value it compared to was not the value in the lock. This thread didn't perform the swap, so the value of lock remains false, triggering your assertion.

Incidentally, I strongly suggest making lock a volatile bool. You don't want the compiler optimizing the references to such variables.

Aidan Cully
Thanks Aidan. I think you've cleared it up in my mind.. It's funny how I was so close to the code that I couldn't even think. This was a bug in some production code and I intuitively thought it was wrong but I couldn't for the life of me get my head around why.
ScaryAardvark
Yes, you're right.. Our production code does have it as a volatile and my quick test code didn't. I'll amend my question accordingly :)
ScaryAardvark
Even better might be `volatile sig_atomic_t`.
Aidan Cully