tags:

views:

151

answers:

3

I've been thinking of how to implement a lock-free singly linked list. And to be honest, I don't see many bullet proof ways to do it. Even the more robust ways out there that use CAS end up having some degree of the ABA problem.

So I got to thinking. Wouldn't a partially lock-free system be better than always using locks? May some operations can be atomic and lock-free? If I can do that, it should still be thread safe.

So, onto the question. I am thinking of a simple singly linked list. 2 main operations. push and pop. push always inserts at the front. Something like this:

void push(int n) {
    T *p = new T;
    p->n = n;
    p->next = root;
    root = p;
}

And pop which always takes the first element. Something like this:

T *pop() {
    T *p = root;
    root = root->next;
    return p;
}

Obviously push is non-trivial enough that a simple lock-free approach is probably not going to happen. But pop looks maybe doable. Using gcc-intrinsics I've thought of this:

T *pop() {
    return __sync_lock_test_and_set(&root, root->next);
}

Functionally equivalent? yup. Lock-free? yup. Thread Safe? I Don't Know. My gut reaction is no, and here's why.

I am concerned about the fact that one of the parameters to the test_and_set has to dereference memory. What if root changes between the root->next and the call to __sync_lock_test_and_set.

I suppose that this code is equivalent to this:

T *pop() {
    T *temp = root->next;
    // are we broken if a push/pop happens here?
    return __sync_lock_test_and_set(&root, temp);
}

So, like I said, I think that this code is not correct. But can anyone say for certain that I am drawing the right conclusions (I'd hate to write off something that works nicely). If it is in fact broken as I suspect. Is there any simple solution?

+2  A: 

You're correct. In C++, the arguments to a function are evaluated in any order, but certainly your compiler has no way of knowing that root->next is an atomic operation in your sequence.

Consider two threads calling pop(): One thread evaluates root->next, then the other evaluates root->next, and both call test_and_set(). Now you've only popped one node.

rlbond
A: 

In both versions of pop:

T *pop() {
    T *p = root;
    root = root->next;
    return p;
}

and

T *pop() {
    return __sync_lock_test_and_set(&root, root->next);
}

You have an error, already, which is that you aren't verifying that your list/stack isn't empty before reading from the supposed root node.

This compounds the problem you mentioned about having to dereference root to get at next before test_and_set even happens. It essentially becomes a test_and_then_test_and_set operation, where the and_then signifies that more than one step is needed.

Your first version of pop needs to be:

T *pop() {
    T *p = root;
    if (root) {
        root = root->next;
    }
    return p;
}

and as I'm sure you can see this puts even more steps in the mix.

nategoose
Sure, clearly a pop on an empty list is undefined behavior. But that's somewhat a side issue to the real question.
Evan Teran
+1  A: 

Two things: (1) test&set only has consensus number of 2; for such a weak synchronization primitive it is enough to just use read/write memory barriers without incurring the overhead of the specialized instructions; (2) the ABA problem is a real problem with woefully few solutions; however, with a CAS (cmpxchg8b on 32-bit systems and cmpxchg16b on 64-bit systems for x86/-64) there is enough space in the upper portion of the register to store such a large time-stamp that ABA never occurs in practice (in even fairly contrived settings it would require one thread to stall for several days or weeks and then to wake-up at precisely the correct moment).

I think, though, you are trying to implement a lock-free queue (and not a list). Queues are far easier to implement than lists. The papers "An optimistic approach to lock-free FIFO queues" by Edya Lazan-Mozes and Nir Shavit and "Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms" by Maged M. Michael and Michael L. Scott are both very informative and easy to approach for the implementation of a lock-free queue.

However, if you insist on a lock-free linked-list consider the implementation in "Lock-Free Linked Lists and Skip Lists" by Michail Fomitchev and Eric Ruppert. You might also look into Damian Dechev's lock-free dynamic array (there is a link off of Wikipedia).

thechao
really what I am trying to implement is a free-list (linked list based stack). Since I push/pop on the same end :-). Thanks for your answer.
Evan Teran
Then you should read "A Scalable Lock-free Stack Algorithm" by Danny Hendler, Nir Shavit, and Lena Yerushalmi. Also, very approachable and easy to implement. Although, a I have found that a lock-free queue (for a free list) will have (slightly) better run-time characteristics than a stack; it depends on your platform.
thechao