tags:

views:

389

answers:

5

As Scott Meyers and Andrei Alexandrescu outlined in this article the simple try to implement the double-check locking implementation is unsafe in C++ specifically and in general on multi-processor systems without using memory barriers.

I was thinking a little bit about that and came to a solution that avoids using memory barriers and should also work 100% safe in C++. The trick is to store a copy of the pointer to the instance thread-local so each thread has to acquire the lock for the first times it access the singleton.

Here is a little sample code (syntax not checked; I used pthread but all other threading libs could be used):

class Foo
 { 
 private:
   Helper *helper;
   pthread_key_t localHelper;
   pthread_mutex_t mutex;
 public:
   Foo()
     : helper(NULL)
   {
     pthread_key_create(&localHelper, NULL);
     pthread_mutex_init(&mutex);
   }
   ~Foo()
   {
     pthread_key_delete(&localHelper);
     pthread_mutex_destroy(&mutex);
   }
   Helper *getHelper()
   {
     Helper *res = pthread_getspecific(localHelper);
     if (res == NULL)
     {
       pthread_mutex_lock(&mutex);
       if (helper == NULL)
       {
         helper = new Helper();
       }
       res = helper;
       pthread_mutex_unlock(&mutex);
       pthread_setspecific(localHelper, res);
     }
     return res;
   }
 };

What are your comments/opinions?

Do you find any flaws in the idea or the implementation?

EDIT:

Helper is the type of the singleton object (I know the name is not the bet...I took it from the Java examples in the Wikipedia article about DCLP). Foo is the Singleton container.

EDIT 2:

Because it seems to be a little bit misunderstanding that Foo is not a static class and how it is used, here an example of the usage:

static Foo foo;

.
.
.

foo.getHelper()->doSomething();

.
.
.

The reason that Foo's members are not static is simply that I was able to create/destroy the mutex and the TLS in the constructor/destructor. If a RAII version of a C++ mutex / TLS class is used Foo can easily be switched to be static.

+4  A: 

You seem to be calling:

pthread_mutex_init(&mutex);

in the Helper() constructor. But that constructor is itself called in the function getHelper() (which should be static, I think) which uses the mutex. So the mutex appears to be initialised twice or not at all.

I find the code very confusing, I must say. Double-checked locking is not that complex. Why don't you start again, and this time create a Mutex class, which does the initialisation, and uses RAI to release the underlying pthread mutex. Then use this Mutex class to implement your locking.

anon
I edited the question...The names of the constructor/destructor were wrong.Yes, I could use a RAII version of locking/unlocking the mutex. That is simply possible but is not core part of the logic.
rstevens
+1  A: 

This isn't the double-checked locking pattern. Most of the potential thread safety issues of the pattern are due to the fact the the a common state is read outside of a mutually exclusive lock, and then re-checked inside it.

What you are doing is checking a thread local data item, and then checking the common state inside a lock. This is more like a standard single check singleton pattern with a thread local cached value optimization.

To a casual glance it does look safe, though.

Charles Bailey
It is indeed not the exact DCLP but the idea was to have a replacement pattern that is safe combined with the advantage of DCLP (no need to lock the critical section at each access).
rstevens
That's fine, but your question reads like it's supposed to be DCLP implementation not a DCLP replacement. When using a lock/check singelton creation approach there's nothing to stop you caching the pointer in a thread once its created. What you've done is effectively provide this automatically.
Charles Bailey
I changed the title so now it is a replacement instead of an implementation :-)
rstevens
+1  A: 

Looks interesting! Clever use of thread-local storage to reduce contention.

But I wonder if this is really different from the problematic approach outlined by Meyers/Alexandrescu...?

Say you have two threads for which the singleton is uninitialized (e.g. thread local slot is empty) and they run getHelper in parallel.

Won't they get into the same race over the helper member? You're still calling operator new and you're still assigning that to a member, so the risk of rogue reordering is still there, right?

EDIT: Ah, I see now. The lock is taken around the NULL-check, so it should be safe. The thread-local replaces the "first" NULL-check of the DCLP.

Kim Gräsman
A: 

Are you trying to make a thread-safe singleton or, implement Meyers & Andrescu's tips? The most straightforward thing is to use a

pthread_once

as is done below. Obviously you're not going for lock-free speed or anything, creating & destroying mutexes as you are so, the code may as well be simple and clear - this is the kind of thing pthread_once was made for. Note, the Helper ptr is static, otherwise it'd be harder to guarantee that there's only one Helper object:

// Foo.h

#include <pthread.h>

class Helper {};

class Foo
{
private:
   static Helper* s_pTheHelper;
   static ::pthread_once_t once_control;

private:
   static void createHelper() { s_pTheHelper = new Helper(); }

public:
   Foo()
   { // stuff
   }

   ~Foo()
   { // stuff
   }

   static Helper* getInstance()
   {
      ::pthread_once(&once_control, Foo::createHelper);
      return s_pTheHelper;
   }
};

// Foo.cpp
// ..
Helper* Foo::s_pTheHelper = NULL;
::pthread_once_t Foo::once_control = PTHREAD_ONCE_INIT;
// ..
JDonner
Your code creates a singleton instance per thread which was not what I intended or doing in my code.My approach was to give another option to the ones Scott Meyers and Andrei Alexandrescu gave. An option that combines the "real" singletoness without locking for each access and without using memory barriers/fences.
rstevens
Your code uses pthread_key_create and pthread_getspecific suggesting that you wanted a singleton per thread(in which case the locking semantics becomes easier (as in you don't need locking.) If you don't want singelton-per-thread ,what's the pthread_getspecific used for ?
nos
@nos: I hold a copy of the pointer (to the same singleton) in each thread. This avoids the need to lock to check/create the singleton after the first run of the thread.The lock is needed to create the instance only once per program, not per thread.
rstevens
@rstevens(1): No, mine is not per-thread, pthread_once is once per app, not once per thread. And, the pointer's being static is per-app also.One theoretical problem with making Foo a static instanceinstead of Helper a static member is that it relies on everyone's discipline to not make another Foo somewhere. A static member is guaranteed unique by the compiler+linker.
JDonner
@rstevens: If you make Helper a static member, you can still deleteit in ~Foo(). Then, you won't be tempted to think you needthread-specific stuff. You know of course, that once Helper isallocated it can be read by all threads (every mutex lock or unlockincludes a memory barrier, meaning that all global variable changeswill be flushed to other threads). I don't think the thread-specificstuff buys you anything in any case.
JDonner
(urk - "flushed to other threads" -> "flushed to other processors")
JDonner
+1  A: 

Obviously a few people are misunderstanding your intent/solution. Something to think about.

I think the real question to be asked is this:

Is calling pthread_getspecific() cheaper than a memory barrier?

As I don't need any memory barrier in my algorithm, this cannot be the question.
rstevens
Exactly. You use pthread_getspecific exactly where a typical implementation would use a memory barrier.Which is more efficient?