I've seen a lot of articles claiming that double-checked locking in C++, commonly used to prevent multiple threads from trying to initialize a lazily created singleton, is broken. Normal double-checked locking code reads like this:
class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;
public:
    static singleton & instance()
    {
        static singleton* instance;
        if(!instance)
        {
            boost::mutex::scoped_lock lock(_init_mutex);
            if(!instance)
                instance = new singleton;
        }
        return *instance;
    }
};
The problem apparently is the line assigning instance -- the compiler is free to allocate the memory, construct the object, and then assign the pointer, OR to allocate the memory, assign the pointer, and only then run the constructor. The latter case breaks the idiom -- one thread may allocate the memory and assign the pointer but not run the singleton's constructor before it gets put to sleep -- then the second thread will see that the instance isn't null and try to return it, even though it hasn't been constructed yet.
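To make that window concrete, here is a minimal single-threaded sketch that writes out the problematic order by hand. The names widget, g_ctor_ran, g_instance and publish_then_construct are hypothetical and exist only for illustration; a real compiler would do this reordering invisibly, and the snapshot below just stands in for what a second thread could observe at that point.

```cpp
#include <cassert>
#include <new>

// Hypothetical flag, set by the constructor so the half-built state is observable.
static bool g_ctor_ran = false;

// Hypothetical stand-in for the singleton class.
struct widget {
    widget() { g_ctor_ran = true; }
};

static widget* g_instance = nullptr;

// Hand-written version of "instance = new singleton" in the problematic order:
// allocate, publish the pointer, and only then construct.
// Returns true if, right after publishing, an observer would have seen a
// non-null pointer to storage whose constructor had not run yet.
bool publish_then_construct() {
    void* raw = ::operator new(sizeof(widget));   // 1. allocate raw memory
    g_instance = static_cast<widget*>(raw);       // 2. publish the pointer early
    // Snapshot of what a second thread could see if it ran at this point.
    bool window_open = (g_instance != nullptr) && !g_ctor_ran;
    new (raw) widget;                             // 3. construct in place
    return window_open;
}
```

The object is deliberately never freed, as with the singleton above. In the double-checked version, a second thread arriving between steps 2 and 3 would pass the first if(!instance) check and return a reference to unconstructed storage.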
I saw a suggestion to use a thread local boolean and check that instead of instance. Something like this:
class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;
    static boost::thread_specific_ptr<int> _sync_check;
public:
    static singleton & instance()
    {
        static singleton* instance;
        if(!_sync_check.get())
        {
            boost::mutex::scoped_lock lock(_init_mutex);
            if(!instance)
                instance = new singleton;
            // Any non-null pointer would work, we're really just using it as a
            // thread specific bool. It has to be a real allocation because
            // thread_specific_ptr deletes the stored pointer at thread exit.
            _sync_check.reset(new int(1));
        }
        return *instance;
    }
};
This way each thread ends up checking whether the instance has been created once, but stops after that, which entails some performance hit but still not nearly as bad as locking on every call. But what if we just used a local static bool?
class singleton {
private:
    singleton(); // private constructor so users must call instance()
    static boost::mutex _init_mutex;
public:
    static singleton & instance()
    {
        static bool sync_check = false;
        static singleton* instance;
        if(!sync_check)
        {
            boost::mutex::scoped_lock lock(_init_mutex);
            if(!instance)
                instance = new singleton;
            sync_check = true;
        }
        return *instance;
    }
};
Why wouldn't this work? Even if sync_check were to be read by one thread while it's being assigned in another, the garbage value will still be nonzero and thus true. This Dr. Dobb's article claims that you have to lock because you'll never win a battle with the compiler over reordering instructions. Which makes me think this must not work for some reason, but I can't figure out why. If the requirements on sequence points are as loose as the Dr. Dobb's article makes me believe, I don't understand why any code after the lock couldn't be reordered to be before the lock. Which would make C++ multithreading broken, period.
I guess I could see the compiler being allowed to specifically reorder sync_check to be before the lock because it's a local variable (and even though it's static we're not returning a reference or pointer to it) -- but then this could still be solved by making it a static member (effectively global) instead.
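For comparison, here is a rough sketch of what the flag version would look like if the ordering were spelled out explicitly. It assumes a C++11 compiler with std::atomic and std::mutex, which is not what the Boost-based code above uses, so treat it as an illustration of the ordering the plain bool does not provide rather than a drop-in answer. The value member and the names _ready and _ptr are made up for the example, and the statics are moved to class scope for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

class singleton {
private:
    singleton() : value(42) {}            // private constructor, as in the original
    static std::mutex _init_mutex;
    static std::atomic<bool> _ready;      // plays the role of sync_check
    static singleton* _ptr;               // only written while holding the mutex
public:
    int value;                            // hypothetical payload, for the usage check
    static singleton & instance()
    {
        // Acquire load: a thread that sees true here also sees everything
        // written before the matching release store, including the fully
        // constructed object and _ptr.
        if(!_ready.load(std::memory_order_acquire))
        {
            std::lock_guard<std::mutex> lock(_init_mutex);
            if(!_ptr)
                _ptr = new singleton;
            // Release store: may not become visible before the construction above.
            _ready.store(true, std::memory_order_release);
        }
        return *_ptr;
    }
};

std::mutex singleton::_init_mutex;
std::atomic<bool> singleton::_ready(false);
singleton* singleton::_ptr = nullptr;
```

Calling singleton::instance() repeatedly returns the same object. The difference from the plain static bool is that nothing stops the compiler or CPU from making sync_check = true visible before the constructor's writes, whereas the release/acquire pair forbids exactly that.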
So will this work or won't it? Why?