ansaurus

Question

What's wrong with this fix for double checked locking?

Answer 1

+1 A:

There's some great reading about this (although it's .net/c# oriented) here: http://msdn.microsoft.com/en-us/magazine/cc163715.aspx

What it boils down to is that you need to be able to tell the CPU that it cannot reorder your reads/writes for this variable access (CPUs have long been able to reorder certain instructions if they think the logic would be unaffected), and that it needs to ensure that the cache is consistent. Don't forget about that: we devs get to pretend that all memory is just one flat resource, but in reality each CPU core has cache, some unshared (L1), some possibly shared (L2). Your initialization might write to main RAM, but another core might have the uninitialized value in cache. If you don't have any concurrency semantics, the CPU may not know that its cache is dirty.

I don't know the C++ side, but in .net, you would designate the variable as volatile in order to protect access to it (or you would use the Memory read/write barrier methods in System.Threading).

As an aside, I've read that in .net 2.0, double checked locking is guaranteed to work without "volatile" variables (for any .net readers out there) -- that doesn't help you with your c++ code.

If you want to be safe, you will need to do the c++ equivalent of marking a variable as volatile in c#.

JMarsch 2009-06-03 15:05:43

C++ variables can be declared as volatile, but I doubt this has the exact same semantics as C#. I also remember reading somewhere that this was an abuse of volatile but I don't remember why so I can't judge how reasoned the article was.

Joseph Garvin 2009-06-03 15:57:57

In different languages, it might be an abuse (could even be an abuse in c#). One of the really difficult aspects of writing low-lock or lock-free code has been the disparity in guidance. I have spent time reading about it, and even within Microsoft, some of the bloggers seem to contradict each other on when you need a memory fence and when you should use volatile. It's a difficult problem, to be sure.

JMarsch 2009-06-03 16:06:09

There is no equivalent of .NET volatile in current C++ (as defined by the standard). It is one of the things the upcoming C++0x standard will bring. Meanwhile you need to use what your compiler offers (which in Visual Studio means volatile and memory fences).

Suma 2009-06-16 15:56:00

volatile won't change with c++1x: it will keep being only single-thread aware, operating intra-thread. Use atomic<T> in C++1x

Johannes Schaub - litb 2009-06-17 21:58:43
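
For readers who want to see what the atomic<T> suggestion could look like, here is a minimal sketch of double-checked locking using std::atomic from the finalized C++11 standard. It is not the code from the question: the class name Singleton, its members instance_ and mutex_, and the value() accessor are hypothetical and exist only for illustration.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical singleton type, used only for illustration.
class Singleton {
public:
    static Singleton* instance() {
        // Acquire load: a thread that sees a non-null pointer here also
        // sees the fully constructed object it points to.
        Singleton* p = instance_.load(std::memory_order_acquire);
        if (p == nullptr) {
            std::lock_guard<std::mutex> guard(mutex_);
            // Second check under the lock. Relaxed is enough because the
            // mutex already orders this load against the store below.
            p = instance_.load(std::memory_order_relaxed);
            if (p == nullptr) {
                p = new Singleton();
                // Release store: publishes the constructed object
                // together with the pointer.
                instance_.store(p, std::memory_order_release);
            }
        }
        return p;
    }

    int value() const { return value_; }

private:
    Singleton() : value_(42) {}

    static std::atomic<Singleton*> instance_;
    static std::mutex mutex_;
    int value_;
};

std::atomic<Singleton*> Singleton::instance_{nullptr};
std::mutex Singleton::mutex_;
```

The acquire/release pair is what rules out the reordering discussed in this thread: the pointer cannot become visible to another thread before the object's contents do. The instance is intentionally never deleted in this sketch.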

Answer 2

A:

"The latter case breaks the idiom -- two threads might end up creating the singleton."

But if I understand the code correctly, in the first example you check whether the instance already exists (this check might be executed by multiple threads at the same time). If it doesn't, one thread gets to lock the mutex and create the instance, so only one thread can execute the creation at that time. All other threads get locked out and will wait.

Once the instance is created and the mutex is unlocked, the next waiting thread will lock the mutex, but it will not try to create a new instance because the check will fail.

Next time the instance variable is checked it will be set, so no threads will try to create a new instance.

I'm not sure about the case where one thread is assigning the new instance pointer to instance while another thread checks the same variable, but I believe it will be handled correctly in this case.

Am I missing something here?

OK, I'm not sure about the reordering of operations, but in this case it would be altering the logic, so I would not expect it to happen. Then again, I'm no expert on this topic.

stefanB 2009-06-03 15:06:23

You're right -- I was wrong about the actual race condition. The problem is that a second thread may see that instance is non-null and try to return it before the first thread has finished constructing it. I've edited my post.

Joseph Garvin 2009-06-03 15:22:53

Answer 3

+4 A:

Your fix doesn't fix anything, since the writes to sync_check and instance can be done out of order on the CPU. As an example, imagine the first two calls to instance happen at approximately the same time on two different CPUs. The first thread will acquire the lock, initialize the pointer and set sync_check to true, in that order, but the processor may change the order of the writes to memory. On the other CPU it is then possible for the second thread to check sync_check and see that it is true while instance has not yet been written to memory. See Lockless Programming Considerations for Xbox 360 and Microsoft Windows for details.
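
The code from the question is not reproduced here, so the following is only a sketch of one way a flag-plus-pointer scheme could be made safe with C++11 atomics, assuming the flag is a std::atomic<bool>. The name sync_check is borrowed from the description above; Widget, instance_ptr, init_mutex and get_instance are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <mutex>

struct Widget { int value = 7; };  // hypothetical payload type

static Widget* instance_ptr = nullptr;       // plain pointer, written only under the lock
static std::atomic<bool> sync_check{false};  // flag published with release semantics
static std::mutex init_mutex;

Widget* get_instance() {
    // Acquire load pairs with the release store below: seeing `true`
    // guarantees the earlier write to instance_ptr is visible as well.
    if (!sync_check.load(std::memory_order_acquire)) {
        std::lock_guard<std::mutex> guard(init_mutex);
        // Checked under the lock, so at most one thread constructs.
        if (instance_ptr == nullptr) {
            instance_ptr = new Widget();
            // The release store keeps the pointer write from being
            // reordered after the flag write.
            sync_check.store(true, std::memory_order_release);
        }
    }
    return instance_ptr;
}
```

With a plain bool flag instead of the atomic, nothing stops the two writes from becoming visible in the opposite order, which is exactly the failure described above.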

The thread specific sync_check solution you mention should work then (assuming you initialize your pointer to 0).
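
A rough sketch of the thread-specific idea, assuming the C++11 thread_local keyword is used in place of a library type such as thread_specific_ptr. The names Config, shared_instance, shared_mutex and get_config are hypothetical. Each thread takes the lock once, on its first call, and afterwards reads only its own cached pointer, so no cross-thread ordering is needed on the fast path.

```cpp
#include <mutex>

struct Config { int value = 7; };  // hypothetical payload type

static Config* shared_instance = nullptr;  // guarded by shared_mutex
static std::mutex shared_mutex;

Config* get_config() {
    // Per-thread cache of the shared pointer; starts out null in every
    // thread, which plays the role of "initialize your pointer to 0".
    thread_local Config* cached = nullptr;
    if (cached == nullptr) {
        // Slow path: taken once per thread.
        std::lock_guard<std::mutex> guard(shared_mutex);
        if (shared_instance == nullptr) {
            shared_instance = new Config();
        }
        cached = shared_instance;
    }
    return cached;
}
```

Whether this beats always locking depends on how cheap thread-local access is on the target platform, so it is worth measuring rather than assuming.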

coombez 2009-06-16 15:49:47

Concerning your last sentence: yes, but I think (though I'm not sure) that thread_specific_ptr uses a mutex internally. So what would be the point of using that solution versus just always locking the mutex (no double-checked locking)?

n1ck 2010-06-09 19:28:38
