views: 209 · answers: 6

If I have something like this...

volatile long something_global = 0;

long some_public_func()
{
    return something_global++;
}

Would it be reasonable to expect this code to not break (race condition) when accessed with multiple threads? If it's not standard, could it still be done as a reasonable assumption about modern compilers?

NOTE: ALL I'm using this for is atomic increment and decrement - nothing fancier.

+13  A: 

No - volatile does not mean synchronized. It just means that every access will return the most up-to-date value (as opposed to a copy cached locally in the thread).

Post-increment is not an atomic operation: it is a memory read followed by a memory write. Interleaving two such read-modify-write sequences can mean that the value is actually incremented just once.

danben
What if I used preincrement?
Clark Gaebel
No, the only difference between the two is which value is returned. Pre-increment is also a memory access followed by a memory write.
danben
Aaaaaaaaaaargh.
Clark Gaebel
It's not the end of the world to put a mutex lock/unlock around the operation. People fear that a thread might pend now and then, but in this case the protected operation is exceedingly short and the probability of two threads reaching it at exactly the same time is small (but finite).
Amardeep
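The mutex approach Amardeep describes is only a few lines (a sketch using POSIX threads; the names are illustrative):

```c
#include <pthread.h>

static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
static long next_id = 0;

long get_unique_id(void)
{
    pthread_mutex_lock(&id_lock);
    long id = next_id++;        /* protected: no interleaving possible */
    pthread_mutex_unlock(&id_lock);
    return id;
}
```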
The code is basically just a unique id factory. However, it's called often enough to make mutexing painful. Therefore, I was looking for a lock-free implementation. (This is on a server in case you're wondering).
Clark Gaebel
Amardeep is right: the memory barrier is the expensive part, and it happens whether you use an atomic increment or a lock protecting an increment.
Artelius
@wowus: did you measure the overhead? Another possibility here is to use `sem_t`, if you are only interested in incrementing atomically. For reading it you would still have to protect it with a mutex, though.
Jens Gustedt
@wowus: Why not just have your unique IDs be the thread ID concatenated with a thread-local counter? Then each thread can generate its own IDs without locking.
caf
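caf's scheme might be sketched like this (a sketch, not a definitive implementation; it assumes at most 65,536 threads and IDs that fit a 16/48-bit split, and all names are illustrative):

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t index_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t next_thread_index = 0;

static __thread uint64_t thread_index;      /* assigned once per thread */
static __thread uint64_t local_counter = 0;
static __thread int thread_registered = 0;

uint64_t get_unique_id(void)
{
    if (!thread_registered) {
        /* One lock acquisition per thread lifetime, not per ID. */
        pthread_mutex_lock(&index_lock);
        thread_index = next_thread_index++;
        pthread_mutex_unlock(&index_lock);
        thread_registered = 1;
    }
    /* Upper 16 bits identify the thread; lower 48 bits are the
       thread-local counter, so no locking is needed per ID. */
    return (thread_index << 48) | (local_counter++ & 0xFFFFFFFFFFFFULL);
}
```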
caf's solution is a good one. And since you won't be sharing the counter, you avoid cache-line bouncing.
ninjalj
Also, with respect to atomics: they're coming in C1x, whenever that finishes the standardization process (GCC will likely implement it sooner, though).
Spudd86
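With the C1x atomics Spudd86 mentions (eventually standardized as `<stdatomic.h>`), the question's function becomes a one-liner; a sketch, assuming a compiler that implements it:

```c
#include <stdatomic.h>

static atomic_long something_global = 0;

long some_public_func(void)
{
    /* Atomic read-modify-write; returns the value before the addition,
       matching the post-increment in the question. */
    return atomic_fetch_add(&something_global, 1);
}
```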
I find the statement "as opposed to a copy cached locally in the thread" unclear. One can have single threaded code that still needs volatile (ie: for signal handler changed values). The caching involved is simpler, and volatile is usually only needed to instruct the compiler not to use an in-register copy or a stack spill copy of a previous load if convienent.
Peeter Joot
@Artelius: Not all platforms require a barrier for an atomic increment (examples: PowerPC, SPARC); they may only need a compare-and-swap or similar mechanism. Whether or not a barrier is required is a different and more complex issue. Best to use a mutex (which typically implies the appropriate barriers when required).
Peeter Joot
+2  A: 

On modern fast multicore processors, there is a significant overhead with atomic instructions due to caching and write buffers.

So compilers won't emit atomic instructions just because you added the volatile keyword. You need to resort to inline assembly or compiler-specific extensions (e.g. gcc atomic builtins).
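With the GCC builtins, for example (a sketch; `__sync_fetch_and_add` returns the value before the addition, like the post-increment in the question, and also acts as a full barrier):

```c
volatile long something_global = 0;

long some_public_func(void)
{
    /* GCC atomic builtin: fetch-and-add with a full memory barrier. */
    return __sync_fetch_and_add(&something_global, 1);
}
```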

I recommend using a library. The easy way is to just take a lock when you want to update the variable. Semaphores will probably be faster if they're appropriate to what you're doing. It seems GLib provides a reasonably efficient implementation.

Artelius
Is there a way to pull just the atomic library out of GLib? I'd rather not have a dependency just for atomics.
Clark Gaebel
+1  A: 

Volatile just prevents certain optimizations, but atomicity needs more: on x86, the instruction must be preceded by a LOCK prefix; on MIPS, the read-modify-write cycle must be wrapped in an LL/SC construct; and so on.
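The LL/SC pattern is essentially a retry loop around a compare-and-swap. Sketched here with GCC's `__sync_val_compare_and_swap` (a portable stand-in; real LL/SC is emitted by the compiler or written in assembly):

```c
long atomic_increment(volatile long *p)
{
    long old, seen;
    do {
        old = *p;                                    /* "load-linked"  */
        seen = __sync_val_compare_and_swap(p, old, old + 1);
    } while (seen != old);  /* "store-conditional" failed: another
                               thread intervened, so retry */
    return old + 1;
}
```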

ninjalj
+4  A: 

No, you must use platform-dependent atomic accesses. There are several libraries that abstract these -- GLib provides portable atomic operations that fall back to mutex locks if necessary, and I believe Boost also provides portable atomics.

As I recently learned, for truly atomic access you need a full memory barrier, which volatile does not provide. All volatile guarantees is that the memory will be re-read at each access and that accesses to volatile memory will not be reordered with respect to each other. The optimizer may still move non-volatile accesses before or after a volatile read/write (possibly into the middle of your increment!), so you must use actual atomic operations.

Michael E
Which boost library is that in? I couldn't find it (but I'm probably just blind).
Clark Gaebel
try the Boost.Thread library http://www.boost.org/doc/libs/1_43_0/doc/html/thread/synchronization.html
Sam Miller
@wowus there seems to be something in boost.interprocess here: http://www.boost.org/doc/libs/1_43_0/boost/interprocess/detail/atomic.hpp can't find docs for it though.
Michael E
+3  A: 

Windows provides InterlockedIncrement (and InterlockedDecrement) to do what you are asking.

Mark Ransom
Kinda what I'm looking for, except I need to be compatible with linux.
Clark Gaebel
Use TBB -- cross-platform (as long as it's x86/x86_64) reliable atomics (among many other things).
Larry Gritz
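A minimal portability wrapper along these lines might look like the following (a sketch; it assumes GCC's `__sync` builtins on the non-Windows side, and `LONG` is `long` on Windows):

```c
#ifdef _WIN32
#include <windows.h>

long atomic_increment(volatile long *p)
{
    /* InterlockedIncrement returns the incremented value. */
    return InterlockedIncrement(p);
}
#else
long atomic_increment(volatile long *p)
{
    /* __sync_add_and_fetch returns the new value, matching
       InterlockedIncrement's semantics. */
    return __sync_add_and_fetch(p, 1);
}
#endif
```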
A: 

Your problem is that C doesn't guarantee atomicity of the increment operators, and in practice they often won't be atomic. You have to use an OS API (like the Windows Interlocked functions) or compiler builtin functions (GCC, MSVC) for that.

Christoph