views:

236

answers:

2

I've several writers(threads) and one reader on a stl vector.

Normal writes and reads are mutex protected, but I would like to avoid contention on a loop I have and I was wondering if vector::size would be safe enough, I suppose it depends on implementations, but since normally vector dynamic memory is for the stored items the memory where the size is stored shouldn't be invalidated during reallocations.

I don't mind to have false positives, after a size > 0 I'll actually lock and check again so if reading size() while another thread writes doesn't segfault ever it should be safe enough for me.

+7  A: 

I don't know off-hand of an implementation in which concurrent reads and writes to an integer segfault (although the C++03 standard does not prohibit that, and I don't know whether POSIX does). If the vector uses pImpl, and doesn't store the size in the vector object itself, you could maybe have problems where you try to read the size from a pImpl object which has been freed in another thread. For example, GCC on my machine does use a pImpl (and doesn't store the size directly - it's calculated as the difference between begin() and end(), so there's obvious opportunities there for race conditions during modification).

Even if it doesn't crash, though, it might very well give a meaningless or wrong answer. If you don't lock then the value you read could for example be:

  • read non-atomically, meaning you get the most significant half of one value and the least significant half of another. In practice, reading size_t is probably atomic on most implementations, since there are good reasons for size_t to be the natural word size of the architecture. But if it happens, this could read a value as 0 when neither the "before" not the "after" was 0. Consider for example the transition 0x00FF -> 0x0100. If you get the bottom half of the "after" and the top half of "before", you've read 0.

  • arbitrarily stale. Without locking (or some other memory barrier), you could get a value out of a cache. If that cache is not shared with other CPUs/cores, and if your architecture does not have so-called 'coherent caches', then a different CPU or core running a different thread could have changed the size six weeks ago, and you will never see the new value. Furthermore, different addresses might be different amounts stale - without memory barriers, if another thread has done a push_back you could conceivably "see" the new value at the end of your vector but not "see" the increased size.

A lot of these problems are hidden on common architectures. For instance, x86 has coherent caches, and MSVC guarantees full memory barriers when accessing volatile objects. ARM doesn't guarantee coherent caches, but in practice multi-core ARM isn't that common, so double-checked locking normally works there too.

These guarantees solve some difficulties and allow some optimisations, which is why they're made in the first place, but they're not universal. Obviously you can't write multi-threaded code at all without making some assumptions beyond the C++ standard, but the more vendor-specific guarantees you rely on, the less portable your code is. It's not possible to answer your question other than with reference to a particular implementation.

If you're writing portable code, then really you should think of all memory reads and writes as potentially being to your thread's own private memory cache. Memory barriers (including locks) are a means to "publish" your writes and/or "import" writes from other threads. The analogy with version-control systems (or your favourite other example of local copies of anything) is clear, with the difference that things might be published/imported at any time, even if you don't ask them to be. And of course there's no merging or conflict detection, unless the industry has finally implemented transactional memory while I wasn't looking ;-)

In my opinion, multi-threaded code should first avoid shared memory, then lock if absolutely necessary, then profile, then worry about contention and lock-free algorithms. Once you get to the last stage, you need to research and follow well-tested principles and patterns for your particular compiler and architecture. C++0x will help somewhat by standardising some things you can rely on, and some of Herb Sutter's "Effective Concurrency" series goes into details how to make use of these guarantees. One of the articles has an implementation of a lock-free multi-writer single-reader queue, which may or may not be adaptable for your purposes.

Steve Jessop
Wonderfully thorough answer. The analogy with version control is not just apt, but also useful.
quark
+3  A: 

It sounds like you have a producer/consumer problem and you're polling the size of the vector as a flag to consume. It would be better to use a semaphore to control your reader trying to do work, and a mutex to control access to the vector. When the vector is empty, have the reader block on the semaphore until a writer puts something in the vector and increments the semaphore. Then both reader and writers use the mutex to modify the vector itself. Never poll if you can block.

Rob K