There are no threads in standard C++, and Threads cannot be implemented as a library.
Therefore, the standard has nothing to say about the behaviour of programs which use threads. You must look to whatever additional guarantees are provided by your threading implementation.
That said, in threading implementations I've used:
(1) yes, you can assume that irrelevant values aren't written to variables. Otherwise the whole memory model goes out the window. But be careful that when you say "another thread" never sets b
to false, that means anywhere, ever. If it does, that write could perhaps be re-ordered to occur during your loop.
(2) no, the compiler can re-order the assignments to b1 and b2, so it is possible for b1 to end up true and b2 false. In such a simple case I don't know why it would re-order, but in more complex cases there might be very good reasons.
[Edit: oops, by the time I got to answering (2) I'd forgotten that b was volatile. Reads from a volatile variable won't be re-ordered, sorry, so yes on a typical threading implementation (if there is any such thing), you can assume that you won't end up with b1 true and b2 false.]
(3) same as 1. volatile
in general has nothing to do with threading at all. However, it is quite exciting in some implementations (Windows), and might in effect imply memory barriers.
(4) on an architecture where int
writes are atomic yes, although volatile
has nothing to do with it. See also...
(5) check the docs carefully. Likely yes, and again volatile is irrelevant, because on almost all architectures int
writes are atomic. But if int
write is not atomic, then no (and no for the previous question), even if it's volatile you could in principle get a different value. Given those values 7 and 8, though, we're talking a pretty weird architecture for the byte containing the relevant bits to be written in two stages, but with different values you could more plausibly get a partial write.
For a more plausible example, suppose that for some bizarre reason you have a 16 bit int on a platform where only 8bit writes are atomic. Odd, but legal, and since int
must be at least 16 bits you can see how it could come about. Suppose further that your initial value is 255. Then increment could legally be implemented as:
- read the old value
- increment in a register
- write the most significant byte of the result
- write the least significant byte of the result.
A read-only thread which interrupted the incrementing thread between the third and fourth steps of that, could see the value 511. If the writes are in the other order, it could see 0.
An inconsistent value could be left behind permanently if one thread is writing 255, another thread is concurrently writing 256, and the writes get interleaved. Impossible on many architectures, but to know that this won't happen you need to know at least something about the architecture. Nothing in the C++ standard forbids it, because the C++ standard talks about execution being interrupted by a signal, but otherwise has no concept of execution being interrupted by another part of the program, and no concept of concurrent execution. That's why threads aren't just another library - adding threads fundamentally changes the C++ execution model. It requires the implementation to do things differently, as you'll eventually discover if for example you use threads under gcc and forget to specify -pthreads
.
The same could happen on a platform where aligned int
writes are atomic, but unaligned int
writes are permitted and not atomic. For example IIRC on x86, unaligned int
writes are not guaranteed atomic if they cross a cache line boundary. x86 compilers will not mis-align a declared int
variable, for this reason and others. But if you play games with structure packing you could probably provoke an example.
So: pretty much any implementation will give you the guarantees you need, but might do so in quite a complicated way.
In general, I've found that it is not worth trying to rely on platform-specific guarantees about memory access, that I don't fully understand, in order to avoid mutexes. Use a mutex, and if that's too slow use a high-quality lock-free structure (or implement a design for one) written by someone who really knows the architecture and compiler. It will probably be correct, and subject to correctness will probably outperform anything I invent myself.