I have the following situation (caused by a defect in the code):

There's a shared variable of primitive type (let it be int) that is initialized during program startup from strictly one thread to value N (let it be 0). Then (strictly after the variable is initialized) during the program runtime various threads are started and they in some random order either read that variable or overwrite it with the very same value N (0 in this example). There's no synchronization around accessing the variable.

Can this situation cause unexpected behavior in the program?
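To illustrate, here is a minimal sketch of the situation (C++0x std::thread syntax is used only for brevity; the names are made up for this example):

    #include <thread>
    #include <vector>

    int shared_value;                  // the shared primitive variable

    void reader() {
        int local = shared_value;      // unsynchronized read
        (void)local;
    }

    void writer() {
        shared_value = 0;              // unsynchronized write of the same value N = 0
    }

    int main() {
        shared_value = 0;              // set by exactly one thread before the others start

        std::vector<std::thread> threads;
        for (int i = 0; i < 10; ++i)
            threads.emplace_back(i % 2 == 0 ? reader : writer);
        for (std::thread& t : threads)
            t.join();
    }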

+2  A: 

No. Of course, you could end up with a data race if one of the threads later tries to change the value. You will also end up with a little cache contention, but I doubt this will have a noticeable effect.

Peter Ruderman
A: 

If no other thread (and this includes the main thread) can change the value from 0 to anything else (let's say 1) while those threads are running, then you will not have problems. But if any other thread had the potential to change the value during the start-up phase, you could have a problem. You are playing a dangerous game, and I would recommend locking before reading the value.
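As a rough sketch of what locking around the variable could look like (C++0x syntax; the names are purely illustrative):

    #include <mutex>

    int shared_value = 0;
    std::mutex shared_value_mutex;   // protects shared_value

    int read_shared() {
        std::lock_guard<std::mutex> lock(shared_value_mutex);
        return shared_value;         // read under the lock
    }

    void write_shared(int v) {
        std::lock_guard<std::mutex> lock(shared_value_mutex);
        shared_value = v;            // write under the lock
    }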

Scott Chamberlain
+1  A: 

You cannot really rely on it. For primitive types you should be fine, and if the operation is atomic (e.g., a correctly aligned int on most platforms) then writing and reading different values is safe (note that by this I mean something like "x = 5;", not "x += 5;", which is never atomic and is not thread safe).
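A rough sketch of the difference (the variable name is just for illustration; the comments describe what the compiler typically generates):

    int x = 0;

    void example() {
        // "x = 5;" is a single store; on platforms where an aligned int
        // store is atomic, another thread sees either the old or the new value.
        x = 5;

        // "x += 5;" is a read-modify-write, roughly equivalent to:
        //     int tmp = x;    // load
        //     tmp = tmp + 5;  // modify
        //     x = tmp;        // store
        // Another thread can write x between the load and the store,
        // and that write is then silently lost.
        x += 5;
    }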

For non-primitive types, even if it's the same value, all bets are off, since there may be a copy constructor (or assignment operator) that does something unsafe (like allocating memory).
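For instance, a sketch of a non-primitive type where assigning "the same value" still touches shared state (the type and its members are invented for illustration):

    #include <string>

    struct Config {
        std::string name;   // assigning this may free and allocate heap memory
        int retries;
    };

    Config shared_config = { "default", 3 };

    void overwrite_with_same_value() {
        Config same = { "default", 3 };
        // Even though the value is identical, copying the std::string may
        // deallocate the old buffer and allocate a new one; two threads
        // doing this concurrently without a lock is a data race.
        shared_config = same;
    }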

Fire Lancer
+1  A: 

Yes, it is possible for unexpected behavior to happen in this scenario. Consider the case where the initial value of the variable was not 0. It is possible for one thread to start setting it to 0 and for another thread to see the variable with only some of the bytes set.

For types like int this is very unlikely, as most processors have atomic assignment of word-sized values. However, once you hit 8-byte numeric values (long on some platforms) or large structs, this begins to be an issue.
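A sketch of the torn-write case with a value wider than the machine word (names invented; this assumes a 32-bit platform where the 64-bit store is done as two 32-bit stores):

    #include <stdint.h>

    uint64_t shared = 0xFFFFFFFFFFFFFFFFULL;   // initial value is not 0

    void writer() {
        // On a 32-bit platform this may compile to two separate 32-bit stores.
        shared = 0;
    }

    void reader() {
        // A read that lands between the two stores can observe a torn value
        // such as 0x00000000FFFFFFFF or 0xFFFFFFFF00000000.
        uint64_t snapshot = shared;
        (void)snapshot;
    }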

JaredPar
He guarantees that the value is initialized by exactly one thread before anyone tries to read or overwrite it.
Peter Ruderman
It was not very clear from my question, so I updated it. There is no chance that the variable is not reliably set to that value *before* the threads start their concurrent work. In the situation you describe, it is pretty clear there's a risk of a race.
sharptooth
+3  A: 

Since C++ does not currently have a standard concurrency model, it would depend entirely on your threading implementation and whatever guarantees it gives. It is all but certainly unsafe in the general case, however, because of the potential for torn reads. There might be specific cases where it would "work" or at least "appear to work."

In C++0x (which does have a standard concurrency model), your scenario would formally result in undefined behavior. There is a long, detailed, hard-to-read specification of the concurrency model in the C++0x Final Committee Draft §1.10, but it basically boils down to this:

Two expression evaluations conflict if one of them modifies a memory location and the other one accesses or modifies the same memory location (§1.10/3).

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior (§1.10/14).

Your expression evaluations clearly conflict because they modify and read the same memory location, and since the object is not atomic and access is not synchronized using a lock, you have undefined behavior.
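Under the C++0x model, the scenario becomes well-defined if the object is made atomic (or if access is synchronized with a lock); a minimal sketch:

    #include <atomic>

    std::atomic<int> shared_value(0);   // initialized before the other threads start

    void reader_thread() {
        int v = shared_value.load();    // atomic read; participates in no data race
        (void)v;
    }

    void writer_thread() {
        shared_value.store(0);          // atomic write of the same value
    }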

James McNellis
Except that there's no potential for a torn read because each thread is writing the exact same value. It's effectively a no-op.
Peter Ruderman
@Peter: There is no guarantee as to _how_ writes are performed. As paxdiablo rightly explains in his answer, writing to a memory location could involve changing its value, so long as after the write is completed it has the right value.
James McNellis
I suppose that's true, but it seems kind of academic to me. Are you aware of any systems that behave this way?
Peter Ruderman
Yes, flash. Writes are commonly remapped. So if you have a value 17 and overwrite it with value 17, the flash controller will change the mapping and write the 17 to a previously zeroed block. A racing thread may see the zeroed block in between.
MSalters
+4  A: 

It's incredibly unlikely but not impossible according to the standard.

There's nothing stating what the underlying representation of an integer is, nor does the standard specify how the values are loaded.

I can envisage, however weird, an implementation where the underlying bit pattern for 0 is 10101010 and the architecture only supports loading data into memory by bit-shifting it over eight cycles but reading it as a single unit in one cycle.

If another thread reads the value while the bit pattern is being shifted in (e.g., 00000001, 00000010, 00000101, and so on), you will have a problem.

The chances of anyone designing such a bizarre architecture are so close to zero as to be negligible. But, unfortunately, they're not zero. All I'm trying to get across is that you shouldn't rely on assumptions at all when it comes to standards compliance.

And please, before you vote me down, feel free to quote the part of the standard that states this is not possible :-)

paxdiablo
Many people assume that writing to memory something that's already there will have no effect. For "normal" RAM, that's pretty much true, but some types of byte-upgradable flash cannot be read for a certain amount of time after they're written. I don't think I've seen a memory which had both slow writes and unlimited endurance, but it's conceivable. More notably, writing to a location which is being read in another thread may cause caching conflicts. These would not normally cause errant execution, but could cause delays.
supercat
It's very hard to imagine a computer system that would allow one processor to read a weird, intermediate value from a memory location while another processor was in the process of updating it. All systems of which I'm aware (which is admittedly not very many) perform basic synchronization at the hardware level. If one processor writes to a location while another reads from it, the second processor will see the value either before or after the update, but not something in between.
Peter Ruderman
@Peter, that's not always so. GCC has to provide special intrinsics for atomic access specifically because it's not safe to assume that. I also seem to recall that x86 has a LOCK prefix to synchronize multiple CPUs, but it's not done at the hardware level automatically. But all that is irrelevant. What is common practice and what is possible according to the standard do not always match.
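For example, the kind of intrinsics I mean (GCC's __sync builtins; the variable name is illustrative):

    int shared_value = 0;

    void atomic_increment() {
        // Atomically adds 1 to shared_value; on x86 GCC emits a
        // LOCK-prefixed instruction rather than a plain read-modify-write.
        __sync_fetch_and_add(&shared_value, 1);

        // Full memory barrier: stops the compiler and the CPU from
        // reordering memory accesses across this point.
        __sync_synchronize();
    }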
paxdiablo
I agree with your point about the standard. But I think it's worth remembering that, regardless of what the standard says, the program must run on physical hardware, and the hardware will determine how this will work. I'd also point out that the interlocked intrinsics _all_ handle combined read/write operations. If special instructions are needed for simple atomic reads and atomic writes, then I have to wonder why the corresponding intrinsics do not exist.
Peter Ruderman