What is the difference between memory consistency errors and thread interference? Does the use of synchronization to avoid them differ between the two? Please illustrate with an example. I couldn't get this from the Sun Java tutorial. Any recommendation of reading material to understand this purely in the context of Java would be helpful.

A: 

Memory consistency problems normally manifest as broken happens-before relationships.

Time A: Thread 1 sets int i = 1
Time B: Thread 2 sets i = 2
Time C: Thread 1 reads i, but still sees a value of 1, because of any number of reasons that it did not get the most recent stored value in memory.

You prevent this from happening either by using the volatile keyword on the variable, or by using the AtomicX classes from the java.util.concurrent.atomic package. Either of these measures makes sure that no second thread will see a partially modified value, and that no thread will ever see a value that isn't the most recent one written to memory.

(Synchronizing the getter and setter would also fix the problem, but may look strange to other programmers who don't know why you did it, and can also break down in the face of things like binding frameworks and persistence frameworks that use reflection.)
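A minimal sketch of the two fixes mentioned above, in case it helps to see them side by side. The class and method names (VisibleValue, awaitChange, demo) are made up for illustration and are not from the tutorial. The reader thread spins until it sees the write from the other thread; the volatile keyword is what guarantees that it eventually does.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class; the names are illustrative and not taken from the tutorial.
class VisibleValue {
    // volatile: a write by one thread is guaranteed to become visible to later reads by others.
    private volatile int i = 1;
    // AtomicInteger offers the same visibility plus atomic read-modify-write operations.
    private final AtomicInteger atomicI = new AtomicInteger(1);

    void set(int value) {
        i = value;
        atomicI.set(value);
    }

    int get() { return i; }
    int getAtomic() { return atomicI.get(); }

    // Busy-waits until a write from another thread changes i (acceptable for a demo only).
    int awaitChange(int oldValue) {
        int current;
        while ((current = i) == oldValue) {
            // Without volatile, the JIT could legally hoist the read and spin forever.
        }
        return current;
    }

    // Starts a reader thread, writes 2 from the calling thread, returns what the reader saw.
    static int demo() {
        VisibleValue shared = new VisibleValue();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = shared.awaitChange(1));
        reader.start();
        shared.set(2);
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the field needs compound updates (increment, compare-and-set), the AtomicInteger is probably the better fit; for a plain "latest value" field, volatile is usually enough.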

--

Thread interference is when two threads munge up an object and see inconsistent states.

We have a PurchaseOrder object with an itemQuantity and an itemPrice; automatic logic generates the invoice total.

Time 0: Thread 1 sets itemQuantity 50
Time 1: Thread 2 sets itemQuantity 100
Time 2: Thread 1 sets itemPrice 2.50, invoice total is calculated at $250
Time 3: Thread 2 sets itemPrice 3, invoice total is calculated at $300

Thread 1 performed an incorrect calculation (it expected 50 x 2.50 = $125 but got $250) because some other thread was messing with the object in between its operations.

You address this issue either by using the synchronized keyword, to make sure only one thread can perform the entire process at a time, or alternatively with a lock from the java.util.concurrent.locks package. Using java.util.concurrent is generally the preferred approach for new programs.
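Here is a rough sketch of the synchronized version. The update method and the field types are assumptions made for the example; the point is only that quantity, price, and total change together under one lock, so no other thread can slip in between the steps.

```java
import java.math.BigDecimal;

// Hypothetical PurchaseOrder; the update(...) method is an assumption for this sketch.
class PurchaseOrder {
    private int itemQuantity;
    private BigDecimal itemPrice = BigDecimal.ZERO;
    private BigDecimal invoiceTotal = BigDecimal.ZERO;

    // synchronized: only one thread at a time can run this on a given PurchaseOrder,
    // so quantity, price, and total are always updated as a unit.
    synchronized BigDecimal update(int quantity, BigDecimal price) {
        itemQuantity = quantity;
        itemPrice = price;
        invoiceTotal = price.multiply(BigDecimal.valueOf(quantity));
        return invoiceTotal;
    }

    // Readers take the same lock so they never observe a half-finished update.
    synchronized BigDecimal getInvoiceTotal() {
        return invoiceTotal;
    }

    public static void main(String[] args) {
        PurchaseOrder order = new PurchaseOrder();
        System.out.println(order.update(50, new BigDecimal("2.50")));
    }
}
```

The same shape works with a ReentrantLock from java.util.concurrent.locks: call lock() before the three assignments and unlock() in a finally block.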

Affe
Your first example doesn't make sense because there is no guarantee of a deterministic single clock relating operations on multiple CPUs. Also, virtually all CPUs in common use guarantee that happens-before isn't violated with respect to a single memory address--you need multiple memory addresses to illustrate the concept effectively. Once you get rid of the broken single-clock idea, you're left with viewing the operations from the perspective of one thread at a time, at which point your example boils down to simple thread interference.
blucz
While everything you've written is correct, I don't think it helps OP with his goal of learning to write a safe java program. The Java Virtual Machine has its own internal memory model that isolates the program from the nuances of the underlying platform. The example I gave is the problem that he will actually run into in an improperly written java program, although as you say not strictly fitting a computer engineering definition of a memory consistency issue.
Affe
I suggest you check your understanding of the JVM specification. The JVM does not hide the "nuances" that I was using as examples. The point I'm making is that both of your examples are thread interference. MCEs *must* involve the underlying platform, by definition.
blucz
For whatever arguing on the internet in a week-old thread is worth: that example is the Sun/Oracle tutorial's example of a memory consistency problem that the OP referred to in his question. So I tried to supplement it with some more information to make the difference between the two examples clearer to him. If the original article is wrong, I guess we should let Oracle know! :)
Affe
+3  A: 

Memory consistency errors can't be understood purely in the context of Java. The details of shared memory behavior on multi-CPU systems are highly architecture-specific. To make it worse, x86 (where most people coding today learned to code) has pretty programmer-friendly semantics compared to architectures that were designed for multi-processor machines from the beginning (like POWER and SPARC), so most people really aren't used to thinking about memory access semantics.

I'll give a common example of where memory consistency errors can get you into trouble. Assume for this example that the initial value of x is 3. Nearly all architectures guarantee that if one CPU executes the code:

STORE 4 -> x     // x is a memory address
STORE 5 -> x 

and another CPU executes

LOAD x
LOAD x

then the second CPU will see one of 3,3, 3,4, 3,5, 4,4, 4,5, or 5,5 from the perspective of its two LOAD instructions; it will never see a newer value followed by an older one. Basically, CPUs guarantee that the order of writes to a single memory location is maintained from the perspective of all CPUs, even if the exact time that each of the writes becomes known to other CPUs is allowed to vary.

Where CPUs differ from one another tends to be in the guarantees they make about LOAD and STORE operations involving different memory addresses. Assume for this example that the initial values of both x and y are 4. One CPU executes:

STORE 5 -> x   // x is a memory address
STORE 5 -> y // y is a different memory address

then another CPU executes

LOAD x
LOAD y

In this example, on some architectures, the second thread can see 4,4, 5,5, 5,4, OR 4,5. The last one means it saw the store to y without seeing the earlier store to x. Ouch!
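In Java terms, a rough sketch of how the x/y case is usually tamed: make the second field volatile and have the reader load it first. The class and method names here are made up for the example. Note the read order is the reverse of the LOAD x, LOAD y sequence above; that is deliberate, because the guarantee only covers reads that come after the volatile read.

```java
// Hypothetical class for illustration; x and y mirror the two memory addresses above.
class Publication {
    private int x = 4;            // plain field
    private volatile int y = 4;   // volatile field used to publish x

    void writer() {
        x = 5;  // ordinary store
        y = 5;  // volatile store: everything written before it is published with it
    }

    // Reads y first, then x, and returns {x, y}.
    int[] reader() {
        int seenY = y;  // volatile load
        int seenX = x;  // guaranteed to be 5 whenever seenY is 5
        return new int[] { seenX, seenY };
    }

    // Runs writer() on another thread and polls until the new y is visible.
    static int[] demo() {
        Publication p = new Publication();
        Thread t = new Thread(p::writer);
        t.start();
        int[] seen;
        do {
            seen = p.reader();
        } while (seen[1] != 5);
        return seen;
    }

    public static void main(String[] args) {
        int[] seen = demo();
        System.out.println(seen[0] + "," + seen[1]);
    }
}
```

If the reader loaded x before y, seeing the old x with the new y would still be allowed, since nothing orders that earlier plain read after the writer's stores.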

Most architectures deal with memory at the granularity of a 32-bit or 64-bit word. This means that on a 32-bit POWER/SPARC machine, you can't update a 64-bit integer memory location from one thread and safely read it from another without explicit synchronization: the reader may see half of the old value and half of the new one. Goofy, huh?

Thread interference is much simpler. The basic idea is that Java doesn't guarantee that a single statement of Java code executes atomically. For example, incrementing a value requires reading the value, incrementing it, then storing it again. So if x starts at 1 and two threads each execute x++, x can end up as 2 or 3 depending on how the lower-level code interleaved (the lower-level abstract code at work here presumably looks like LOAD x, INCREMENT, STORE x). In other words, Java code is broken down into smaller atomic pieces, and you don't get to make assumptions about how they interleave unless you use synchronization primitives explicitly.
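To make the x++ case concrete, here is a small sketch that runs the same loop against a plain int and an AtomicInteger. IncrementDemo and run are names invented for the example. The plain counter may or may not lose updates on any given run, so the code makes no claim about its exact value; only the atomic counter is guaranteed to reach the full total.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative only.
class IncrementDemo {
    static int unsafeCount = 0;                                  // x++ here is load, add, store
    static final AtomicInteger safeCount = new AtomicInteger();  // incrementAndGet is one atomic step

    // Two threads each increment both counters perThread times.
    // Returns {unsafeCount, safeCount} after both threads have finished.
    static int[] run(int perThread) {
        unsafeCount = 0;
        safeCount.set(0);
        Runnable work = () -> {
            for (int n = 0; n < perThread; n++) {
                unsafeCount++;                // updates can be lost when the threads interleave
                safeCount.incrementAndGet();  // never loses an update
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { unsafeCount, safeCount.get() };
    }

    public static void main(String[] args) {
        int[] result = run(100_000);
        System.out.println("plain int: " + result[0] + ", AtomicInteger: " + result[1]);
    }
}
```

Wrapping the increment in a synchronized block would work just as well as the AtomicInteger; the atomic class is simply the lighter-weight choice for a single counter.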

For more information, check out this paper. It's long and dry and written by a notorious asshole, but hey, it's pretty good too. Also check out this (or just google for "double checked locking is broken"). These memory reordering issues reared their ugly heads for many C++/Java programmers who tried to get a little bit too clever with their singleton initializations a few years ago.
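For reference, a sketch of the double-checked locking pattern in the form that is generally considered safe on Java 5 and later, where the instance field is volatile. LazySingleton is a made-up name, and this is not taken from the linked material; without the volatile, another thread could see a non-null reference to an object whose constructor writes are not yet visible.

```java
// Hypothetical singleton for illustration; relies on the Java 5+ memory model.
class LazySingleton {
    // volatile is the essential part: it publishes the fully constructed object.
    private static volatile LazySingleton instance;

    private LazySingleton() { }

    static LazySingleton getInstance() {
        LazySingleton local = instance;          // first check, no lock taken
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;                // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;            // volatile write publishes the object
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());
    }
}
```

In practice a static holder class or an enum is often simpler and avoids the explicit locking altogether.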

blucz
+1  A: 

The article to read on this is "Memory Models: A Case for Rethinking Parallel Languages and Hardware" by Adve and Boehm in the August 2010 (vol. 53, no. 8) issue of Communications of the ACM. This is available online for Association for Computing Machinery members (http://www.acm.org). It deals with the problem in general and also discusses the Java Memory Model.

For more information on the Java Memory Model, see http://www.cs.umd.edu/~pugh/java/memoryModel/

Mark Lutton