views: 273
answers: 4

Hi,

I am aware that the purpose of volatile variables in Java is that writes to such variables are immediately visible to other threads. I am also aware that one of the effects of a synchronized block is to flush thread-local memory to global memory.

I have never fully understood the references to 'thread-local' memory in this context. I understand that data which only exists on the stack is thread-local, but when talking about objects on the heap my understanding becomes hazy.

I was hoping to get comments on the following points:

  1. When executing on a machine with multiple processors, does flushing thread-local memory simply refer to the flushing of the CPU cache into RAM?

  2. When executing on a uniprocessor machine, does this mean anything at all?

  3. If it is possible for the heap to have the same variable at two different memory locations (each accessed by a different thread), under what circumstances would this arise? What implications does this have to garbage collection? How aggressively do VMs do this kind of thing?

  4. (EDIT: adding question 4) What data is flushed when exiting a synchronized block? Is it everything that the thread has locally? Is it only writes that were made inside the synchronized block?

    Object x = goGetXFromHeap(); // x.f is 1 here    
    Object y = goGetYFromHeap(); // y.f is 11 here
    Object z = goGetZFromHeap(); // z.f is 111 here
    
    
    y.f = 12;
    
    
    synchronized(x)
    {
        x.f = 2;
        z.f = 112;
    }
    
    
    // will only x be flushed on exit of the block? 
    // will the update to y get flushed?
    // will the update to z get flushed?
    

Overall, I think I am trying to understand whether 'thread-local' means memory that is physically accessible by only one CPU, or whether the VM does some logical thread-local partitioning of the heap.

Any links to presentations or documentation would be immensely helpful. I have spent time researching this, and although I have found lots of nice literature, I haven't been able to satisfy my curiosity regarding the different situations & definitions of thread-local memory.

Thanks very much.

A: 

Whether the current contents of an unsynchronized object's memory are visible to another thread is really an implementation detail.

Certainly there are limits, in that not all memory is kept in duplicate and not all instructions are reordered, but the point is that the underlying JVM has the option to do these things if it finds them more efficient.

The thing is that the heap really is "properly" stored in main memory, but accessing main memory is slow compared to accessing the CPU's cache or keeping the value in a register inside the CPU. By requiring that the value be written out to memory (which is what synchronization does, at least when the lock is released), this forces the write to main memory. If the JVM is free to ignore that requirement, it can gain performance.
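The register-caching described above can be made concrete with a small sketch (the class and field names are mine, not from the answer): with the `volatile` keyword removed, the JIT is allowed to hoist the read of `running` out of the loop, and the worker may spin forever; with `volatile`, every iteration must re-read the flag, so the worker reliably sees the update.

```java
public class FlagVisibility {
    // Remove `volatile` and the worker's loop may never observe the write,
    // because the JIT may keep the flag in a register across iterations.
    private static volatile boolean running = true;

    // Returns true if the worker observed the flag change and exited.
    static boolean demo() throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait; a plain (non-volatile) read here could legally
                // be optimized into a single load performed before the loop
            }
        });
        worker.start();
        Thread.sleep(100);   // let the worker enter the loop
        running = false;     // volatile write: must become visible
        worker.join(2000);   // should terminate promptly
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo()); // prints true
    }
}
```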

On a single-CPU system, a thread's values could still be kept in a register even while another thread is executing, so there is still no guarantee that a value written without synchronization will be visible to another thread, although in practice it is more likely to be. Outside of mobile devices, of course, the single CPU is going the way of the floppy disk, so this is not going to be a very relevant consideration for long.

For more reading, I recommend Java Concurrency in Practice. It is really a great practical book on the subject.

Yishai
It was specified in the old Java Memory Model (JMM), but that is long gone.
Tom Hawtin - tackline
Thanks for the comments. Does this mean that synchronisation just refers to the flushing of the CPU cache to main memory? Or, are there scenarios where the same variable can exist in two different locations on the heap?
Jack Griffith
@Jack, No, it can also refer to instruction reordering (so things can be written to main memory, but in an order that would appear to be wrong), and of course locking. I can't imagine a JVM implementation actually making a copy of an object in shared memory in unsynchronized code, but I don't know of anything in the specification that would prevent it.
Yishai
+1  A: 

The flush you are talking about is known as a "memory barrier". It means that the CPU makes sure that what it sees of the RAM is also viewable from other CPU/cores. It implies two things:

  • The JIT compiler flushes the CPU registers. Normally, the code may keep a copy of some globally visible data (e.g. instance field contents) in CPU registers. Registers cannot be seen from other threads. Thus, half the work of synchronized is to make sure that no such cache is maintained.

  • The synchronized implementation also performs a memory barrier to make sure that all the changes to RAM from the current core are propagated to main RAM (or that at least all other cores are aware that this core has the latest values -- cache coherency protocols can be quite complex).

The second job is trivial on uniprocessor systems (I mean, systems with a single CPU which has a single core), but uniprocessor systems are becoming rarer nowadays.
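A small sketch of what the release barrier buys you (the class and names are mine, not from the answer): every write the writer thread performs before releasing a lock, even a write made outside the synchronized block, is visible to any thread that subsequently acquires the same lock. This also bears on question 4 in the post: the unlock publishes all of the thread's prior writes, not just those made inside the block.

```java
public class LockHandshake {
    private static final Object lock = new Object();
    private static int data = 0;        // plain field, deliberately NOT volatile

    static int handshake() throws InterruptedException {
        Thread writer = new Thread(() -> {
            data = 42;                  // plain write, made BEFORE...
            synchronized (lock) {
                // ...releasing the lock; the unlock is the barrier,
                // the (empty) body is irrelevant
            }
        });
        writer.start();
        writer.join();                  // (join() itself also guarantees visibility)
        synchronized (lock) {           // acquiring the same lock guarantees we
            return data;                // see every write before its last unlock
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(handshake()); // prints 42
    }
}
```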

As for thread-local heaps, this can theoretically be done, but it is usually not worth the effort because nothing says which parts of memory need to be flushed by a synchronized. This is a limitation of the threads-with-shared-memory model: all memory is supposed to be shared. At the first encountered synchronized, the JVM would then have to flush all of its "thread-local heap objects" to main RAM.

Yet recent JVMs from Sun can perform an "escape analysis", in which the JVM proves that some instances never become visible to other threads. This is typical of, for instance, the StringBuilder instances created by javac to handle string concatenation. If an instance is never passed as a parameter to other methods, then it never becomes "globally visible". This makes it eligible for thread-local heap allocation, or even, under the right circumstances, for stack-based allocation. Note that in this situation there is no duplication; the instance is not in "two places at the same time". It is only that the JVM can keep the instance in a private place which does not incur the cost of a memory barrier.
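A minimal sketch of the kind of code that qualifies (the method name is mine; whether the optimization actually fires depends on the JIT): the builder below is never stored in a field or passed out, so only the resulting String escapes, and escape analysis may keep the builder in a thread-private place or eliminate the heap allocation entirely.

```java
public class EscapeDemo {
    // The StringBuilder never escapes this method: it is not stored in a
    // field, not passed to another method, and not returned. A JIT with
    // escape analysis may stack-allocate it or scalar-replace it, so no
    // memory barrier ever needs to cover it.
    static String join(String a, String b) {
        StringBuilder sb = new StringBuilder();
        sb.append(a).append('-').append(b);
        return sb.toString();           // only the String becomes visible
    }

    public static void main(String[] args) {
        System.out.println(join("x", "y")); // prints x-y
    }
}
```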

Thomas Pornin
Thank you for the comments. I'm actually familiar with escape analysis, and it is what started my confusion with 'thread-local'. I would like to ask two follow-up questions please: 1. If the compiler has proven an object to be thread-local, and the object exists in a thread-local region of the heap, then why do writes to this object inside a synchronized block need to be flushed? The flush from CPU cache to the thread-local heap region would only ever be observable by the thread that made the write. Is this in case the thread switches processors and begins executing with a different CPU cache?
Jack Griffith
2. Is it possible for a JVM to have an object exist simultaneously in two separate memory locations on the heap? If so, under what circumstances would this arise?
Jack Griffith
1. A `synchronized` implies a flush of "everything". The `synchronized` has a parameter which is the instance on which the lock is taken, but the Java memory model mandates that the thread's whole view of memory is subject to the memory barrier. Now, if the JVM can prove that it need not flush an object because no other thread may see it (and "unescaped objects" are good candidates), then the JVM is free not to flush, under the "as if" rule (the JVM can do whatever it wishes as long as the result is not distinguishable from the Java abstract machine).
Thomas Pornin
2. Within the Java framework, an instance is an instance, not two, and all references to that instance must compare equal. If, _under the hood_, the JVM duplicates an object, then it must take care to make reference comparisons in such a way that the duplicates appear as if they were a single instance. Some GCs "move around" objects in RAM, which means that there are times when a given object exists in "two places". The default GC in Sun's JVM makes this transient; duplicates occur only during a "pause", with all threads stopped. Some other GC types may tolerate long-lived duplicates.
Thomas Pornin
A: 

It's not as simple as CPU-Cache-RAM. That's all wrapped up in the JVM and the JIT and they add their own behaviors.

Take a look at The "Double-Checked Locking is Broken" Declaration. It's a treatise on why double-checked locking doesn't work, but it also explains some of the nuances of Java's memory model.
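For reference, here is a sketch of the idiom that Declaration analyzes (the class is mine): pre-Java-5, the version without `volatile` is broken because the unsynchronized first read can observe a non-null reference to a not-yet-fully-constructed object. Under the Java 5+ memory model, declaring the field `volatile` repairs it.

```java
public class Singleton {
    // `volatile` is what makes this correct on Java 5+: the write to
    // `instance` cannot be reordered with the constructor's field writes.
    private static volatile Singleton instance;

    private Singleton() {}

    public static Singleton getInstance() {
        Singleton local = instance;            // first (unsynchronized) read
        if (local == null) {
            synchronized (Singleton.class) {
                local = instance;              // second (locked) check
                if (local == null) {
                    instance = local = new Singleton();
                }
            }
        }
        return local;
    }
}
```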

Devon_C_Miller
A: 

One excellent document for highlighting the kinds of problems involved, is the PDF from the JavaOne 2009 Technical Session

This Is Not Your Father's Von Neumann Machine: How Modern Architecture Impacts Your Java Apps

By Cliff Click, Azul Systems; Brian Goetz, Sun Microsystems, Inc.

Stephen Denne