The Java language spec defines semantics of final fields in section 17.5:

The usage model for final fields is a simple one. Set the final fields for an object in that object's constructor. Do not write a reference to the object being constructed in a place where another thread can see it before the object's constructor is finished. If this is followed, then when the object is seen by another thread, that thread will always see the correctly constructed version of that object's final fields. It will also see versions of any object or array referenced by those final fields that are at least as up-to-date as the final fields are.

My question is: does the 'up-to-date' guarantee extend to the contents of nested arrays and nested objects?

In a nutshell: If one thread assigns a mutable object graph to a final field in an object, and the object graph is never updated, can all threads safely read that object graph via the final field?

An example scenario:

  1. Thread A constructs a HashMap of ArrayLists, then assigns the HashMap to final field 'myFinal' in an instance of class 'MyClass'
  2. Thread B sees a (non-synchronized) reference to the MyClass instance and reads 'myFinal', and accesses and reads the contents of one of the ArrayLists

In this scenario, are the members of the ArrayList as seen by Thread B guaranteed to be at least as up to date as they were when MyClass's constructor completed?
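
A minimal sketch of the scenario, for concreteness. The `Scenario` class, the `shared` field, and the `writer()`/`reader()` methods are hypothetical names introduced only for illustration. The `shared` field is deliberately neither volatile nor guarded by a lock.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class matching the scenario: a mutable graph behind a final field.
final class MyClass {
    final Map<String, List<Integer>> myFinal;

    MyClass() {
        Map<String, List<Integer>> map = new HashMap<String, List<Integer>>();
        List<Integer> values = new ArrayList<Integer>();
        values.add(1);
        values.add(2);
        map.put("key", values);
        myFinal = map; // the graph is complete before the final field is written
    }
}

final class Scenario {
    // Plain static field: publication through it is unsynchronized.
    static MyClass shared;

    // Run by Thread A.
    static void writer() {
        shared = new MyClass();
    }

    // Run by Thread B. Returns -1 if the reference is not visible yet.
    static int reader() {
        MyClass local = shared;
        if (local == null) {
            return -1;
        }
        return local.myFinal.get("key").size();
    }
}
```

In these terms, the question is whether `reader()` can ever return anything other than -1 or 2.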

I'm looking for clarification of the semantics of the Java Memory Model and language spec, rather than alternative solutions like synchronization. My dream answer would be a yes or no, with a reference to the relevant text.

Updates:

  • I'm interested in the semantics of Java 1.5 and above, i.e. with the updated Java Memory Model introduced via JSR 133. The 'up-to-date' guarantee on final fields was introduced in this update.
+2  A: 

In this scenario, are the members of the ArrayList as seen by Thread B guaranteed to be at least as up to date as they were when MyClass's constructor completed?

Yes, they are.

A thread is required to read from memory when it encounters a reference for the first time. Because the hash map is newly constructed, all of its entries are brand new, so the references to those objects are as up to date as they were when the constructor finished.

After that initial encounter, the usual visibility rules apply. So if another thread later changes a non-final field of an object reached through the final reference, a reading thread may not see that change, but it will still see the reference that came out of the constructor.

In practice, this means that if you do not modify the final hash map after the constructor, its contents are constant for all threads.

EDIT

I knew I had seen this guarantee somewhere before.

Here is a paragraph of interest from this article that describes JSR 133:

Initialization safety

The new JMM also seeks to provide a new guarantee of initialization safety -- that as long as an object is properly constructed (meaning that a reference to the object is not published before the constructor has completed), then all threads will see the values for its final fields that were set in its constructor, regardless of whether or not synchronization is used to pass the reference from one thread to another. Further, any variables that can be reached through a final field of a properly constructed object, such as fields of an object referenced by a final field, are also guaranteed to be visible to other threads as well. This means that if a final field contains a reference to, say, a LinkedList, in addition to the correct value of the reference being visible to other threads, also the contents of that LinkedList at construction time would be visible to other threads without synchronization. The result is a significant strengthening of the meaning of final -- that final fields can be safely accessed without synchronization, and that compilers can assume that final fields will not change and can therefore optimize away multiple fetches.

Alexander Pogrebnyak
The things I'm worried about are stale references to objects in the ArrayLists, and compiler reorderings, e.g. making an ArrayList instance available in the HashMap before the ArrayList is fully initialized. I'm keen to know what ordering guarantees are offered when assigning a mutable object graph to a final field.
mattbh
@mattbh. If the mutable object existed before the constructor call, then the usual visibility constraints apply, but the final reference seen by all threads will point to the same object. In the absence of synchronization, though, each thread may see that object's state a little differently.
Alexander Pogrebnyak
@Alexander. Thanks for your answer. Is there anything in the specs which prevents the compiler from making the references to those inner objects available before their (non-final) fields are initialised? The compiler is free to do this normally for non-final fields. The FinalFieldExample class in section 17.5 of the Java spec illustrates how this reordering is allowed to happen.
mattbh
@mattbh. If you don't shoot yourself in the foot by sharing half-constructed objects with other threads (say, by passing `this` out of a constructor or initializer), then nobody but your calling thread will see the updates to non-final fields. Within that same thread, it sees the most `up-to-date` value of the field. The `FinalFieldExample` shows what troubles may beset you after you exit the constructor and start calling unsynchronized methods that change the state of the non-final fields. If, on the other hand, you never change them after the constructor, they will remain constant.
Alexander Pogrebnyak
@Alexander: The FinalFieldExample is showing something more insidious than that -- a case where fields are initialised once in the constructor, then left unchanged, but where another thread sees the reference to the object, and sees values in that object as they were before the constructor completed. That's what I'm worried about here: references to objects being made visible before the post-constructor state is made visible.
mattbh
@mattbh: No thread can see the object before the constructor is finished, unless you take `special` care to share it by calling a method with a half-constructed `this`. `FinalFieldExample` shows evil deeds done `AFTER` the constructor is finished. Read the paragraph at the end more closely.
Alexander Pogrebnyak
@Alexander: "No thread can see the object before the constructor is finished" Yes it can for non-final fields, that's precisely what 17.5 is about -- that we can use 'final' fields to prevent this reordering.
mattbh
@Alexander: There are no mutations to the FinalFieldExample's instance variables after the ctor -- the only mutation is the write to static field 'f'. The example shows that a thread (calling reader()) can see an instance of FinalFieldExample (read via static field 'f'), and see its non-final field 'y' with values from before the ctor finished (in this case with the default value 0, before the assignment to value 4 in the ctor). It's a really counter-intuitive example!
mattbh
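
For readers without the spec to hand, the class being debated in these comments is, as best I can reproduce it from section 17.5, essentially the following. Check the spec itself for the authoritative listing. The two trailing comments in `reader()` are the spec's own claims.

```java
class FinalFieldExample {
    final int x;
    int y;
    static FinalFieldExample f;

    public FinalFieldExample() {
        x = 3;
        y = 4;
    }

    // Called by one thread: publishes the instance through a plain static field.
    static void writer() {
        f = new FinalFieldExample();
    }

    // Called by another thread, with no synchronization between the two.
    static void reader() {
        if (f != null) {
            int i = f.x; // guaranteed to see 3
            int j = f.y; // could see 0
        }
    }
}
```

Nothing mutates the instance after the constructor. The only later write is to the static field `f`, which is why the possible 0 for `y` is so counter-intuitive.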
@mattbh: See the new edit in my posting
Alexander Pogrebnyak
@Alexander: Thanks--that's exactly the sort of reference I was hoping for!
mattbh
+1  A: 

If the constructor is written like this, you should have no issue:

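import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MyClass {
    public final Map<String, List<Object>> myFinal;
    public MyClass() {
        // Build the whole graph through a local variable first.
        Map<String, List<Object>> localMap = new HashMap<String, List<Object>>();
        localMap.put("key", new ArrayList<Object>());
        // Assign to the final field only once the graph is complete.
        this.myFinal = localMap;
    }
}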

This is because the map is fully initialized before it's assigned to the public reference. Once the constructor completes, the final Map will be up-to-date.

eqbridges
What if the ArrayList is populated with arbitrary mutable values? Are their contents guaranteed to be visible (and up to date) from other threads?
mattbh
yes, anything that's done before the assignment of the final variable, is visible to other threads.
irreputable
@mattbh: if the members of the arraylist are initialized before the map is assigned to this.myFinal, then they will be up-to-date and not visible prior to the assignment (in my example above). However, if elements are assigned to the list after the constructor is called then all bets are off and the answer is "it depends."
eqbridges
@irreputable: not quite -- in my example above there's a case where localMap is not visible to other threads prior to the assignment. please clarify. thanks!
eqbridges
@eqbridges Thanks, do you have a reference in the Java spec or similar that I could look at which confirms the 'yes' answer? Section 17.5.1 seems to define things formally, but I can't quite parse the parts about the dereference chain and memory chain.
mattbh
@mattbh: unfortunately i do not have a reference; however i infer this from java's guarantee of atomic assignment of references. so if you do all assignments before the final variable becomes visible, i assume you will be okay. unfortunately, i don't think that's as airtight as you'd like though....
eqbridges
@mattbh: Actually in this example you can initialize `myFinal` on the very first line of the constructor and populate it directly. What's important is that the new object of MyClass ( and object graph that it contains ) will not become visible to other threads until the constructor exits. After the constructor other threads will have to climb the memory barrier because they will see the references for the first time as specified in section 17.5. In other words, if you don't modify non-final fields after the constructor all threads see the same value until the object is GCed.
Alexander Pogrebnyak
@alexander: Java's atomic assignment of object references can give a false sense of security because the compiler has so much freedom to reorder unless there's explicit synchronization. The FinalFieldExample in section 17.5 shows the pathological case, where one thread sees a reference to an object in a partially initialised state (because of non-final fields). When you say "other threads will have to climb the memory barrier", is a memory barrier actually established between the threads with a "happens-before" relationship, as it is for volatile/synchronized? It's not clear in the spec.
mattbh
@alexander, @eqbridges: The part about atomic assignment above was in reply to eqbridges (can't seem to edit the comment)
mattbh
@mattbh: please note a critical difference between my example and FinalFieldExample: in my example all initialization of the List is done to local variables in the constructor. these values are not visible until both the assignment to the final field is done and the constructor exits. Therefore any reordering done by the compiler will not change what is visible. In FFE, you're dealing with a different scenario: two instance variables, one of them final, are initialized. Both are visible outside the class, one prior to constructor completion. (cont'd in next comment)
eqbridges
(cont'd from previous comment) My point was simply that if you fully initialize the collection as local variables before assigning it to myFinal, then visibility (via myFinal) will be a matter of simply assigning the reference of the local Map to the instance variable Map. Since assignment of references is atomic, and because the Map was fully initialized (while still a local reference), the reference visible via myFinal will be up-to-date.
eqbridges
@mattbh: Incidentally, the class FinalFieldExample is a good example of bad practice: exposing a static field shared across multiple methods on a class with no means of synchronizing the access. This is why the issue arises where the value of `f.y` is indeterminate. Usually, when you must expose a static member in the way illustrated in FinalFieldExample, there will be significant coding around ensuring that access to the static member is ordered.
eqbridges
@eqbridges My concern would be around all the mutable state hidden within the implementations of HashMap and ArrayList. While another thread is guaranteed to see the correct reference to the HashMap, it's not clear in the spec whether the instance variables within HashMap will be up to date when they're read from another thread, since the JVM is free to reorder assignments, except when constrained explicitly via final/volatile/synchronized etc.
mattbh
@mattbh: because the map was fully initialized as local variables before assigning to the final variable, all elements of these collections will be fully initialized upon exit of the constructor. If not they would be GC'd upon CTOR exit.
eqbridges
@eqbridges: I think that the formal definition given in 17.5.1 ("Given a write w...") is saying that examples like yours are safe, because the thread constructs the full object graph then writes it to a final field. It's all defined in terms of 2 partial orders (dereferences and memory-chain) which are tricky to understand. This 3-page PDF helps: http://bit.ly/cG5SIa . What isn't clear to me is what happens if mutable state is passed into the constructor, and used as an entry in the map.
mattbh
@mattbh: if you pass mutable state into a constructor there is no way to guarantee its consistency in a multithreaded environment. make the data immutable.
eqbridges
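
One way to act on "make the data immutable" when the state is passed in from outside is to copy it inside the constructor. This is a rough sketch only: `SnapshotHolder` is a hypothetical name, and it assumes that no other thread mutates `source` while the constructor is copying it and that the list elements themselves are immutable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical holder that snapshots caller-supplied mutable state.
final class SnapshotHolder {
    final Map<String, List<String>> myFinal;

    SnapshotHolder(Map<String, List<String>> source) {
        Map<String, List<String>> copy = new HashMap<String, List<String>>();
        for (Map.Entry<String, List<String>> e : source.entrySet()) {
            // Copy each list so later changes by the caller do not leak in.
            List<String> listCopy = new ArrayList<String>(e.getValue());
            copy.put(e.getKey(), Collections.unmodifiableList(listCopy));
        }
        // Only objects created in this constructor are reachable from the final field.
        myFinal = Collections.unmodifiableMap(copy);
    }
}
```

The copy means the graph behind `myFinal` is built entirely within the constructor, which is the case the comments above treat as safe. It does nothing for mutable elements shared with the caller.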
@eqbridges: Agreed, preferring immutability is the way to go. Unfortunately I'm having to use lots of third-party mutable types--but it seems that using final fields can in fact provide some ordering guarantees if the values are accessed via final fields. All's well that ends well!
mattbh