views:

293

answers:

2

Suppose I have a static complex object that gets periodically updated by a pool of threads, and read more or less continually in a long-running thread. The object itself is always immutable and reflects the most recent state of something.

class Foo() { int a, b; }
static Foo theFoo;
void updateFoo(int newA, int newB) {
  f = new Foo();
  f.a = newA;
  f.b = newB;
  // HERE
  theFoo = f;
}
void readFoo() {
  Foo f = theFoo;
  // use f...
}

I do not care in the least whether my reader sees the old or the new Foo, however I need to see a fully initialized object. IIUC, The Java spec says that without a memory barrier in HERE, I may see an object with f.b initialized but f.a not yet committed to memory. My program is a real-world program that will sooner or later commit stuff to memory, so I don't need to actually commit the new value of theFoo to memory right away (though it wouldn't hurt).

What do you think is the most readable way to implement the memory barrier ? I am willing to pay a little performance price for the sake of readability if need be. I think I can just synchronize the assignment to Foo and that would work, but I'm not sure it's very obvious to someone reading the code why I do that. I could also synchronize the whole initialization of the new Foo, but that would introduce more locking that actually needed.

How would you write it so that it's as readable as possible ?
Bonus kudos for a Scala version :)

+17  A: 
andersoj
You are right, it can and it is actually done in a constructor in my actual code. I was not aware there were any specific publication rules for constructors, so I will search in that direction. Thanks for the tip :)
Jean
So I may be wrong about special treatment for constructors. I believe this special treatment *only* applies to `final` fields (which I think yours would be in this case, though you haven't made that explicit). See http://www.cs.umd.edu/~pugh/java/memoryModel/jsr133.pdf, all the references to constructors are in the context of initializing `final` fields.
andersoj
Whoa. Thanks a lot for the lot of research and the detailed and easy to understand explanation. Now I know how I should write my code. Thank you very much.
Jean
Just for completeness, I'll add that if you're trying to maximise readability, I'd forget all of this, and just synchronize the two methods.
Burleigh Bear
I dont get it. Why dont you just declare theFoo volatile. You can then get rid of synchronized. You will also save time because a volatile write is about 1/3 the cost of a monitor enter/exit. I see it in your answer but I would argue it would be more readable.
John V.
@John V: My answer includes the "just use volatile" answer, but I think many would disagree that it is more readable. In my experience, `volatile` is a much-misunderstood corner of Java, requires careful analysis to ensure it's correct, and may easily get lost as the class is evolved over time. I tried to give the answer to the question, but also note that `synchronized` will be understood by more folks, is durable over a larger range of uses, and is *probably* preferred unless there is a well-understood performance driver.
andersoj
@John V: JCiP discourages use of volatile except when they "simplify implement[ation] and verif[ication]", and points the implementer towards Atomic variables, which can "often be used as better volatile variables." (JCiP 3.1.1) While I don't normally appeal to authority, JCiP has a sterling reputation, and I tend to defer unless I have concrete evidence to the contrary.
andersoj
Also note that `volatile` only works reliably in the specific case of immutable objects. While I was pushing the questioner in this direction, I'm not sure we can make that assumption... if the `theFoo` is actually mutable or becomes so in the future, `volatile` will not guarantee ordering, consistency, or visibility of future writes to its fields.
andersoj
@andersoj I agree that it would be better to use an AtomicReference. However the volatile keyword since 1.5 ensures a happens-before ordering. It will guarantee ordering, consistency and visibility just as much as an AtomicReference would. I was arguing that in your example, keeping all other code the same, declaring the field volatile would in my opinion be a better implementation.
John V.
@andersoj For readability you can make that argument. It is more obvious that the variable theFoo is guarded against whatever object its synchronizing on. But I guess if you dont have acceptable documentation on the field itself then it too can be lost in the future. One of the biggest points in JCiP is to document thread safety.
John V.
Accidentally deleted this comment - @andersoj Last point, I missed one of the last things your wrote. Today, most processors offer volatile reads to be close to a normal read. If we are going for pure throughput, and there are a lot of reads, the difference is pretty substantial.
John V.
@John V: That's good information... I wonder if there's a source that quantifies that somehow... (off to google scholar...)
andersoj
@andersoj your argument that volatile only works correctly with immutable objects is not correct (it was under the old JMM but under the latest JMM it isn't - see 17.4.7) – all writes that are before the volatile-write happen-before the volatile read. As I said elsewhere final is pretty much always preferable anyway though for its additional guarantees.
Jed Wesley-Smith
@Jed Wesley-Smith agreed, I think we crossed paths as I was responding... can you read the new text and see if I've got it right now?
andersoj
JMM: http://java.sun.com/docs/books/jls/third_edition/html/memory.html
Jed Wesley-Smith
Y, particularly 17.4.4
andersoj
@andersoj all looks good now, great answer.
Jed Wesley-Smith
+3  A: 

Having an immutable Foo with final a and b fields solves the visibility issues with the default values, but so does making theFoo volatile.

Personally I like having immutable value classes anyway as they much harder to misuse.

Jed Wesley-Smith
@Jed Wesley-Smith: See my answer--I believe in this case you have to do both to get the desired guarantees.
andersoj
@andersoj a write to a volatile guarantees a happens-before relationship (the Foo is safely published). This means that everything that happened-before the write to theFoo will be seen if the new foo value is available. As I said, the immutable Foo class is preferable as it guarantees nobody will be able to change the state or accidentally re-order the code and change the volatile write semantics, but volatile is enough on its own.
Jed Wesley-Smith
@andersj also, he specifically states that the requirement is for a properly constructed Foo, not necessarily the latest one, so volatile is not necessary for the latest object guarantee (visibility of any particular write), therefore an immutable Foo with a static initializer to create a non-null initial theFoo will guarantee a non-null theFoo at all times.
Jed Wesley-Smith
@Jed Wesley-Smith: So I think you've convinced me, though I have to confess it's really hard to dig this out of the available documentation... I'm writing up another change to my answer to capture what I think I've learned...
andersoj
@Jed yes re: the "volatile is not necessary" except that without it, the latency to see updates from other threads is completely unbounded. While the questioner doesn't care exactly which update is seen, he does care that updates are eventually seen. My understanding is that, without volatile or any other edge, the reader thread is free to cache the value effectively forever, yes?
andersoj
You might want to comment here: http://stackoverflow.com/questions/3488703/when-exactly-do-you-use-the-volatile-keyword-in-java/3488824#3488824
andersoj
Is immediate commit to memory necessary ? Yes, and no. @andersoj, you are perfectly right in stating that in absence of any other edge, the other threads are never guaranteed to see any change. On the other hand, as I specified in the original question, all my threads do encounter memory barriers fairly commonly, and they will shortly commit to memory *anyway*, always fast enough for my needs. So for my functionality, it is not *needed*, though the code in the question doesn't make it obvious. However... it's more readable and less error-prone to make it volatile anyway. So I'm doing it.
Jean
And then again, while I'm not very fond of the syntax of AtomicReference, it has two very definite advantages over a simply volatile field : 1. it makes it unmistakably obvious this is a lock-free shared object. `AtomicReference` is pretty explicit in that I'm building, well, an atomic reference. 2. It forces client code to use get() and anyone sane will cache the value in a local variable. I can imagine client code accessing theFoo.a and theFoo.b separately, and getting values that are both correct, but don't agree with each other. I don't see theFoo.get().a and theFoo.get().b used as much.
Jean
@Jean: Note difference between your thread committing to memory and other threads flushing their cache and reading from memory. Both have to occur -- and in the case of the read-side, it is dangerous to rely on this occurring unless those threads have some other interaction with this reference/object... your situation is very close to the canonical "how do I stop a thread?" question, e.g. here http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile
andersoj
Ah, yes you're right, I had not thought of that. Happens, my reader threads also have memory barriers. But it's nice to point it out. Anyway, no problem with the `AtomicReference` or with the volatile solution.
Jean