views:

420

answers:

5

Hi,

I've got some questions about Java's assigment.

  • Strings

I've got a class:

public class Test {
 private String s;

 public synchronized void setS(String str){
  s = s + " - " + str;
 }

 public String getS(){
  return s;
 }
}

I'm using "synchronized" in my setter, and avoiding it in my getter, because in my app, there are a tons of data gettings, and very few settings. Settings must be synchronized to avoid inconsistency. My question is: is getting and setting a variable atomic? I mean, in a multithreaded environment, Thread1 is about to set variable s, while Thread2 is about to get "s". Is there any way the getter method could get something different than the s's old value or the s's new value (suppose we've got only two threads)? In my app it is not a problem to get the new value, and it is not a problem to get the old one. But could I get something else?

  • What about HashMap's getting and putting?

considering this:

    public class Test {
        private Map<Integer, String> map = Collections.synchronizedMap(new HashMap<Integer, String>());

        public synchronized void setMapElement(Integer key, String value){
         map.put(key, value);
        }

        public String getValue(Integer key){
         return map.get(key);
        }
}

Is putting and getting atomic? How does HashMap handle putting an element into it? Does it first remove the old value and put the now one? Could I get other than the old value or the new value?

Thanks in advance!

+6  A: 

In the first case, String happens to be safe for unsafe publication (in the "new" Java Memory Model (JMM)), so this is okay.

Not being volatile there is theoretically some issue with not having an up to date value, but then the meaning of up to date is not clear. You could replace the lock with a compare-sand-swap (CAS) loop, but that probably wouldn't give you much performance gain whether the lock was likely to be contended or not.

In the case of HashMap, an unsynchronized map is not safe to read if there is another thread writing to it, even a single writer thread. In fact, this has been found to lead to infinite loops on production systems running popular software. The code in the question actually uses two locks for the map, which is over the top (although you'd need an explicit hold of the same lock if using an iterator). Not being final does stop the containing class from being safe for unsafe publication. If map was volatile and you created a new map for every put, then that could be made to be safe without synchronisation on the get.

Tom Hawtin - tackline
I'm using Collections.synchronizedMap. Is it also dangerous? I'm making structural modification only in my setters. So I synchronize these methods, but my getters just get the data. Should I synchronize them, too? I'm afraid that while I'm setting a value in the HashMap another thread would like to get the value, and it gets something else than the old value or the new one.
Bob
`Collections.synchronizedMap` adds synchronisation, as its name suggests, so you don't have to. You might get a better idea what it does if you inline its code into yours and use the `HashMap` directly.
Tom Hawtin - tackline
+2  A: 

Rather than wrapping your HashMap in something to make it synchronized, consider using a java.util.concurrency.ConcurrentHashMap instead.

This is an updated version of HashMap that guarantees "Retrievals reflect the results of the most recently completed update operations holding upon their onset.".

R. Bemrose
+2  A: 

Earlier answers are correct in pointing that with new (1.5+) JVMs, String version is safe, with respect to data corruption. And you seem to be aware of downside non-synchronized access; that of changes not necessarily being visible via getters.

However: more useful question is this: is there reason to synchronization here? If this is just for being interested in knowing this, that's all good. But for real code, the general rule of reads and writes is that if there can be both, both should be synchronized. So although you can in this case leave out synchronization (potentially meaning that threads do not see changes other threads make), there seems to be little benefit in doing that.

StaxMan
My app is a "blog engine" like app. I store the first X (let's say 10) of the newly created entrys in a variable. It's is not a problem the users not get the up to date version of the entrys, because they could refresh the page if they wanted to. I make about 20 entrys in one day (setter), but I've got hundreds of thousands of visitors a day. It is not efficient (I think) to synchronize the getter methods (I've got quite a lot getter methods like the above). (No, I haven't got hundreds of thousands of visitors a day yet, but I want my app to be able to handle that)
Bob
Note that it's not just a matter of the user refreshing the page. In the Java memory model, if there's no synchronization between the setter thread and the getter thread, it's possible that the getter thread will *never* get the updated value.This sounds like a premature optimization type thing to me. Get it correct first. If you later see that synchronization is actually limiting your throughput, you can then look at better ways to handle it.
Mike McNertney
"it's possible that the getter thread will never get the updated value." How come? I just don't get it. "35 sec 300 millisec" I get the value, it is the old one. "35 sec 500 millisec" I set the value, than "50 sec xxx millisec" I try to get the new one? How come it is possible I won't get the new value ever?
Bob
Is it really possible to *never* get the updated value? This might have been true under the "old" Java memory model where a thread didn't have to flush all its writes before terminating, but the writer thread must eventually block (if it's part of a pool waiting for a new task), perform a blocking I/O operation (which implies a `wait()`, which implies a `lock`) or exit. The only way to avoid the memory barrier is to go into an infinite (CPU-bound) loop. Which means your program was badly broken already.
finnw
I agree 100% with Mike. Finnw is right that in practice there probably is a memory barrier somewhere that triggers memory flush... but keep in mind that overhead of synchronization is measured in something like nanoseconds -- it does not matter if it's tens of millions of visitors, what matters is what is slowest part. And it is all but certain that couple of low-level memory barrier - based locks is not it.So unless you can measure that change makes ANY measurable effect with a stress test, there is no point in optimizing away a non-problem, and possibly creating an actual problem.
StaxMan
A: 

In multi-threaded environment you will need to synchronize the getter in order to ensure the client sees the most up to date value of s.

Steve Kuo
making the variable "volatile" would be just as effective and not require the synchronization overhead
fuzzy lollipop
`volatile` has its own synchronization overhead built-in.
Bombe
A: 

Perhaps a Read Write Lock could solve your problem?

Take a look at its documentation:

A read-write lock allows for a greater level of concurrency in accessing shared data than that permitted by a mutual exclusion lock. It exploits the fact that while only a single thread at a time (a writer thread) can modify the shared data, in many cases any number of threads can concurrently read the data (hence reader threads). In theory, the increase in concurrency permitted by the use of a read-write lock will lead to performance improvements over the use of a mutual exclusion lock. ....

Varun Garde