views:

2728

answers:

8

There is a case where a map will be constructed, and once it is initialized, it will never be modified again. It will however, be accessed (via get(key) only) from multiple threads. Is it safe to use a java.util.HashMap in this way?

(Currently, I'm happily using a java.util.concurrent.ConcurrentHashMap, and have no measured need to improve performance, but am simply curious if a simple HashMap would suffice. Hence, this question is not "Which one should I use?" nor is it a performance question. Rather, the question is "Would it be safe?")

A: 

http://www.docjar.com/html/api/java/util/HashMap.java.html

here is the source for HashMap. As you can tell, there is absolutely no locking / mutex code there.

This means that while its okay to read from a HashMap in a multithreaded situation, I'd definitely use a ConcurrentHashMap if there were multiple writes.

Whats interesting is that both the .NET HashTable and Dictionary<K,V> have built in synchronization code.

FlySwat
I think there are classes where simply reading concurrently can get you into trouble, because of internal use of temporary instance variables, for example. So one probably needs to carefully examine source, more than a quick scan for locking / mutex code.
Dave L.
+2  A: 

After a bit more looking, I found this in the java doc (emphasis mine):

Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)

This seems to imply that it will be safe, assuming the converse of the statement there is true.

Dave L.
While this is excellent advice, as other answers state, there is a more nuanced answer in the case of an immutable, safely published map instance. But you should do that only if You Know What You're Doing.
Alex Miller
Hopefully with questions like these more of us can Know What We're Doing.
Dave L.
A: 

Be warned that even in single-threaded code, replacing a ConcurrentHashMap with a HashMap may not be safe. ConcurrentHashMap forbids null as a key or value. HashMap does not forbid them (don't ask).

So in the unlikely situation that your existing code might add a null to the collection during setup (presumably in a failure case of some kind), replacing the collection as described will change the functional behaviour.

That said, provided you do nothing else concurrent reads from a HashMap are safe.

[Edit: by "concurrent reads", I mean that there are not also concurrent modifications.

Other answers explain how to ensure this. One way is to make the map immutable, but it's not necessary. For example, the JSR133 memory model explicitly defines starting a thread to be a synchronised action, meaning that changes made in thread A before it starts thread B are visible in thread B.

My intent is not to contradict those more detailed answers about the Java Memory Model. This answer is intended to point out that even aside from concurrency issues, there is at least one API difference between ConcurrentHashMap and HashMap, which could scupper even a single-threaded program which replaced one with the other.]

Steve Jessop
Thanks for the warning, but there's no attempts to use null keys or values.
Dave L.
Thought there wouldn't be. nulls in collections are a crazy corner of Java.
Steve Jessop
I don't agree with this answer. "Concurrent reads from a HashMap are safe" by itself is incorrect. It doesn't state whether the reads are occurring against a map that is mutable or immutable. To be correct it should read "Concurrent reads from an immutable HashMap are safe"
Taylor Gautier
Not according to the articles you yourself linked to: the requirement is that the map must not be changed (and previous changes must be visible to all the reader threads), not that it be immutable (which is a technical term in Java and is a sufficient but not necessary condition for safety).
Steve Jessop
Anyway, I've expanded on my point, which is that although concurrent reads are safe in the absence of concurrent writes, the two classes don't have the same single-threaded semantics.
Steve Jessop
+19  A: 

Jeremy Manson, the god when it comes to the Java Memory Model, has a three part blog on this topic - because in essence you are asking the question "Is it safe to access an immutable HashMap" - the answer to that is yes. But you must answer the predicate to that question which is - "Is my HashMap immutable". The answer might surprise you - Java has a relatively complicated set of rules to determine immutability.

For more info on the topic, read Jeremy's blog posts:

Part 1 on Immutability in Java: http://jeremymanson.blogspot.com/2008/04/immutability-in-java.html

Part 2 on Immutability in Java: http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-2.html

Part 3 on Immutability in Java: http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-3.html

Taylor Gautier
It's a good point, but I'm relying on static initialization, during which no references escape, so it should be safe.
Dave L.
+8  A: 

The reads are safe from a synchronization standpoint but not a memory standpoint. This is something that is widely misunderstood among Java developers including here on Stackoverflow. (Observe the rating of this answer for proof.)

If you have other threads running, they may not see an updated copy of the HashMap if there is no memory write out of the current thread. Memory writes occur through the use of the synchronized or volatile keywords, or through uses of some java concurrency constructs.

See Brian Goetz's article on the new Java Memory Model for details.

Heath Borders
Sorry for the dual submission Heath, I only noticed yours after I submitted mine. :)
Alexander
I'm just glad there are other people here that actually understand the memory-effects.
Heath Borders
Indeed, though no thread will see the object before it is initialized properly, so I don't think that is a concern in this case.
Dave L.
That depends entirely on how the object is initialised.
Bill Michell
+2  A: 

There is an important twist though. It's safe to access the map, but in general it's not guaranteed that all threads will see exactly the same state (and thus values) of the HashMap. This might happen on multiprocessor systems where the modifications to the HashMap done by one thread (e.g., the one that populated it) can sit in that CPU's cache and won't be seen by threads running on other CPUs, until a memory fence operation is performed ensuring cache coherence. The Java Language Specification is explicit on this one: the solution is to acquire a lock (synchronized (...)) which emits a memory fence operation. So, if you are sure that after populating the HashMap each of the threads acquires ANY lock, then it's OK from that point on to access the HashMap from any thread until the HashMap is modified again.

Alexander
I'm not sure that thread accessing it will acquire any lock, but I am sure they won't get a reference to the object until after it has been initialized, so I don't think they can have a stale copy.
Dave L.
@Alex: The reference to the HashMap can be volatile to create the same memory visibility guarantees. @Dave: It *is* possible to see references to new objs before the work of its ctor becomes visible to your thread.
Christian Vest Hansen
@Christian In the general case, certainly. I was saying that in this code, it isn't.
Dave L.
+1  A: 

One note is that under some circumstances, a get() from an unsynchronized HashMap can cause an infinite loop. This can occur if a concurrent put() causes a rehash of the Map.

http://lightbody.net/blog/2005/07/hashmapget_can_cause_an_infini.html

Alex Miller
Actually I have seen this hang the JVM without consuming CPU (which is perhaps worse)
Peter Lawrey
+1  A: 

So the scenario you're describing is that you need to put a bunch of data into a Map, then when you're done populating it you treat it as immutable. One approach that is "safe" (meaning you're enforcing that it really is treated as immutable) is to replace the reference with Collections.unmodifiableMap(originalMap) when you're ready to make it immutable.

For an example of how badly maps can fail if used concurrently, and the suggested workaround I mentioned, check out this bug parade entry: bug_id=6423457

Will
This is "safe" in that it enforces the immutability, but it doesn't address the thread safety issue. If the map is safe to access with the UnmodifiableMap wrapper then it is safe without it, and vice versa.
Dave L.