views:

592

answers:

2

Is there any better way to cache up some very large objects, that can only be created once, and therefore need to be cached ? Currently, I have the following:

public enum LargeObjectCache {  
    INSTANCE; 

    private Map<String, LargeObject> map = new HashMap<...>();

    public LargeObject get(String s) {  
        if (!map.containsKey(s)) {
            map.put(s, new LargeObject(s));
        }
        return map.get(s);
    }
}

There are several classes that can use the LargeObjects, which is why I decided to use a singleton for the cache, instead of passing LargeObjects to every class that uses it.

Also, the map doesn't contain many keys (one or two, but the key can vary in different runs of the program) so, is there another, more efficient map to use in this case ?

+3  A: 

You may need thread-safety to ensure you don't have two instance of the same name. It does matter much for small maps but you can avoid one call which can make it faster.

public LargeObject get(String s) {  
    synchronized(map) {
        LargeObject ret = map.get(s);
        if (ret == null) 
            map.put(s, ret = new LargeObject(s));
        return ret;
    }
}
Peter Lawrey
You're right, thanks. On the javadoc however, it mentions this about thread-safeness: "This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method.". So, what are the differences between using synchronized(map), and using Collections.synchronizedMap(map), or even using a ConcurrentHashMap<String,LargeObject> ? They all seem to have the same goal, but I suppose there must be differences.
JG
If you use `synchronizedMap` (or even a `CuncurrentMap`) then you might end up creating the object twice. Instead of locking the whole map for the duration of your `LargeObject` creation, you could lock for a particular key (say using a `Future`). However, the code gets more complicated.
Tom Hawtin - tackline
But if there needs to be atomically created instances on demand, concurrent compound operations of putIfAbsend() is very costly on large objects.
kd304
If you use a synchronizedMap() you can have a race condition where two or more threads lock the map.get() and multiple threads perform a map.put() Each operation is thread safe, but not as a pair of operations.
Peter Lawrey
+2  A: 

As it has been pointed out, you need to address thread-safety. Simply using Collections.synchronizedMap() doesn't make it completely correct, as the code entails compound operations. Synchronizing the entire block is one solution. However, using ConcurrentHashMap will result in a much more concurrent and scalable behavior if it is critical.

public enum LargeObjectCache {  
    INSTANCE; 

    private final ConcurrentMap<String, LargeObject> map = new ConcurrentHashMap<...>();

    public LargeObject get(String s) {
        LargeObject value = map.get(s);
        if (value == null) {
            value = new LargeObject(s);
            LargeObject old = value.putIfAbsent(s, value);
            if (old != null) {
                value = old;
            }
        }
        return value;
    }
}

You'll need to use it exactly in this form to have the correct and the most efficient behavior.

If you must ensure only one thread gets to even instantiate the value for a given key, then it becomes necessary to turn to something like the computing map in Google Collections or the memoizer example in Brian Goetz's book "Java Concurrency in Practice".

sjlee