I have a class with a static member like this:

class C
{
  static Map m = new HashMap();
  static {
    // ... initialize the map with some values ...
  }
}

AFAIK, this would consume memory practically to the end of the program. I was wondering, if I could solve it with soft references, like this:

class C
{
  static volatile SoftReference<Map> m=null;
  static Map getM() {
    Map ret;
    if(m == null || (ret = m.get()) == null) {
      ret=new HashMap();
      ... initialize the map ...
      m=new SoftReference(ret);
    }
    return ret;
  }
}

The question is

  1. is this approach (and the implementation) right?
  2. if it is, does it pay off in real situations?
A: 

How big is this map going to be? Is it worth the effort to handle it this way? Have you measured its memory consumption? (For what it's worth, I believe the above is generally OK, but my first question with any optimisation is "what does it really save me?")

You're returning the reference to the map, so you need to ensure that your clients don't hold onto this reference (and thereby prevent garbage collection). Perhaps your class can hold the reference and provide a getKey() method to access the content of the map on behalf of clients? That way you'll maintain control of the reference to the map in one place.

I would synchronise the above, in case the map gets garbage collected and two threads hit getM() at the same time. Otherwise you're going to create two maps simultaneously!
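A minimal sketch of that idea, assuming String keys and values: the class keeps the soft reference private, rebuilds the map under a lock, and answers lookups itself so callers never hold the map. The names SoftCachedLookup, load() and getValue() are hypothetical, and load() only stands in for the question's elided initialization.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class SoftCachedLookup {
    // Soft reference to the lazily built map; guarded by the class lock.
    private static SoftReference<Map<String, String>> ref = null;

    // Hypothetical loader standing in for the real (elided) initialization.
    private static Map<String, String> load() {
        Map<String, String> map = new HashMap<>();
        map.put("example-key", "example-value");
        return map;
    }

    // Looks a key up on behalf of the caller, so the map itself never escapes.
    static synchronized String getValue(String key) {
        Map<String, String> map = (ref == null) ? null : ref.get();
        if (map == null) {
            // Either first use or the collector cleared the reference: rebuild.
            map = load();
            ref = new SoftReference<>(map);
        }
        return map.get(key);
    }
}
```

Because only one thread at a time can be inside getValue(), at most one copy of the map is built per rebuild. The price is that every lookup takes the lock, which may matter under heavy contention.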

Brian Agnew
Creating two maps simultaneously isn't a real problem; one will get thrown away. However, the threading issue may lead to infinite loops (generally bad for performance). Also, making it synchronised could give really bad performance if highly contended on multithreaded machines.
Tom Hawtin - tackline
Isn't the problem with two maps simply the fact that two (or more) will exist simultaneously, and they've been identified as being quite sizable?
Brian Agnew
+2  A: 

This is okay if your access to getM is single-threaded and it only acts as a cache. A better alternative is to have a fixed-size cache, as this provides a consistent benefit.

Peter Lawrey
+4  A: 

First, the code above is not threadsafe.

Second, while it works in theory, I doubt there is a realistic scenario where it pays off. Think about it: In order for this to be useful, the map's contents would have to be:

  1. Big enough so that their memory usage is relevant
  2. Able to be recreated on the fly without unacceptable delays
  3. Used only at times when other parts of the program require less memory - otherwise the maximum memory required would be the same, only the average would be less, and you probably wouldn't even see this outside the JVM, since it gives heap memory back to the OS very reluctantly.

Here, 1. and 2. are sort of contradictory - large objects also take longer to create.

Michael Borgwardt
+1  A: 

getM() should be synchronized, to avoid m being initialized at the same time by different threads.

Bob
The question was not about thread safety...
jpalecek
The question does make a very dangerous threading error.
Tom Hawtin - tackline
@jpalecek: The question asks, "is this approach (and the implementation) right?" That's completely open-ended, and thread safety can be considered.
erickson
A: 

Maybe you are looking for WeakHashMap? Then entries in the map can be garbage collected separately.

Though in my experience it didn't help much, so I instead built an LRU cache using LinkedHashMap. The advantage is that I can control the size so that it isn't too big and still useful.
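A minimal sketch of such an LRU cache, assuming a fixed maximum number of entries. LruCache and maxEntries are hypothetical names; the access-order constructor and removeEldestEntry() are standard LinkedHashMap features.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A size-bounded LRU cache built on LinkedHashMap's access-order mode.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // hypothetical capacity parameter

    LruCache(int maxEntries) {
        // true = order entries by access, least recently used first
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

For example, with new LruCache<String, Integer>(2), putting "a" and "b", reading "a", and then putting "c" evicts "b". Like LinkedHashMap itself, this is not thread-safe; wrapping it with Collections.synchronizedMap is one option if several threads share it.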

starblue
The trouble with `WeakHashMap` is that it is the **keys** which are weakly referenced, not the values. This is useful for things like saving data against an `HttpSession` but less useful for a *cache*. Also note that the JVM will collect a weakly-referenced object immediately when there are no strong references, whereas it will only collect a `SoftReference` when it is low on memory.
oxbow_lakes
Aha, the aggressive garbage collection explains why it didn't work. Unfortunately there doesn't seem to be a standard SoftHashMap.
starblue
`WeakReference`s are really dangerous for caches. NetBeans, in my experience, after some use just sits there with CPU at 100% doing file access without touching the hard drive. It uses a `WeakHashMap` cache, which at some point HotSpot optimises to evict straight after caching. Probably worth making sure you have some kind of sensible performance without the cache.
Tom Hawtin - tackline
A: 

I was wondering, if I could solve it with soft references

What is it that you are trying to solve? Are you running into memory problems, or are you prematurely optimizing?

In any case,

  1. The implementation should be altered a bit if you were to use it. As has been noted, it isn't thread-safe. Multiple threads could access the method at the same time, allowing multiple copies of your collection to be created. If these collections were then strongly referenced for the remainder of your program, you would end up with more memory consumption, not less.

  2. A reason to use SoftReferences is to avoid running out of memory, as there is no contract other than that they will be cleared before the VM throws an OutOfMemoryError. Therefore there is no guaranteed benefit of this approach, other than not creating the cache until it is first used.

akf
A: 

The first thing I notice about the code is that it mixes generic with raw types. That is just going to lead to a mess. javac in JDK7 has -Xlint:rawtypes to quickly spot that kind of mistake before trouble starts.

The code is not thread-safe but uses statics, so it is published across all threads. You probably don't want it to be synchronized, because that can cause problems if contended on multithreaded machines.

A problem with using a SoftReference for the entire cache is that you will cause spikes when the reference is cleared. In some circumstances it might work out better to have ThreadLocal<SoftReference<Map<K,V>>>, which would spread the spikes and help thread safety at the expense of not sharing between threads.

However, creating a smarter cache is more difficult. Often you end up with values referencing keys. There are ways around this, but it is a mess. I don't think ephemerons (essentially a pair of linked References) are going to make JDK7. You might find the Google Collections worth looking at (although I haven't).
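A rough sketch of the per-thread variant, assuming String keys and values. PerThreadSoftCache and load() are hypothetical names, and load() only stands in for the real initialization.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class PerThreadSoftCache {
    // Each thread sees its own softly referenced map, so no locking is needed.
    private static final ThreadLocal<SoftReference<Map<String, String>>> CACHE =
            new ThreadLocal<>();

    // Hypothetical loader standing in for the real initialization.
    private static Map<String, String> load() {
        Map<String, String> map = new HashMap<>();
        map.put("example-key", "example-value");
        return map;
    }

    static Map<String, String> get() {
        SoftReference<Map<String, String>> ref = CACHE.get();
        Map<String, String> map = (ref == null) ? null : ref.get();
        if (map == null) {
            // First use on this thread, or the collector cleared the reference.
            map = load();
            CACHE.set(new SoftReference<>(map));
        }
        return map;
    }
}
```

The cost is one copy of the map per thread that uses it, so this only makes sense when the map is cheap enough to duplicate or when few threads touch it.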

java.util.LinkedHashMap gives an easy way to limit the number of cached entries, but is not much use if you can't be sure how big the entries are, and can cause problems if it stops collection of large object systems such as ClassLoaders. Some people have said you shouldn't leave cache eviction up to the whims of the garbage collector, but then some people say you shouldn't use GC.

Tom Hawtin - tackline
A: 

You may have a look at http://blog.mycila.com/2009/08/softhashmap-and-softcache.html

Mathieu Carbou