tags:

views:

116

answers:

5

I have a class Foo which overrides equals() and hashCode() properly.

I would like to also would like to use a HashSet<Foo> to keep track of "canonical values" e.g. I have a class that I would like to write like this, so that if I have two separate objects that are equivalent I can coalesce them into references to the same object:

class Canonicalizer<T>
{
    final private Set<T> values = new HashSet<T>(); 

    public T findCanonicalValue(T value)
    {
        T canonical = this.values.get(value);
        if (canonical == null)
        {
           // not in the set, so put it there for the future
           this.values.add(value);
           return value;
        }
        else
        {
           return canonical;
        }
    }
}

except that Set doesn't have a "get" method that would return the actual value stored in the set, just the "contains" method that returns true or false. (I guess that it assumes that if you have an object that is equal to a separate object in the set, you don't need to retrieve the one in the set)

Is there a convenient way to do this? The only other thing I can think of is to use a map and a list:

class Canonicalizer<T>
{
    // warning: neglects concurrency issues

    final private Map<T, Integer> valueIndex = new HashMap<T, Integer>(); 
    final private List<T> values = new ArrayList<T>();

    public T findCanonicalValue(T value)
    {
        Integer i = this.valueIndex.get(value);
        if (i == null)
        {
           // not in the set, so put it there for the future
           i = this.values.size();
           this.values.add(value);
           this.valueIndex.put(value, i);
           return value;
        }
        else
        {
           // in the set
           return this.values.get(i);
        }
    }
}
+3  A: 

I would use just a HashMap like this:

final private Map<T, T> values = new HashMap<T, T>(); 
public T findCanonicalValue(T value)
{
    T canon = this.values.get(value);
    if (canon == null)
    {
       // not in the set, so put it there for the future
       this.values.put(value, value);
       return value;
    }
    else
    {
       // in the set
       return canon;
    }
}

its still a little awkard but it should be slightly more efficient than the map and the list (one fewer data structure).

luke
+1  A: 

A map seems the right idea, but why not map from some Object T to the corresponding canonical object?

class Canonicalizer<T>
{
    final private Map<T, T> values = new HashMap<T, T>(); 

    public T findCanonicalValue(T value)
    {
        T canonical = this.values.get(value);
        if (canonical == null)
        {
            // not in the map, so put it there for the future
            this.values.put(value, value);
            return value;
        }
        else
        {
            return canonical;
        }
    }
}
Jörn Horstmann
+2  A: 

With ConcurrentMap it is a little simpler. (And thread safe)

private final ConcurrentMap<T, T> values = new ConcurrentHashMap<T, T>();  
public T findCanonicalValue(T value) { 
    values.putIfAbsent(value, value);
    return values.get(value);
}
Peter Lawrey
+1  A: 

In many cases, you want this sort of container for an object cache that needs to be accessed by different threads; if so, make sure not to neglect concurrency issues, and be aware that the thread-safe map implementations generally don't support null keys. If for some reason you can't use Guava, at least take a look at the source code:

http://www.google.com/codesearch/p?hl=en#UKMs0lhE9bg/trunk/src/com/google/common/collect/Interners.java

public final class Interners {
  private Interners() {}

  /**
   * Returns a new thread-safe interner which retains a strong reference to
   * each instance it has interned, thus preventing these instances from being
   * garbage-collected. If this retention is acceptable, this implementation may
   * perform better than {@link #newWeakInterner}.
   */
  public static <E> Interner<E> newStrongInterner() {
    final ConcurrentMap<E, E> map = new MapMaker().makeMap();
    return new Interner<E>() {
      public E intern(E sample) {
        E canonical = map.putIfAbsent(checkNotNull(sample), sample);
        return (canonical == null) ? sample : canonical;
      }
    };
  }
Alex M.