views:

486

answers:

8
+5  Q: 

Double in HashMap

I was thinking of using a Double as the key to a HashMap, but I know floating point comparisons are unsafe, and that got me thinking: is the equals method on the Double class also unsafe? If it is, then the hashCode method is probably also incorrect, which would mean that using Double as the key to a HashMap leads to unpredictable behavior.

Can anyone confirm any of my speculation here?

A: 

The hash of the double is used, not the double itself.

Edit: Thanks, Jon, I actually didn't know that.

I'm not sure about this (you should just look at the source code of the Double object) but I would think any issues with floating point comparisons would be taken care of for you.

Alex Beardsley
No, the hash is used *initially* to find the bucket, then equals would be used.
Jon Skeet
The hash is used to find the 'bucket' that holds the entries with a matching hash. Once the bucket is found, its key-value pairs are iterated over looking for a key that is 'equal', and the corresponding value is returned.
Nick Holt
hashCode and equals on Double use the IEEE 754 floating point bit representation of the number
pjp
Two unequal objects can have the same hashcode and thus, maps tend to double-check keys with equals, so it's quite important.
Steve Reed
+2  A: 

It depends on how you would be using it.

If you're happy with only being able to find the value based on the exact same bit pattern (or potentially an equivalent one, such as the various NaNs) then it might be okay.

In particular, all NaNs would end up being considered equal, but +0 and -0 would be considered different. From the docs for Double.equals:

Note that in most cases, for two instances of class Double, d1 and d2, the value of d1.equals(d2) is true if and only if

    d1.doubleValue() == d2.doubleValue()

also has the value true. However, there are two exceptions:

  • If d1 and d2 both represent Double.NaN, then the equals method returns true, even though Double.NaN==Double.NaN has the value false.
  • If d1 represents +0.0 while d2 represents -0.0, or vice versa, the equal test has the value false, even though +0.0==-0.0 has the value true.

This definition allows hash tables to operate properly.
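To make the two exceptions concrete, here is a small sketch that contrasts the primitive == operator with Double.equals and then does the same lookups through a HashMap. The class name DoubleEqualsDemo and the lookup helper are made up for illustration; only the java.lang.Double and java.util.HashMap behavior comes from the docs quoted above.

```java
import java.util.HashMap;
import java.util.Map;

class DoubleEqualsDemo {
    // Hypothetical helper: stores one entry under `stored`, then looks up `probe`.
    static String lookup(double stored, double probe) {
        Map<Double, String> map = new HashMap<>();
        map.put(stored, "found");
        return map.get(probe); // null when the boxed keys are not equals()
    }

    public static void main(String[] args) {
        // Primitive == follows IEEE 754: NaN never equals itself, +0.0 equals -0.0.
        System.out.println(Double.NaN == Double.NaN);                      // false
        System.out.println(0.0 == -0.0);                                   // true

        // Double.equals compares bit patterns, so both results flip.
        System.out.println(Double.valueOf(Double.NaN).equals(Double.NaN)); // true
        System.out.println(Double.valueOf(0.0).equals(-0.0));              // false

        // The map follows equals(), not ==.
        System.out.println(lookup(Double.NaN, Double.NaN));                // found
        System.out.println(lookup(0.0, -0.0));                             // null
    }
}
```

So a NaN key can be retrieved again, while an entry stored under +0.0 is invisible to a lookup with -0.0.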

Most likely you're interested in "numbers very close to the key" though, which makes it a lot less viable. In particular if you're going to do one set of calculations to get the key once, then a different set of calculations to get the key the second time, you'll have problems.

Jon Skeet
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Double.html confirms that equals() compares doubleToLongBits(double).
Andrew Keeton
And in particular it talks about the NaN and +/-0 issues :)
Jon Skeet
+5  A: 

I think you are right. Although the hash of a double is an int, the double's value could mess up the hash if you feed it in naively. That is why, as Josh Bloch mentions in Effective Java, when you use a double as an input to a hash function, you should use doubleToLongBits(). Similarly, use floatToIntBits for floats.

In particular, to use a double as your hash, following Josh Bloch's recipe, you would do:

public int hashCode() {
  int result = 17;
  long temp = Double.doubleToLongBits(the_double_field);
  result = 37 * result + ((int) (temp ^ (temp >>> 32)));
  return result;
}

This is from Item 8 of Effective Java, "Always override hashCode when you override equals". It can be found in this pdf of the chapter from the book.
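For context, here is a sketch of how that recipe might sit inside a complete value class, with an equals method that agrees with it. The class Measurement and its field value are hypothetical names standing in for the_double_field; this is not code from the book.

```java
final class Measurement {
    private final double value; // hypothetical field, plays the role of the_double_field

    Measurement(double value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Measurement)) return false;
        Measurement other = (Measurement) o;
        // Compare bit patterns so that equals agrees with hashCode:
        // NaN equals NaN, and +0.0 differs from -0.0.
        return Double.doubleToLongBits(value) == Double.doubleToLongBits(other.value);
    }

    @Override
    public int hashCode() {
        int result = 17;
        long temp = Double.doubleToLongBits(value);
        result = 37 * result + (int) (temp ^ (temp >>> 32));
        return result;
    }
}
```

Comparing with == inside equals instead would break the equals/hashCode contract for +0.0 and -0.0, which are == but hash differently.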

Hope this helps.

Tom
Double's hash code already uses that exact method, except for the 17 and 37 parts. "long bits = doubleToLongBits(value); return (int)(bits ^ (bits >>> 32));"
Michael Myers
A: 

It depends on how you store and access your map. Yes, similar values could end up being slightly different and therefore not hash to the same value.

private static final double key1 = 1.1+1.3-1.6;
private static final double key2 = 123321;
...
map.get(key1);

would be all good, however

map.put(1.1+2.3, value);
...
map.get(5.0 - 1.6);

would be dangerous

David Waters
+3  A: 

Short answer: Don't do it

Long answer: Here is how the key is going to be computed:

The actual key will be a java.lang.Double object, since keys must be objects. Here is its hashCode() method:

public int hashCode() {
  long bits = doubleToLongBits(value);
  return (int)(bits ^ (bits >>> 32));
}

The doubleToLongBits() method basically takes the 8 bytes of the double and represents them as a long. This means that a tiny difference in how a double was computed gives a different bit pattern, and you will have key misses.
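A quick sketch of what that means in practice: two values that print almost identically have different bit patterns and therefore different hash codes. The class name DoubleHashDemo and the helper hashOf are made up; hashOf simply repeats the formula above so it can be compared with Double.hashCode().

```java
class DoubleHashDemo {
    // Same formula as the hashCode() shown above, as a static helper.
    static int hashOf(double value) {
        long bits = Double.doubleToLongBits(value);
        return (int) (bits ^ (bits >>> 32));
    }

    public static void main(String[] args) {
        double a = 0.1 + 0.2; // carries a rounding error
        double b = 0.3;

        System.out.println(a == b); // false

        // The two bit patterns differ, so the hashes differ as well.
        System.out.println(Long.toHexString(Double.doubleToLongBits(a)));
        System.out.println(Long.toHexString(Double.doubleToLongBits(b)));
        System.out.println(hashOf(a) == hashOf(b)); // false

        // The helper agrees with the library implementation.
        System.out.println(hashOf(a) == Double.valueOf(a).hashCode()); // true
    }
}
```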

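If you can settle for a fixed number of decimal places, multiply by 10^(number of decimal places) and round to an integer type (for example, for 2 decimal places multiply by 100). Round rather than truncate, since a value such as 43329.999999999995 would otherwise become 43329.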

It will be much safer.
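A minimal sketch of that idea, assuming two decimal places are enough. The helper toHundredths and the class name are hypothetical, and Math.round is used for the conversion so that the result is a long.

```java
import java.util.HashMap;
import java.util.Map;

class ScaledKeyDemo {
    // Hypothetical helper: converts a value to a whole number of hundredths.
    // Math.round rounds to the nearest long instead of truncating.
    static long toHundredths(double value) {
        return Math.round(value * 100.0);
    }

    public static void main(String[] args) {
        Map<Long, String> map = new HashMap<>();

        // Key computed one way...
        map.put(toHundredths(371.4 + 61.9), "sum");

        // ...and looked up with a value obtained another way.
        System.out.println(map.get(toHundredths(433.3))); // sum
    }
}
```

This is not bulletproof: two inputs that sit on either side of a rounding boundary (for example just below and just above x.xx5) still end up in different slots.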

David Rabinowitz
+1  A: 

The problem is not the hash code but the precision of the doubles. This will cause some strange results. Example:

    double x = 371.4;
    double y = 61.9;
    double key = x + y;    // expected 433.3

    Map<Double, String> map = new HashMap<Double, String>();
    map.put(key, "Sum of " + x + " and " + y);

    System.out.println(map.get(433.3));  // prints null

The calculated value (key) is "433.29999999999995", which is not equal to 433.3, and so you don't find the entry in the Map (the hash code is probably different as well, but that is not the main problem).

If you use

map.get(key)

it should find the entry.

Carlos Heuberger
Since the hashCode might be different for two very similar numbers you might not even be looking in the right bucket.
anio
I wrote that, didn't I? but that is not the problem since equals will return false anyway (also if the numbers are *very* *very* similar)
Carlos Heuberger
+1  A: 

Maybe BigDecimal gets you where you want to go?
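A rough sketch of what that could look like, assuming the values start out as decimal strings. The class and helper names are made up. One thing to watch: BigDecimal.equals also compares the scale, so 433.3 and 433.30 are different keys unless you normalize them, which is done here with stripTrailingZeros().

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

class BigDecimalKeyDemo {
    // Hypothetical helper: parses a decimal string and normalizes its scale,
    // so that "433.3" and "433.30" produce equal keys.
    static BigDecimal key(String decimal) {
        return new BigDecimal(decimal).stripTrailingZeros();
    }

    // Hypothetical helper: exact decimal sum, normalized the same way.
    static BigDecimal sumKey(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b)).stripTrailingZeros();
    }

    public static void main(String[] args) {
        Map<BigDecimal, String> map = new HashMap<>();
        map.put(sumKey("371.4", "61.9"), "sum");

        System.out.println(map.get(key("433.3")));  // sum
        System.out.println(map.get(key("433.30"))); // sum
    }
}
```

Building the BigDecimal from a String (rather than from a double) matters here, because new BigDecimal(double) would carry over the binary rounding error.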

Brian T. Grant
A: 

Short answer: It probably won't work.

Honest answer: It all depends.

Longer answer: The hash code isn't the issue, it's the nature of equality comparisons on floating point. As Nalandial and the commenters on his post point out, ultimately any match against a hash table still ends up using equals to pick the right value.

So the question is, are your doubles generated in such a way that you know that equals really means equals? If you read or compute a value, store it in the hash table, and then later read or compute the value using exactly the same computation, then Double.equals will work. But otherwise it's unreliable: 1.2 + 2.3 does not necessarily equal 3.5; it might equal 3.4999995 or whatever. (Not a real example, I just made that up, but that's the sort of thing that happens.) You can compare floats and doubles reasonably reliably for less than or greater than, but not for equality.
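Since ordering comparisons hold up better than equality, one possible workaround (a different technique from HashMap, not something suggested elsewhere in this thread) is a sorted map that returns the nearest key within a tolerance. The class NearestKeyLookup, the method getNear, and the tolerance value are all assumptions for the sake of the sketch; floorEntry and ceilingEntry are standard TreeMap methods.

```java
import java.util.Map;
import java.util.TreeMap;

class NearestKeyLookup {
    // Returns the value whose key is closest to `probe`,
    // or null if no key lies within `tolerance` of it.
    static <V> V getNear(TreeMap<Double, V> map, double probe, double tolerance) {
        Map.Entry<Double, V> below = map.floorEntry(probe);   // greatest key <= probe
        Map.Entry<Double, V> above = map.ceilingEntry(probe); // smallest key >= probe

        Map.Entry<Double, V> best = null;
        if (below != null && probe - below.getKey() <= tolerance) {
            best = below;
        }
        if (above != null && above.getKey() - probe <= tolerance
                && (best == null || above.getKey() - probe < probe - best.getKey())) {
            best = above;
        }
        return best == null ? null : best.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Double, String> map = new TreeMap<>();
        map.put(371.4 + 61.9, "sum"); // stored as 433.29999999999995

        System.out.println(getNear(map, 433.3, 1e-9)); // sum
        System.out.println(getNear(map, 434.0, 1e-9)); // null
    }
}
```

Picking the tolerance is the hard part: it has to be larger than the rounding error you expect and smaller than the gap between keys that are genuinely different.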

Jay