How do we decide on the best implementation of the hashCode() method for a collection? (Assuming that the equals() method has been overridden correctly.)

A: 

For a simple class it is often easiest to implement hashCode() based on the class fields which are checked by the equals() implementation.

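public class Zam {
    private String foo;
    private String bar;
    private String somethingElse;

    // Constructor added so that instances can actually be created.
    public Zam(String foo, String bar, String somethingElse) {
        this.foo = foo;
        this.bar = bar;
        this.somethingElse = somethingElse;
    }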

    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }

        if (obj == null) {
            return false;
        }

        if (getClass() != obj.getClass()) {
            return false;
        }

        Zam otherObj = (Zam)obj;

        if ((getFoo() == null && otherObj.getFoo() == null) || (getFoo() != null && getFoo().equals(otherObj.getFoo()))) {
            if ((getBar() == null && otherObj.getBar() == null) || (getBar() != null && getBar().equals(otherObj.getBar()))) {
                return true;
            }
        }

        return false;
    }

    public int hashCode() {
        return (getFoo() + getBar()).hashCode();
    }

    public String getFoo() {
        return foo;
    }

    public String getBar() {
        return bar;
    }
}

The most important thing is to keep hashCode() and equals() consistent: if equals() returns true for two objects, then hashCode() should return the same value. If equals() returns false, then hashCode() should return different values.

Chris Carruthers
A: 

@about8: there is a pretty serious bug there.

Zam obj1 = new Zam("foo", "bar", "baz");
Zam obj2 = new Zam("fo", "obar", "baz");

Both objects get the same hash code, because "foo" + "bar" and "fo" + "obar" concatenate to the same string.

You probably want something like:

public int hashCode() {
    // Hash each field separately and add the results.
    // Assumes neither field is null.
    return getFoo().hashCode() + getBar().hashCode();
}

(The sum of two hash codes is already an int, so there is no need to convert it to a string and hash it again.)
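The collision described above can be reproduced with a small sketch. ZamDemo is a hypothetical stand-in for the Zam class, reduced to the two fields that equals() compares. The perFieldHash() method is one possible alternative (per-field hashes combined with a multiplier of 31), not the only reasonable fix.

```java
import java.util.Objects;

// Hypothetical stand-in for Zam, reduced to the two fields used by equals().
final class ZamDemo {
    private final String foo;
    private final String bar;

    ZamDemo(String foo, String bar) {
        this.foo = foo;
        this.bar = bar;
    }

    // The approach from the first answer: hash the concatenated fields.
    int concatenatedHash() {
        return (foo + bar).hashCode();
    }

    // One alternative: hash each field on its own, then combine with a multiplier.
    // Objects.hashCode(null) is 0, so null fields are tolerated.
    int perFieldHash() {
        return 31 * Objects.hashCode(foo) + Objects.hashCode(bar);
    }

    public static void main(String[] args) {
        ZamDemo a = new ZamDemo("foo", "bar");
        ZamDemo b = new ZamDemo("fo", "obar");
        // Both concatenate to "foobar", so the concatenated hashes collide.
        System.out.println(a.concatenatedHash() == b.concatenatedHash());
        // The per-field hashes differ for this pair.
        System.out.println(a.perFieldHash() == b.perFieldHash());
    }
}
```

Unequal objects can still share a per-field hash; the point is only that the field boundary now influences the result.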

SquareCog
I looked at the question twice but didn't find a bug there...
Huppie
The bug is in the long answer by about8.blogspot.com: taking the hash code of a concatenation of strings gives you a hash function that returns the same value for any combination of strings that concatenate to the same string.
SquareCog
So this is meta-discussion and not related to the question at all? ;-)
Huppie
It's a correction to a proposed answer that has a fairly significant flaw.
SquareCog
+4  A: 

First make sure that equals is implemented correctly. From an IBM DeveloperWorks article:

  • Symmetry: For two references, a and b, a.equals(b) if and only if b.equals(a)
  • Reflexivity: For all non-null references, a.equals(a)
  • Transitivity: If a.equals(b) and b.equals(c), then a.equals(c)

Then make sure that its relation to hashCode respects the contract (from the same article):

  • Consistency with hashCode(): Two equal objects must have the same hashCode() value

Finally, a good hash function should strive to approach the ideal hash function, which spreads unequal objects evenly over the range of possible hash values.

Cd-MaN
A: 

Just a quick note to complement the other, more detailed answers (in terms of code):

If I consider the question how-do-i-create-a-hash-table-in-java and especially the jGuru FAQ entry, I believe some other criteria upon which a hash code could be judged are:

  • synchronization (does the algorithm support concurrent access?)
  • fail-safe iteration (does the algorithm detect a collection that changes during iteration?)
  • null values (does the hash code support null values in the collection?)
VonC
A: 

As you specifically asked about collections, I'd like to add an aspect that the other answers haven't mentioned yet: a HashMap doesn't expect its keys to change their hash code once they have been added to the collection. That would defeat the whole purpose...
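A minimal sketch of what goes wrong, assuming a hypothetical MutableKey class whose hash code depends on a mutable field: once the field changes, the map looks for the key under its new hash code and no longer finds the entry that was stored under the old one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class whose hash code depends on a mutable field.
final class MutableKey {
    String name;

    MutableKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableKey && Objects.equals(name, ((MutableKey) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

final class MutableKeyDemo {
    // Reports whether the map can still find the key after it has been mutated.
    static boolean foundAfterMutation() {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey("before");
        map.put(key, "value");
        key.name = "after"; // the hash code changes while the key is in the map
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        // The entry is still in the map (size 1) but cannot be looked up.
        System.out.println(foundAfterMutation());
    }
}
```

Basing hashCode() only on fields that never change after construction avoids the problem.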

Olaf
A: 

If I understand your question correctly, you have a custom collection class (i.e. a new class that implements the Collection interface) and you want to implement the hashCode() method.

If your collection class extends AbstractList, then you don't have to worry about it: there are already implementations of equals() and hashCode() that work by iterating through all the elements and combining their hash codes.

public int hashCode() {
    int hashCode = 1;
    Iterator i = iterator();
    while (i.hasNext()) {
        Object obj = i.next();
        hashCode = 31 * hashCode + (obj == null ? 0 : obj.hashCode());
    }
    return hashCode;
}
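As a rough illustration of the inherited behaviour, the sketch below uses a hypothetical ArrayBackedList that defines only get() and size(). Its equals() and hashCode() come from AbstractList, so it agrees with any other List holding the same elements in the same order.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;

// Hypothetical read-only list; only get() and size() are implemented here.
final class ArrayBackedList extends AbstractList<String> {
    private final String[] items;

    ArrayBackedList(String... items) {
        this.items = items.clone();
    }

    @Override
    public String get(int index) {
        return items[index];
    }

    @Override
    public int size() {
        return items.length;
    }

    public static void main(String[] args) {
        List<String> custom = new ArrayBackedList("a", "b", null);
        List<String> standard = Arrays.asList("a", "b", null);
        // equals() and hashCode() are inherited from AbstractList.
        System.out.println(custom.equals(standard));
        System.out.println(custom.hashCode() == standard.hashCode());
    }
}
```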

Now if what you want is the best way to calculate the hash code for a specific class, I normally use the ^ (bitwise exclusive or) operator to process all fields that I use in the equals method:

public int hashCode(){
   return intMember ^ (stringField != null ? stringField.hashCode() : 0);
}
Mario Ortegón
A: 

Any hashing method that evenly distributes the hash values over the possible range is a good implementation. See Effective Java ( http://books.google.com.au/books?id=ZZOiqZQIbRMC&dq=effective+java&pg=PP1&ots=UZMZ2siN25&sig=kR0n73DHJOn-D77qGj0wOxAxiZw&hl=en&sa=X&oi=book_result&resnum=1&ct=result ); there is a good tip in there for implementing hashCode() (item 9, I think...).

Chii
+18  A: 

The best implementation? That is a hard question because it depends on the usage pattern.

A reasonably good implementation for nearly all cases was proposed in item 8 of Josh Bloch's "Effective Java". The best thing is to look it up there, because the author explains why the approach is good.

A short version:

1) Create an int result and assign it a non-zero value.

2) For every field tested in the equals-Method, calculate a hash code c by:

  • If the field f is a boolean: calculate (f ? 0 : 1)
  • If the field f is a byte, char, short or int: calculate (int)f
  • If the field f is a long: calculate (int)(f ^ (f >>> 32))
  • If the field f is a float: calculate Float.floatToIntBits(f)
  • If the field f is a double: calculate Double.doubleToLongBits(f) and handle the return value like every long value
  • If the field f is an object: use the result of its hashCode() method, or 0 if f is a null reference.
  • If the field f is an array: treat every element as a separate field, calculate its hash value in a recursive fashion, and combine the values as described next.

3) Combine the hash value c with result with:

result = 37 * result + c

4) Return result

This should result in a proper distribution of hash values for most use situations.
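The steps above can be sketched for a hypothetical class Sample that has one field of each kind. The starting value 17 is an assumption (the recipe only asks for a non-zero value), and the class and field names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical class with one field of each kind covered by the recipe.
final class Sample {
    private final boolean flag;
    private final int count;
    private final long id;
    private final float ratio;
    private final double weight;
    private final String label; // may be null
    private final int[] scores; // never null

    Sample(boolean flag, int count, long id, float ratio, double weight,
           String label, int[] scores) {
        this.flag = flag;
        this.count = count;
        this.id = id;
        this.ratio = ratio;
        this.weight = weight;
        this.label = label;
        this.scores = scores.clone();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Sample)) return false;
        Sample s = (Sample) o;
        return flag == s.flag && count == s.count && id == s.id
                && Float.floatToIntBits(ratio) == Float.floatToIntBits(s.ratio)
                && Double.doubleToLongBits(weight) == Double.doubleToLongBits(s.weight)
                && (label == null ? s.label == null : label.equals(s.label))
                && Arrays.equals(scores, s.scores);
    }

    @Override
    public int hashCode() {
        int result = 17;                                        // step 1 (17 is an assumed start value)
        result = 37 * result + (flag ? 0 : 1);                  // boolean
        result = 37 * result + count;                           // int
        result = 37 * result + (int) (id ^ (id >>> 32));        // long
        result = 37 * result + Float.floatToIntBits(ratio);     // float
        long w = Double.doubleToLongBits(weight);               // double, then treated as a long
        result = 37 * result + (int) (w ^ (w >>> 32));
        result = 37 * result + (label == null ? 0 : label.hashCode()); // object or null
        for (int score : scores) {                              // array: each element as a field
            result = 37 * result + score;
        }
        return result;                                          // step 4
    }

    public static void main(String[] args) {
        Sample a = new Sample(true, 3, 5_000_000_000L, 1.5f, 2.5, "x", new int[] {1, 2});
        Sample b = new Sample(true, 3, 5_000_000_000L, 1.5f, 2.5, "x", new int[] {1, 2});
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```

Only fields that take part in equals() are hashed, which is what keeps the two methods consistent.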

dmeister
Interesting answer. Is there a link to a mathematical proof of why this formula works?
runaros
Yeah I'm particularly curious about where the number 37 comes from.
Kip
I'm not aware of any proof. The number 37 is arbitrary, but it should be prime. Why? I'm not really sure, but it has to do with modular arithmetic and the properties of prime numbers, which lead to good distributions.
dmeister
@dmeister: Thank you for the detailed answer. Could you please tell me which reference or guidelines you used to put together this advice? Also, I wonder: why multiply by 37? What is the magic in 37? Why is multiplying better than shifting (`result = result << 16 + c`)?
dma_k
I used item 8 of Josh Bloch's "Effective Java" book.
dmeister
+1  A: 

I agree with dmeister's answer. For that matter, there's a good implementation of the hashCode() and equals() logic from "Effective Java" in Apache Commons Lang. Check out HashCodeBuilder and EqualsBuilder.

Rudi Adianto
+2  A: 

about8.blogspot.com, you said:

if equals() returns true for two objects, then hashCode() should return the same value. If equals() returns false, then hashCode() should return different values

I cannot agree with you. If two objects have the same hash code, it DOESN'T have to mean that they are equal.

If A equals B, then A.hashCode() must be equal to B.hashCode(),

but

if A.hashCode() equals B.hashCode(), it does not mean that A must equal B.
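A small illustration with two ordinary strings: "Aa" and "BB" are not equal, yet String.hashCode() gives both the value 2112 (65 * 31 + 97 and 66 * 31 + 66). The class name CollisionDemo and its helper are made up for this sketch.

```java
// Hypothetical demo class: unequal objects may legitimately share a hash code.
final class CollisionDemo {
    // True when the two objects have the same hash code but are not equal.
    static boolean collidesWithoutBeingEqual(Object a, Object b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println(collidesWithoutBeingEqual("Aa", "BB")); // true
    }
}
```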

Perfect, thanks for pointing that out.
Omnipotent
A: 

BTW, I prefer using the utility methods from the Objects class in the Google Collections library, which help me keep my code clean. Very often the equals and hashCode methods are generated from an IDE template, so they are not clean to read.
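The Google Collections class itself is not shown here. As a stand-in, the sketch below uses the JDK's own java.util.Objects (Java 7 and later), which offers helpers in the same spirit; NamedPoint is a hypothetical class used only for illustration.

```java
import java.util.Objects;

// Hypothetical value class using the JDK's java.util.Objects helpers
// (a swapped-in substitute for the Google Collections Objects class).
final class NamedPoint {
    private final String name; // may be null
    private final int x;

    NamedPoint(String name, int x) {
        this.name = name;
        this.x = x;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof NamedPoint)) return false;
        NamedPoint p = (NamedPoint) o;
        // Objects.equals is null-safe, so no explicit null checks are needed.
        return x == p.x && Objects.equals(name, p.name);
    }

    @Override
    public int hashCode() {
        // Objects.hash combines the given values and tolerates nulls.
        return Objects.hash(name, x);
    }

    public static void main(String[] args) {
        NamedPoint a = new NamedPoint("origin", 0);
        NamedPoint b = new NamedPoint("origin", 0);
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
    }
}
```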

+1  A: 

If you use Eclipse, you can generate equals() and hashCode() using Source -> "Generate hashCode() and equals()...". With this function you can choose which fields to use for the equality and hash code calculation, and Eclipse generates the corresponding methods.

jkl
+6  A: 

It is better to use the functionality provided by Eclipse, which does a pretty good job, so that you can put your effort and energy into developing the business logic.

Warrior