What issues / pitfalls do I need to consider when overriding equals and hashCode in a Java class?
If you are using Eclipse, it has a very handy built-in hashCode / equals generator.
With the class open in the editor, do: right click > Source > Generate hashCode() and equals()...
Then, a window will show up and you can choose the fields to include in your methods.
The theory (for the language lawyers and the mathematically inclined):
equals() (javadoc) must define an equivalence relation (it must be reflexive, symmetric, and transitive). In addition, it must be consistent (if the objects are not modified, then it must keep returning the same value). Furthermore, o.equals(null) must always return false.
hashCode() (javadoc) must also be consistent (if the object is not modified in terms of equals(), it must keep returning the same value).
The relation between the two methods is:
Whenever a.equals(b), then a.hashCode() must be the same as b.hashCode().
In practice:
If you override one, then you should override the other.
Use the same set of fields that you use to compute equals() to compute hashCode().
Use the excellent helper classes EqualsBuilder and HashCodeBuilder from the Apache Commons Lang library. An example:
import org.apache.commons.lang3.builder.EqualsBuilder;
import org.apache.commons.lang3.builder.HashCodeBuilder;
// (Commons Lang 2.x uses the package org.apache.commons.lang.builder instead)

public class Person {
    private String name;
    private int age;
    // ...

    @Override
    public int hashCode() {
        return new HashCodeBuilder(17, 31). // two randomly chosen prime numbers
            // if deriving: appendSuper(super.hashCode()).
            append(name).
            append(age).
            toHashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == null)
            return false;
        if (obj == this)
            return true;
        if (obj.getClass() != getClass())
            return false;

        Person rhs = (Person) obj;
        return new EqualsBuilder().
            // if deriving: appendSuper(super.equals(obj)).
            append(name, rhs.name).
            append(age, rhs.age).
            isEquals();
    }
}
Also remember:
When using a hash-based Collection or Map such as HashSet, LinkedHashSet, HashMap, Hashtable, or WeakHashMap, make sure that the hashCode() of the key objects that you put into the collection never changes while the object is in the collection. The bulletproof way to ensure this is to make your keys immutable, which also has other benefits.
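A minimal sketch of what goes wrong otherwise. MutableKey and MutableKeyDemo are hypothetical names made up for this illustration; the point is only that hashCode() depends on a field that changes while the object sits in a HashSet:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical key class for illustration: hashCode() depends on a mutable field.
final class MutableKey {
    private String name;

    MutableKey(String name) {
        this.name = name;
    }

    void setName(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof MutableKey)) return false;
        return Objects.equals(name, ((MutableKey) obj).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class MutableKeyDemo {
    // Returns whether the set still finds the key after the key was mutated in place.
    static boolean foundAfterMutation() {
        Set<MutableKey> set = new HashSet<>();
        MutableKey key = new MutableKey("alice");
        set.add(key);              // stored under the hash of "alice"
        key.setName("bob");        // hash code changes while the key is in the set
        return set.contains(key);  // lookup now uses the hash of "bob"
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // prints "false"
    }
}
```

The object is still physically in the set (iteration would reach it), but hash-based lookups can no longer find it.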
@Konrad Rudolph: this is wrong. hashCode() may very well return a different value if any information used in equals() comparisons has changed. You are right in saying that it's best to use immutable objects as Map keys. The behaviour of all Maps (not just HashMap) is not specified if you change an object that is used as a key in a way that affects the behaviour of its equals() method (see the Javadoc for Map).
A clarification about the "obj.getClass() != getClass()".
This statement is the result of equals() being inheritance unfriendly. The contract of Object.equals() (see its Javadoc) specifies that if A.equals(B) == true then B.equals(A) must also return true. If you omit that statement, inheriting classes that override equals() (and change its behavior) will break this specification.
Consider the following example of what happens when the statement is omitted:
class A {
    int field1;

    A(int field1) {
        this.field1 = field1;
    }

    public boolean equals(Object other) {
        return (other != null && other instanceof A && ((A) other).field1 == field1);
    }
}

class B extends A {
    int field2;

    B(int field1, int field2) {
        super(field1);
        this.field2 = field2;
    }

    public boolean equals(Object other) {
        return (other != null && other instanceof B && ((B) other).field2 == field2 && super.equals(other));
    }
}
Both new A(1).equals(new A(1)) and new B(1,1).equals(new B(1,1)) return true, as they should.
This looks all very good, but look what happens if we try to use both classes:
A a = new A(1);
B b = new B(1,1);
a.equals(b); // true
b.equals(a); // false
Obviously, this is wrong.
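A sketch of the same two classes with the getClass() comparison in place. The hashCode() methods and the SymmetryDemo class are additions for this illustration, so that the pair stays consistent and the result can be printed. With this version both directions agree:

```java
import java.util.Objects;

class A {
    final int field1;

    A(int field1) {
        this.field1 = field1;
    }

    @Override
    public boolean equals(Object other) {
        if (other == this) return true;
        // getClass() comparison: an A is never equal to an instance of a subclass
        if (other == null || other.getClass() != getClass()) return false;
        return ((A) other).field1 == field1;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(field1);
    }
}

class B extends A {
    final int field2;

    B(int field1, int field2) {
        super(field1);
        this.field2 = field2;
    }

    @Override
    public boolean equals(Object other) {
        // super.equals() already rejects null and objects of a different class,
        // so the cast to B is safe once it has returned true
        return super.equals(other) && ((B) other).field2 == field2;
    }

    @Override
    public int hashCode() {
        return Objects.hash(field1, field2);
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        A a = new A(1);
        B b = new B(1, 1);
        System.out.println(a.equals(b));                     // false
        System.out.println(b.equals(a));                     // false
        System.out.println(new B(1, 1).equals(new B(1, 1))); // true
    }
}
```

The price is that an A and a B are never equal, even when B adds nothing that matters for equality.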
There is an Apache Commons package that provides an EqualsBuilder and HashCodeBuilder, which use the methods described in Bloch's Effective Java, so that instead of re-coding the algorithm yourself for each Object you can use EqualsBuilder and HashCodeBuilder as helpers.
http://commons.apache.org/lang/userguide.html
http://commons.apache.org/lang/apidocs/org/apache/commons/lang/builder/EqualsBuilder.html
http://commons.apache.org/lang/apidocs/org/apache/commons/lang/builder/HashCodeBuilder.html
The Javadocs contain some examples on how to use each.
I think the project has been around for quite a while, but I'm not sure how widely used it is or whether its functionality is covered in more recent versions of the JDK.
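For what it's worth, java.util.Objects (in the JDK since Java 7) provides equals() and hash() helpers that cover the common case. A minimal sketch of a Person class written with them, assuming the same name and age fields as the Commons Lang example (this is an illustration, not code from the Commons documentation):

```java
import java.util.Objects;

class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (obj == null || obj.getClass() != getClass()) return false;
        Person rhs = (Person) obj;
        // Objects.equals() is null-safe, so a null name does not throw
        return age == rhs.age && Objects.equals(name, rhs.name);
    }

    @Override
    public int hashCode() {
        // Same fields as equals(), as the contract requires
        return Objects.hash(name, age);
    }
}
```

Objects.hash() allocates a varargs array on every call, so for very hot code paths a hand-written or cached hash may still be preferable.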
One gotcha I have found is where two objects contain references to each other (one example being a parent/child relationship with a convenience method on the parent to get all children).
These sorts of things are fairly common when doing Hibernate mappings for example.
If you include both ends of the relationship in your hashCode or equals tests, it's possible to get into a recursive loop which ends in a StackOverflowError.
The simplest solution is to not include the getChildren collection in the methods.
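A rough sketch of that approach. Parent, Child and ParentChildDemo are hypothetical classes invented for this illustration, not taken from any real mapping. The child refers to its parent in equals() and hashCode(), while the parent leaves its children out, so the cycle is never followed:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical classes for illustration only.
class Parent {
    private final String name;
    private final List<Child> children = new ArrayList<>();

    Parent(String name) {
        this.name = name;
    }

    void addChild(Child child) {
        children.add(child);
    }

    List<Child> getChildren() {
        return children;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Parent)) return false;
        // children deliberately left out: Child.equals() refers back to the parent
        return Objects.equals(name, ((Parent) obj).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class Child {
    private final String name;
    private final Parent parent;

    Child(String name, Parent parent) {
        this.name = name;
        this.parent = parent;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (!(obj instanceof Child)) return false;
        Child other = (Child) obj;
        // Safe to include the parent: Parent.equals() never looks at its children
        return Objects.equals(name, other.name) && Objects.equals(parent, other.parent);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, parent);
    }
}

class ParentChildDemo {
    public static void main(String[] args) {
        Parent p1 = new Parent("p");
        Child c1 = new Child("c", p1);
        p1.addChild(c1);

        Parent p2 = new Parent("p");
        Child c2 = new Child("c", p2);
        p2.addChild(c2);

        System.out.println(c1.equals(c2)); // true, and no StackOverflowError
    }
}
```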
Here's the Guerilla's guide to equals() and hashCode():
- Avoid using them. The design, which favours inheritance over composition, is broken. By extension, so is the collections framework in the standard library. Don't use that either.
- Make your objects immutable and seal your classes (everything either abstract or final).
Be careful out there.
Chapter 3 of Effective Java discusses this in depth. You can read it here (PDF!)
For an inheritance-friendly implementation, check out Tal Cohen's solution: http://www.ddj.com/java/184405053
Summary:
In his book Effective Java Programming Language Guide (Addison-Wesley, 2001), Joshua Bloch claims that "There is simply no way to extend an instantiable class and add an aspect while preserving the equals contract." Tal disagrees.
His solution is to implement equals() by calling another nonsymmetric blindlyEquals() both ways. blindlyEquals() is overridden by subclasses, equals() is inherited, and never overridden.
Example:
class Point {
    private int x;
    private int y;

    protected boolean blindlyEquals(Object o) {
        if (!(o instanceof Point))
            return false;
        Point p = (Point) o;
        return (p.x == this.x && p.y == this.y);
    }

    public boolean equals(Object o) {
        // blindlyEquals() has already verified that o is a Point, so the cast is safe
        return (this.blindlyEquals(o) && ((Point) o).blindlyEquals(this));
    }
}

class ColorPoint extends Point {
    // assumes Color is a type that is safe to compare with == (an enum, for example)
    private Color color;

    protected boolean blindlyEquals(Object o) {
        if (!(o instanceof ColorPoint))
            return false;
        ColorPoint cp = (ColorPoint) o;
        return (super.blindlyEquals(cp) &&
                cp.color == this.color);
    }
}
Note that equals() must work across inheritance hierarchies if the Liskov Substitution Principle is to be satisfied.
The first question you should ask is: do you really need to? java.lang.Object has implementations of these methods (they compare by identity) that are sufficient for use as hashtable keys.
Make sure you produce a reasonably pseudo-random distribution of hashCodes; otherwise you may end up with a lot of hash table entries in the same bucket and your performance will suffer. One simple technique I have sometimes used is to create a String representation of the object and return the hashCode of that.
There are some issues worth noticing if you're dealing with classes that are persisted using an Object-Relational Mapper (ORM) like Hibernate, if you didn't think this was unreasonably complicated already!
Lazy loaded objects are subclasses
If your objects are persisted using an ORM, in many cases you will be dealing with dynamic proxies to avoid loading objects too early from the data store. These proxies are implemented as subclasses of your own class. This means that this.getClass() == o.getClass() will return false. For example:
Person saved = new Person("John Doe");
Long key = dao.save(saved);
dao.flush();
Person retrieved = dao.retrieve(key);
saved.getClass().equals(retrieved.getClass()); // Will return false if Person is lazily loaded
If you're dealing with an ORM, using o instanceof Person is the only thing that will behave correctly.
Lazy loaded objects have null-fields
ORMs usually use the getters to force loading of lazy loaded objects. This means that person.name will be null if person is lazy loaded, even if person.getName() forces loading and returns "John Doe". In my experience, this crops up more often in hashCode and equals.
If you're dealing with an ORM, make sure to always use getters, and never field references, in hashCode and equals.
Saving an object will change its state
Persistent objects often use an id field to hold the key of the object. This field will be automatically updated when an object is first saved. Don't use an id field in hashCode. But you can use it in equals.
A pattern I often use is
if (this.getId() == null) {
    return this == other;
} else {
    // the id is an object (it can be null), so compare with equals(), not ==
    return this.getId().equals(other.getId());
}
But: You cannot include getId() in hashCode(). If you do, when an object is persisted, its hashCode changes. If the object is in a HashSet, you'll "never" find it again.
In my Person example, I probably would use getName() for hashCode and getId plus getName() (just for paranoia) for equals. It's okay if there is some risk of "collisions" for hashCode, but never okay for equals.
hashCode should use the non-changing subset of properties from equals.
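Putting these ORM rules together, a rough sketch might look like the following. PersonEntity, EntityDemo and setId() are hypothetical stand-ins (setId() simulates what the ORM does on first save); no real ORM is involved, and treating two unsaved objects as equal only when they are the same instance is one possible reading of the pattern above:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity for illustration; not final, so an ORM could subclass it with a proxy.
class PersonEntity {
    private Long id;       // null until the object is first saved
    private String name;

    PersonEntity(String name) {
        this.name = name;
    }

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // instanceof rather than getClass(), so proxy subclasses still match
        if (!(o instanceof PersonEntity)) return false;
        PersonEntity other = (PersonEntity) o;
        // Unsaved objects are only equal to themselves (handled by the identity check above)
        if (getId() == null || other.getId() == null) return false;
        // Getters, not fields, so a lazy proxy gets the chance to load its state
        return getId().equals(other.getId()) && Objects.equals(getName(), other.getName());
    }

    @Override
    public int hashCode() {
        // The id is left out on purpose: it changes when the object is saved
        return Objects.hashCode(getName());
    }
}

class EntityDemo {
    // Returns whether the set still finds the entity after its id has been assigned.
    static boolean foundAfterSave() {
        Set<PersonEntity> set = new HashSet<>();
        PersonEntity person = new PersonEntity("John Doe");
        set.add(person);
        person.setId(42L);           // simulates the ORM assigning the key on save
        return set.contains(person); // hash code unchanged, so the lookup succeeds
    }

    public static void main(String[] args) {
        System.out.println(foundAfterSave()); // prints "true"
    }
}
```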
For equals, look into Secrets of Equals by Angelika Langer. I love it very much. She also has a great FAQ about Generics in Java. View her other articles here (scroll down to "Core Java"), where she also continues with Part 2 and "mixed type comparison". Have fun reading them!
Another useful link that I went through: Use of hashcode and equals
I have an abstract test case which I can use to test an object's equals/hashCode methods. This test will attempt to verify that your equals method is reflexive, symmetric, and transitive, and that your hashCode is consistent. To build a unit test, simply extend the EqualityTestCase class. For example, to test a Point class:
public class PointTest extends EqualityTestCase<Point> {
    public Point getA() { return new Point(0,0); }
    public Point getB() { return new Point(1,1); }
}
Additionally, the class will test other common contracts such as Serializable, Comparable and clone.