This is odd. A co-worker asked about the implementation of myArray.hashCode() in Java. I thought I knew, but then I ran a few tests. Check the code below. The odd thing I noticed is that when I wrote the first System.out.println, the results were different. It's almost as if it's reporting a memory address, and modifying the class moved the address or something. Just thought I would share.

int[] foo = new int[100000];
java.util.Random rand = new java.util.Random();

for(int a = 0; a < foo.length; a++) foo[a] = rand.nextInt();

int[] bar = new int[100000];
int[] baz = new int[100000];
int[] bax = new int[100000];
for(int a = 0; a < foo.length; a++) bar[a] = baz[a] = bax[a] = foo[a];

System.out.println(foo.hashCode() + " ----- " + bar.hashCode() + " ----- " + baz.hashCode() +  " ----- " + bax.hashCode());

// returns 4097744 ----- 328041 ----- 2083945 ----- 2438296
// Consistently unless you modify the class.  Very weird
// Before adding the comments below it returned this:
// 4177328 ----- 4097744 ----- 328041 ----- 2083945


System.out.println("Equal ?? " +
  (java.util.Arrays.equals(foo, bar) && java.util.Arrays.equals(bar, baz) &&
  java.util.Arrays.equals(baz, bax) && java.util.Arrays.equals(foo, bax)));
+11  A: 

An array's hashCode() is inherited from Object, so yes, it depends on the reference rather than on the contents. To get a hash code based on the contents of the array, use Arrays.hashCode. Beware, though, that it is a shallow implementation; a deep implementation, Arrays.deepHashCode, is also available.
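
As a minimal sketch of the difference (the class name and the sample() and sameContentHash() helpers are made up for illustration), two distinct arrays with the same elements usually report different hashCode() values but always the same Arrays.hashCode value:

```java
import java.util.Arrays;

class ContentHashDemo {
    // Hypothetical helper: returns a fresh array with the same contents each time.
    static int[] sample() {
        return new int[] {3, 1, 4, 1, 5};
    }

    // Hypothetical helper: compares the content-based (shallow) hashes of two arrays.
    static boolean sameContentHash(int[] a, int[] b) {
        return Arrays.hashCode(a) == Arrays.hashCode(b);
    }

    public static void main(String[] args) {
        int[] foo = sample();
        int[] bar = sample();

        // Inherited from Object: identity-based, so the two values usually differ.
        System.out.println(foo.hashCode() + " ----- " + bar.hashCode());

        // Computed from the elements, so equal contents give equal hashes.
        System.out.println(sameContentHash(foo, bar)); // true
    }
}
```

The exact numbers on the first line vary from run to run and from JVM to JVM; only the second line is guaranteed.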

MahdeTo
+2  A: 

Arrays use the default hash code, which is based on memory location (though it isn't necessarily the memory location itself, since the hash code is only an int and not every memory address will fit). You can see this by also printing the result of System.identityHashCode(foo).

Under equals(), arrays are only equal if they are the same, identical array. So array hash codes will generally only be equal if they come from the same, identical array.
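
A small sketch of both points, with a hypothetical class name and a made-up usesIdentityHash() helper:

```java
import java.util.Arrays;

class IdentityDemo {
    // Hypothetical helper: true when hashCode() is the identity hash for this object.
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        int[] foo = {1, 2, 3};
        int[] bar = {1, 2, 3};
        int[] alias = foo; // second reference to the same array

        // Arrays do not override hashCode(), so it matches identityHashCode.
        System.out.println(usesIdentityHash(foo));   // true

        // equals() on an array is reference comparison.
        System.out.println(foo.equals(bar));         // false
        System.out.println(foo.equals(alias));       // true

        // Arrays.equals compares element by element.
        System.out.println(Arrays.equals(foo, bar)); // true
    }
}
```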

erickson
(and objects are moved in memory, and if you look at the hash codes they typically don't look like addresses)
Tom Hawtin - tackline
+1  A: 

The default implementation of Object.hashCode() is typically derived from the pointer value of the object, although this is implementation dependent. For instance, a 64-bit JVM may take the pointer and XOR the high- and low-order words together. Subclasses are encouraged to override this behavior if it makes sense.

However, it does not make sense to base equality comparisons on the contents of mutable arrays: if an element changes, two previously equal arrays are no longer equal. To maintain the invariant that the same array always returns the same hash code no matter what happens to its elements, arrays do not override the default hashCode() behavior.
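
The invariant can be checked with a short sketch (the class and method names here are illustrative only): the inherited hash code survives a change to an element, while a content-based hash does not.

```java
import java.util.Arrays;

class MutationDemo {
    // Hypothetical helper: does the inherited hashCode() survive an element change?
    static boolean identityHashStable(int[] data, int newFirst) {
        int before = data.hashCode();
        data[0] = newFirst;
        return before == data.hashCode();
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        int contentBefore = Arrays.hashCode(data);

        // Mutates data[0]; the identity-based hash code is unaffected.
        System.out.println(identityHashStable(data, 42));            // true

        // The content-based hash reflects the new element.
        System.out.println(contentBefore == Arrays.hashCode(data));  // false
    }
}
```

This is also why an array used as a key in a HashMap is looked up by identity, not by contents.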

Note that java.util.Arrays provides a deepHashCode() implementation for when hashing based on the contents of the array, rather than the identity of the array itself, is important.
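
A sketch of where the deep variant matters, using a hypothetical class name and sample data. Note that Arrays.deepHashCode takes an Object[], so it applies to nested or object arrays; for a flat int[] like the one in the question, Arrays.hashCode already hashes the contents.

```java
import java.util.Arrays;

class DeepHashDemo {
    // Hypothetical helper: builds a fresh 2x2 nested array with fixed contents.
    static int[][] grid() {
        return new int[][] {{1, 2}, {3, 4}};
    }

    public static void main(String[] args) {
        int[][] a = grid();
        int[][] b = grid();

        // Shallow: each inner int[] contributes its identity hash,
        // so this is usually false even though the contents match.
        System.out.println(Arrays.hashCode(a) == Arrays.hashCode(b));

        // Deep: recurses into the inner arrays and hashes their elements.
        System.out.println(Arrays.deepHashCode(a) == Arrays.deepHashCode(b)); // true
        System.out.println(Arrays.deepEquals(a, b));                          // true
    }
}
```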

James
Modern VMs move objects around in memory. A current address may be used as a seed, but the result needs to be stored.
Tom Hawtin - tackline
Moving around in memory still does not cause the hashCode to be changed.
Icarus
A: 

I agree with using java.util.Arrays.hashCode (or Google Guava's generic wrapper, Objects.hashCode), but be aware that this can cause issues if you are using Terracotta - see this link

Carl Pritchett