Typically the default implementation of Object.hashCode() is some function of the allocated address of the object in memory (though this is not mandated by the JLS). Given that the VM shunts objects about in memory, why does the value returned by System.identityHashCode() never change during the object's lifetime?

If it is a "one-shot" calculation (the object's hashCode is calculated once and stashed in the object header or something), then does that mean it is possible for two objects to have the same identityHashCode (if they happen to be first allocated at the same address in memory)?

A: 

As far as I know, this is implemented to return the reference, which never changes during an object's lifetime.

Mnementh
So you are saying that the reference is not a real memory address (or directly derived from that). So is it a sort of a pointer to the real memory address?
Thilo
+6  A: 

In answer to the second question, irrespective of the implementation, it is possible for multiple objects to have the same identityHashCode.

See bug 6321873 for a brief discussion on the wording in the javadoc, and a program to demonstrate non-uniqueness.
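This is not the program from the bug report, just a rough sketch of the same idea: keep allocating objects, remember each identity hash code, and stop at the first repeat. The class name `IdentityHashCollision`, the `Pair` holder and the 500,000 limit are made up for illustration, and how quickly a collision turns up (if at all) depends on the JVM.

```java
import java.util.HashMap;
import java.util.Map;

final class IdentityHashCollision {

    /** Holds two distinct objects that share an identity hash code. */
    static final class Pair {
        final Object first;
        final Object second;

        Pair(Object first, Object second) {
            this.first = first;
            this.second = second;
        }
    }

    /**
     * Allocates up to maxObjects objects, keeping all of them reachable,
     * and returns the first pair with the same identity hash code,
     * or null if no collision shows up within the limit.
     */
    static Pair findCollision(int maxObjects) {
        Map<Integer, Object> seen = new HashMap<>();
        for (int i = 0; i < maxObjects; i++) {
            Object candidate = new Object();
            // put() returns the object previously stored under this hash, if any.
            Object previous = seen.put(System.identityHashCode(candidate), candidate);
            if (previous != null) {
                return new Pair(previous, candidate);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Pair pair = findCollision(500_000);
        if (pair == null) {
            System.out.println("No collision found within the limit");
        } else {
            System.out.println("Distinct objects: " + (pair.first != pair.second));
            System.out.println("Shared identity hash: " + System.identityHashCode(pair.first));
        }
    }
}
```

Both objects are still live when the collision is reported, so this does not rely on an address being reused after garbage collection.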

Stephen Denne
True. Two different objects can have the same hashCode. That is the case with all hash functions (over a domain bigger than their result size).
Thilo
It's a very good bug report that. :)
Tom Hawtin - tackline
+6  A: 

Modern JVMs save the value in the object header. I believe the value is typically calculated only on first use in order to keep the cost of object allocation to a minimum (sometimes down to as low as a dozen cycles). The common Sun JVM can be compiled so that the identity hash code is always 1 for all objects.

Multiple objects can have the same identity hash code. That is the nature of hash codes.
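A quick way to see the "never changes" part from the outside is to read the identity hash once, generate some garbage, request a collection and read it again. This is only a sketch: the class and method names are hypothetical, `System.gc()` is a request rather than a guarantee, and the check says nothing about where the JVM actually stores the value.

```java
final class IdentityHashStability {

    /**
     * Returns true if the identity hash code of a fresh object is unchanged
     * after each of the given number of garbage collection requests.
     */
    static boolean isStableAcrossGc(int rounds) {
        Object subject = new Object();
        int first = System.identityHashCode(subject);
        for (int i = 0; i < rounds; i++) {
            // Create short-lived garbage so the collector has work to do.
            byte[][] garbage = new byte[64][];
            for (int j = 0; j < garbage.length; j++) {
                garbage[j] = new byte[16 * 1024];
            }
            // Request (not force) a collection; the object may or may not move.
            System.gc();
            if (System.identityHashCode(subject) != first) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("Stable across GC requests: " + isStableAcrossGc(5));
    }
}
```

On a conforming JVM this should report `true`, since the hashCode contract requires the value to stay the same for the lifetime of the object within one execution.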

Tom Hawtin - tackline
Right - I've just looked through ObjectSynchronizer::FastHashCode in synchronizer.cpp (VM runtime source code) and after generating the hashcode, it looks like it merges it into the object header. It looks like there are several possible implementations of HashCode; the one you allude to that returns 1 for all objects is used to ensure that no part of the VM assumes hashcodes are unique for any reason.
butterchicken
A: 

The general guidelines for implementing a hashing function are:

  • the same object should return a consistent hashCode: it should not change over time or depend on any variable information (e.g. an algorithm seeded by a random number, or the values of mutable member fields)
  • the hash function should have a good random distribution: if you think of hashcodes as buckets, two different objects should map to different buckets (hashcodes) as far as possible. Two objects having the same hashcode should be rare, although it can happen.
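As a small illustration of the two points above, here is one way a user-defined class might follow them. `GridPoint` is a made-up example class, and `Objects.hash` is just one convenient way of combining fields, not the only reasonable choice.

```java
import java.util.Objects;

final class GridPoint {
    // Final fields: the hash code cannot drift after construction.
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof GridPoint)) {
            return false;
        }
        GridPoint that = (GridPoint) other;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        // Depends only on the immutable fields used by equals(),
        // with no random seed and no mutable state.
        return Objects.hash(x, y);
    }

    public static void main(String[] args) {
        GridPoint a = new GridPoint(1, 2);
        GridPoint b = new GridPoint(1, 2);
        System.out.println("Equal objects, equal hashes: " + (a.hashCode() == b.hashCode()));
        System.out.println("Repeatable: " + (a.hashCode() == a.hashCode()));
    }
}
```

Because the fields are final, the value cannot change between calls, which is what keeps instances usable as keys in a HashMap or HashSet.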
Gishu