views:

107

answers:

2

I've written a class whose .__hash__() implementation takes a long time to execute. I've been thinking of caching its hash and storing it in an attribute like ._hash, so that .__hash__() would simply return ._hash. (It would be computed either at the end of .__init__() or the first time .__hash__() is called.)

My reasoning was: "This object is immutable -> Its hash will never change -> I can cache the hash."

But now that got me thinking: You can say the same thing about any hashable object. (With the exception of objects whose hash is their id.)

So is there ever a reason not to cache an object's hash, except for small objects whose hash computation is very fast?
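For concreteness, here is a minimal sketch of the lazy-caching approach described in the question (the class name `Expensive` and the `_data` attribute are hypothetical; assume the real hash computation is costly):

```python
class Expensive:
    """Immutable object that caches its hash on first use."""

    def __init__(self, data):
        self._data = tuple(data)  # immutable snapshot of the state
        self._hash = None         # sentinel: hash not computed yet

    def __hash__(self):
        if self._hash is None:
            # The expensive computation runs only once.
            self._hash = hash(self._data)
        return self._hash

    def __eq__(self, other):
        # Objects that compare equal must have equal hashes,
        # so __eq__ must be based on the same state as __hash__.
        return isinstance(other, Expensive) and self._data == other._data
```

Every call to `hash()` after the first just returns the stored value.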

+1  A: 

The usual reason is that most objects in Python are mutable, so if the hash depends on their properties, it changes as soon as you change a property. If your class really is immutable (and all the properties that go into the hash are immutable, too!), then you can cache the hash.

Aaron Digulla
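A quick sketch of the problem the answer describes, with a hypothetical mutable `Point` class whose hash is derived from its attributes:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __hash__(self):
        # Hash derived from mutable attributes: it moves when they change,
        # which would silently break dict/set lookups of this object.
        return hash((self.x, self.y))

p = Point(1, 2)
before = hash(p)
p.x = 99  # mutate a property that feeds the hash
after = hash(p)
assert before == hash((1, 2))
assert after == hash((99, 2))
```

Caching the hash in such a class would only make things worse: the stored value would go stale the moment a property changed.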
Of course, if an object is mutable, it's usually a bad idea to implement `__hash__` at all. The only builtin uses of `__hash__` require that the hash be stable.
Thomas Wouters
No, the default implementation of `__hash__` doesn't return anything different when you change a property on an object. Because by default this is true for any object: `hash(obj) == id(obj) == hash(id(obj))` -- this means objects just take their id as their hash. The id is static, so you could say objects "cache" their hash by default.
THC4k
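The stability of the default, identity-based hash is easy to check (the class name `Plain` is just for illustration):

```python
class Plain:
    pass  # no __hash__ defined: inherits object's identity-based hash

p = Plain()
h1 = hash(p)
p.attr = "changed"   # mutate the instance
assert hash(p) == h1  # default hash is unaffected by attribute changes
```

Note that the exact relationship between `hash(obj)` and `id(obj)` is a CPython implementation detail and has varied between versions; what is reliable is that the default hash is tied to object identity, not to attribute values.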
+2  A: 

Sure, it's fine to cache the hash value. In fact, Python itself does so for strings. The trade-off is between the speed of the hash calculation and the space it takes to store the cached value. That trade-off is, for example, why tuples don't cache their hash value but strings do.

Thomas Wouters