I defined a class:

class A:
    ''' hash test class
    >>> a = A(9, 1196833379, 1, 1773396906)
    >>> hash(a)
    -340004569

    This is weird, 12544897317L expected.
    '''
    def __init__(self, a, b, c, d):
        self.a = a
        self.b = b
        self.c = c
        self.d = d

    def __hash__(self):
        return self.a * self.b + self.c * self.d

Why does the hash() function return a negative integer in the doctest?

+8  A: 

The hash value appears to be limited to 32 bits. Judging by this question, your code might have produced the expected result on a 64-bit machine (with those particular values, since the result fits in 64 bits).
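A rough way to check this on your own machine, as a sketch only: it assumes Python 3 on CPython (where sys.hash_info is available), not the Python 2 build from the question. The variable names raw, width, fits and result are mine, not anything from the question.

```python
import sys

class A:
    """Same hash formula as the class in the question."""
    def __init__(self, a, b, c, d):
        self.a, self.b, self.c, self.d = a, b, c, d

    def __hash__(self):
        return self.a * self.b + self.c * self.d

raw = 9 * 1196833379 + 1 * 1773396906    # the value __hash__ returns
width = sys.hash_info.width              # bits in a hash value on this build
# Does the raw value fit in a signed integer of that width?
fits = -(1 << (width - 1)) <= raw < (1 << (width - 1))
result = hash(A(9, 1196833379, 1, 1773396906))
print(width, fits, result)
```

If fits is true, hash() should hand back the raw value unchanged; if not, the value gets reduced to the native width.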

The results of the built-in hash function are platform dependent and constrained to the native word size. If you need a deterministic, cross-platform hash, consider using the hashlib module.
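For illustration, here is a minimal sketch of what a hashlib-based version could look like. It assumes Python 3 (int.from_bytes does not exist in Python 2), and stable_hash is a hypothetical helper name, not something from the standard library. Picking SHA-256 and keeping the first 8 bytes is just one arbitrary choice.

```python
import hashlib

def stable_hash(a, b, c, d):
    """Hypothetical helper: a platform-independent 64-bit hash of four ints."""
    # repr() of a tuple of ints is the same text on every platform in Python 3.
    payload = repr((a, b, c, d)).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # Keep the first 8 bytes and read them as an unsigned big-endian integer.
    return int.from_bytes(digest[:8], "big")

h1 = stable_hash(9, 1196833379, 1, 1773396906)
h2 = stable_hash(9, 1196833379, 1, 1773396906)
print(h1 == h2)
```

Note that this value is always non-negative and will not match what the built-in hash() returns, so it is only useful when you need the same number everywhere.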

FogleBird
+3  A: 

Because the purpose of a hash function is to take a set of inputs and distribute them across a range of keys, there is no reason that those keys have to be positive integers.

The fact that Python's hash function returns negative integers is just an implementation detail, and the result is necessarily limited to long ints. For example, hash('abc') is negative on my system.

mikerobi
+5  A: 

See object.__hash__

Notice that

Changed in version 2.5: __hash__() may now also return a long integer object; the 32-bit integer is then derived from the hash of that object.

In your case, the expected value 12544897317L is a long integer object.

Python derived the 32-bit integer by computing hash(12544897317L), which results in -340004569. It did not simply truncate the value with (12544897317 & 0xFFFFFFFF) - (1<<32).

The algorithm is something like this:

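def s32(x):
    # Keep the low 32 bits and reinterpret them as a signed 32-bit integer.
    x = x & ((1<<32)-1)
    if x & (1<<31):
        return x - (1<<32)
    else:
        return x

def hash(x):
    # Sketch only: assumes x is a non-negative long.
    # Adds up the signed 32-bit chunks of x, lowest chunk first.
    h = 0
    while x:
        h += s32(x)
        x >>= 32
    return h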
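To check the numbers in this thread, here is the same idea as a self-contained snippet that runs on Python 3. The names to_signed32 and fold32 are mine (renamed so the built-in hash is not shadowed), and this is only the approximation above, not CPython's actual implementation.

```python
def to_signed32(x):
    """Keep the low 32 bits of x and read them as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & (1 << 31) else x

def fold32(x):
    """Add up the signed 32-bit chunks of a non-negative integer x."""
    h = 0
    while x:
        h += to_signed32(x)
        x >>= 32
    return h

value = 9 * 1196833379 + 1 * 1773396906   # 12544897317
truncated = to_signed32(value)            # plain truncation: -340004571
folded = fold32(value)                    # chunks summed:    -340004569
print(truncated, folded)
```

The two results differ by 2, which is exactly the high-order chunk (12544897317 >> 32) that plain truncation throws away.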
forgot
Nitpick: (12544897317 & 0xFFFFFFFF) - (1<<32) isn't -340004569; it's -340004571. Python actually gets to the 32-bit number by *re-hashing*, i.e., computing hash(12544897317). This is better because it doesn't just throw away the high-order bits of the original hash value, but mixes them into the final hash value instead.
Mark Dickinson
@Mark Dickinson, uh-huh, thank you
forgot