ansaurus

Question

Strange Python set and hash behaviour - how does this work?

Answer 1

+8 A:

You have a hash collision. When two objects have the same hash, the set uses the == operator to check whether they are truly equal.
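A minimal sketch of that behaviour, assuming a made-up class Key (not from the question) whose __hash__ deliberately returns the same value for every instance, so every lookup is a collision and the set has to fall back on ==:

```python
class Key:
    """Hypothetical class whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        # Deliberately constant: every Key collides with every other Key.
        return 1

    def __eq__(self, other):
        # The set calls this to settle collisions.
        return isinstance(other, Key) and self.name == other.name


# "a" is added twice; "b" collides with "a" but is not equal to it.
keys = {Key("a"), Key("b"), Key("a")}
names = sorted(k.name for k in keys)
print(names)
```

The duplicate "a" is dropped because == says the two objects are equal, while "b" survives even though its hash is identical.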

Noctis Skytower 2010-01-29 01:06:43

Ah, that makes sense. Would it be better practice to simply always return 0 as the hash to force set to use the == operator? I can't see it introducing much more overhead than the hash function itself would otherwise.

ezod 2010-01-29 01:10:56

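Ensuring that every single object has a hash collision is guaranteed to kill your performance. The hash is what the set uses to file objects internally, so if every hash is identical, each insertion or lookup degrades into a linear scan that compares with ==.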
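One rough way to see the cost is to count how often == gets called rather than to time anything. The classes and the helper count_eq_calls below are invented for this illustration; the sketch only assumes that a set has to compare a new object against every stored object with the same hash:

```python
class Item:
    """Hypothetical base class that counts every == comparison."""

    eq_calls = 0  # shared counter, reset before each measurement

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Item.eq_calls += 1
        return isinstance(other, Item) and self.value == other.value


class ConstantHashItem(Item):
    def __hash__(self):
        return 0  # the "always return 0" idea from the comment above


class ValueHashItem(Item):
    def __hash__(self):
        return hash(self.value)  # hash derived from what __eq__ compares


def count_eq_calls(cls, n):
    """Build a set of n distinct objects and report how many == calls it took."""
    Item.eq_calls = 0
    set(cls(i) for i in range(n))
    return Item.eq_calls


n = 200
constant_calls = count_eq_calls(ConstantHashItem, n)
value_calls = count_eq_calls(ValueHashItem, n)
print(constant_calls, value_calls)
```

With the constant hash, the number of comparisons grows at least quadratically with n (each new object is checked against all earlier ones), while the value-based hash needs far fewer.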

jleedev 2010-01-29 01:14:31

@jleedev: Noted, thanks. Using a somewhat better hash function now.

ezod 2010-01-29 01:15:54

Answer 2

+3 A:

It's important to understand how hash and == work together, because both are used by sets. For two values x and y, the important rule is that:

x == y ==> hash(x) == hash(y)

(x equals y implies that x and y's hashes are equal.) But the inverse is not true: two unequal values can have the same hash.

Sets (and dicts) will use the hash to get an approximation to equality, but will use the real equality operation to figure out if two values are the same or not.
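Both directions can be checked with built-in numbers. Note the assumption in the second half: hash(-1) == hash(-2) is a CPython implementation detail and other interpreters may behave differently.

```python
# Equal values must hash equally: 1 and 1.0 compare equal, so their hashes match.
equal_values_share_hash = (1 == 1.0) and (hash(1) == hash(1.0))

# The reverse does not hold. Assumption: CPython, where -1 and -2
# happen to share a hash even though they are not equal.
unequal_values_collide = (-1 != -2) and (hash(-1) == hash(-2))

merged = {1, 1.0}       # equal values collapse into a single element
kept_apart = {-1, -2}   # same hash, but == says they differ, so both stay

print(equal_values_share_hash, unequal_values_collide)
print(len(merged), len(kept_apart))
```

The set treats the hash as a quick first filter and leaves the final decision to ==.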

Ned Batchelder 2010-01-29 01:12:05

Answer 3

+2 A:

You should always define both __eq__() and __hash__() if you need at least one of them. If the hashes of two objects are equal, an extra __eq__() check is done to verify uniqueness.
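A minimal sketch of defining the two together, using a made-up Point class (not from the question). The hash is built from exactly the fields that __eq__ compares, which keeps the two methods consistent. In Python 3, a class that defines __eq__ without __hash__ becomes unhashable, which is one more reason to write both.

```python
class Point:
    """Hypothetical value class with matching __eq__ and __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash the same tuple that __eq__ compares.
        return hash((self.x, self.y))


points = {Point(1, 2), Point(1, 2), Point(3, 4)}
unique_count = len(points)
print(unique_count)
```

Equal points produce equal hashes, so the set finds the existing entry, confirms it with __eq__, and stores only one copy.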

Max Shawabkeh 2010-01-29 01:12:34
