ansaurus

Question

Python hashable dicts

Answer 1

+2 A:

Hashables should be immutable -- not enforcing this but TRUSTING you not to mutate a dict after its first use as a key, the following approach would work:

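class hashabledict(dict):
  def __key(self):
    # Order-independent snapshot of the items; keys must be mutually sortable.
    return tuple((k,self[k]) for k in sorted(self))
  def __hash__(self):
    return hash(self.__key())
  def __eq__(self, other):
    # Assumes other is also a hashabledict (a plain dict has no __key method).
    return self.__key() == other.__key()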

If you DO need to mutate your dicts and STILL want to use them as keys, complexity explodes a hundredfold -- not to say it can't be done, but I'll wait until a VERY specific indication before I get into THAT incredible morass!-)

Alex Martelli 2009-07-20 04:18:04
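
To make the "don't mutate after first use as a key" warning concrete, here is a minimal sketch (written for Python 3; the variable names `key` and `cache` are illustrative, not from the answer). It repeats the class so that it runs on its own, then mutates a key that is already stored in a dict:

```python
class hashabledict(dict):
    """Same idea as the class in the answer: hash and compare by sorted items."""
    def __key(self):
        return tuple((k, self[k]) for k in sorted(self))
    def __hash__(self):
        return hash(self.__key())
    def __eq__(self, other):
        return self.__key() == other.__key()

key = hashabledict(a=1)
cache = {key: "parsed"}

# While the key is untouched, an equal but distinct dict finds the entry.
found_before = hashabledict(a=1) in cache

# Mutating the key after insertion changes both its hash and its equality.
key["a"] = 2
found_old_value = hashabledict(a=1) in cache  # stored key no longer equals this
found_new_value = hashabledict(a=2) in cache  # entry was filed under the old hash

print(found_before, found_old_value, found_new_value)
```

On CPython this should print `True False False`: after the mutation the entry is still in `cache`, but neither the old nor the new contents can reach it.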

I certainly do not want to mutate the dicts once they have been prepared. That would make the rest of the packrat algorithm fall apart.

TokenMacGuy 2009-07-20 04:21:15

Then the subclass I suggested will work -- note how it bypasses the "positional" issue (_before_ you had edited your question to point it out;-) with the `sorted` in __key;-).

Alex Martelli 2009-07-20 04:44:35

The position-dependent behavior of namedtuple surprised the heck out of me. I had been playing with it, thinking it might still be an easier way to solve the problem, but that pretty much dashed all my hopes (and will require a rollback :( )

TokenMacGuy 2009-07-20 18:15:01

Answer 2

+3 A:

Here is the easy way to make a hashable dictionary. Just remember not to mutate them after using them as keys in another dictionary, since that would change their hash.

class hashabledict(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.items())))

Unknown 2009-07-20 04:30:24
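
A short usage sketch for the class above, assuming Python 3 (the names `a`, `b` and `mixed` are illustrative). It checks that insertion order does not affect the hash, that the `__eq__` inherited from `dict` agrees with it, and it shows one limitation of sorting the items: the keys must be mutually comparable.

```python
class hashabledict(dict):
    def __hash__(self):
        return hash(tuple(sorted(self.items())))

a = hashabledict(x=1, y=2)
b = hashabledict(y=2, x=1)   # same items, different insertion order

same_hash = hash(a) == hash(b)
equal = a == b               # __eq__ is inherited from dict
unique = len({a, b})         # equal keys collapse to one set element

# In Python 3, sorting fails when keys of different types cannot be ordered.
mixed = hashabledict({1: "int key", "one": "str key"})
try:
    hash(mixed)
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

If mixed-type keys matter, hashing `frozenset(self.items())` instead of the sorted tuple is a common alternative; that is a different technique from the one in the answer, and it still requires hashable values.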

This looks promising!

TokenMacGuy 2009-07-20 04:36:25

This does not sharply ensure consistency of __eq__ and __hash__ while my earlier answer does through the use of the __key method (in practice either approach should work, though this one might be slowed down by making an unneeded intermediate list -- fixable by s/items/iteritems/ -- assuming Python 2.* as you don't say;-).

Alex Martelli 2009-07-20 04:48:01

@Alex, yes you can fix it with iteritems. I copied and pasted this solution from a google link. As for ensuring consistency of __hash__, there should be no problems. Equality is also not a problem. hashabledict(a=5) == hashabledict(a=5) is true.

Unknown 2009-07-20 05:01:17

Both solutions seem to be about the same, and this is probably the kernel of how I will solve the problem, so I'm accepting yours since you have a lower rep.

TokenMacGuy 2009-07-20 18:16:06

Answer 3

+1 A:

A reasonably clean, straightforward implementation is

import collections

class FrozenDict(collections.Mapping):
    """Don't forget the docstrings!!"""

    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def __getitem__(self, key):
        return self._d[key]

    def __hash__(self):
        return hash(tuple(sorted(self._d.iteritems())))

Mike Graham 2010-04-24 18:24:55
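
The answer above targets Python 2 (`collections.Mapping`, `iteritems`). As a rough sketch of the same class for Python 3, assuming only that the ABC now lives in `collections.abc` and that `items()` replaces `iteritems()`:

```python
from collections.abc import Mapping

class FrozenDict(Mapping):
    """Read-only mapping that can be used as a dict key or set member."""

    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def __getitem__(self, key):
        return self._d[key]

    def __hash__(self):
        # Sorting makes the hash independent of insertion order.
        return hash(tuple(sorted(self._d.items())))

fd = FrozenDict(a=1, b=2)
lookup = {fd: "value"}
hit = lookup[FrozenDict(b=2, a=1)]   # equal contents, so the same entry
```

Because the class defines no `__setitem__`, an assignment such as `fd["c"] = 3` raises `TypeError`, and `Mapping` supplies `__eq__`, `keys`, `items`, `get` and the other read-only methods.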

Answer 4

+1 A:

You might also want to add these two methods to get the v2 pickling protocol to work with hashdict instances. Otherwise cPickle will try to use hashdict.__setitem__, resulting in a TypeError. Interestingly, with the other two versions of the protocol your code works just fine.

def __setstate__(self, objstate):
    for k,v in objstate.items():
        dict.__setitem__(self,k,v)
def __reduce__(self):
    return (hashdict, (), dict(self),)

Giovanni 2010-07-01 20:47:00
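
The `hashdict` class these methods belong to is not shown in this thread, so the sketch below uses a hypothetical stand-in: a dict subclass whose `__setitem__` raises `TypeError`. It is written for Python 3 (`pickle` rather than `cPickle`), and `roundtrip` is an illustrative helper, not part of the answer.

```python
import pickle

class hashdict(dict):
    """Hypothetical stand-in for the immutable dict the answer refers to."""

    def __setitem__(self, key, value):
        raise TypeError("hashdict does not support item assignment")

    def __hash__(self):
        return hash(tuple(sorted(self.items())))

    # The two methods from the answer above.
    def __setstate__(self, objstate):
        # Bypass the blocked __setitem__ by calling dict's version directly.
        for k, v in objstate.items():
            dict.__setitem__(self, k, v)

    def __reduce__(self):
        # Rebuild as an empty hashdict, then restore items via __setstate__.
        return (hashdict, (), dict(self))

def roundtrip(obj, protocol=2):
    """Pickle and unpickle obj with the given protocol."""
    return pickle.loads(pickle.dumps(obj, protocol))
```

With the class defined in an importable module or a script, `roundtrip(hashdict(a=1, b=2))` should return an equal `hashdict`: unpickling creates an empty instance and fills it through `__setstate__`, so the blocked `__setitem__` is never called.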
