views:

60

answers:

1

I need to create a 'container' object or class in Python, which keeps a record of other objects which I also define. One requirement of this container is that if two objects are deemed to be identical, one (either one) is removed. My first thought was to use a set([]) as the containing object, to complete this requirement.

However, the set does not remove one of the two identical object instances. What must I define to create one?

Here is the Python code.

class Item(object):
  def __init__(self, foo, bar):
    self.foo = foo
    self.bar = bar
  def __repr__(self):
    return "Item(%s, %s)" % (self.foo, self.bar)
  def __eq__(self, other):
    if isinstance(other, Item):
      return ((self.foo == other.foo) and (self.bar == other.bar))
    else:
      return False
  def __ne__(self, other):
    return (not self.__eq__(other))

Interpreter

>>> set([Item(1,2), Item(1,2)])
set([Item(1, 2), Item(1, 2)])

It is clear that __eq__(), which is called by x == y, is not the method called by the set. What is called? What other method must I define?

Note: The Items must remain mutable, and can change, so I cannot provide a __hash__() method. If this is the only way of doing it, then I will rewrite for use of immutable Items.

+2  A: 

I am afraid you will have to provide a __hash__() method. But you can code it the way, that it does not depend on the mutable attributes of your Item.

eumiro
<http://docs.python.org/reference/datamodel.html?highlight=__eq__#object.__hash__> In the second paragraph here, it points out that `__hash__()` should only be defined for immutable objects.
Nathanael
@Nathanael: if the object may have to change, you can make an immutable copy of the object, like frozenset() and set().
Lie Ryan
@Nathanael - how would you like to call `__eq__`? Comparing those (1,2) attributes? Then you have to return some hash of (1,2) also in your `__hash__` method.
eumiro
Or, since foo and bar are both immutable, the __hash__() could return the sum of the hashes of foo and bar? No... sum is too unrelible... if foo was 1 and bar 2, that would equal bar as 1 with foo as 2, which is wrong.What mathematical function can be used for this purpose? Modulo or division should work.
Nathanael
Nathanael: Just use the `hash` builtin: `hash((self.foo, self.bar))`. This uses the tuple's hash, which is appropriate for your needs. (Your `__eq__` could also be written in terms of `tuple` comparison.)
Piet Delport
@Nathanael: Piet's proposal with `hash((self.foo, self.bar))` is perfect for your needs.
eumiro
@Nathanael: Hash collision is Ok. python's set will double check with `__eq__` to ensure that objects with the same hashes are really equivalent; however you'd want the least amount collision possible for the best performance of set().
Lie Ryan
Great! Thanks to you all!
Nathanael
If you ever want to do something like this `i = item(1,1); i.foo = 2` then you *can't use a set*! The set won't get notified of any changes to it's contents and that is why `__hash__` must be static in the first place! If you won't change your items, then you should make your Items really immutable.
THC4k
@THC4k Thanks for pointing that out. I had decided to treat them as immutable, because the hashing function is based on the hashes of the attributes, anyway. Now, when I modify them, I use symmetric_difference_update(old_item, new_item), where new item is based on old_item.
Nathanael