I'm trying to work with a set of Django models in an external script. I query the database at a preset interval to get the items that match a query. Processing them takes a while, so the next query may return the same items if processing hasn't updated the model yet. I figured I could keep the items currently being processed in a set or list and check each model from the query result against it, so that nothing is processed twice. When I try this, though, the in operator always seems to return True. Any thoughts?

(Python 2.6 on Ubuntu 10.10)

    >>> t
    <SomeDjangoModel: Title1>
    >>> v
    <SomeDjangoModel: Title2>
    >>> x
    <SomeDjangoModel: Title3>
    >>> items
    set([<SomeDjangoModel: Title3>, <SomeDjangoModel: Title1>])
    >>> t in items
    True
    >>> x in items
    True
    >>> v in items
    True
    >>> items
    set([<SomeDjangoModel: Title3>, <SomeDjangoModel: Title1>])

+1  A: 

Python sets require that objects implement __eq__ and __hash__ consistently: objects that compare equal must also have equal hashes.

I looked at django.db.models.base.Model and saw that it defines these methods in terms of the model's primary key (PK):

    def __eq__(self, other):
        return isinstance(other, self.__class__) and self._get_pk_val() == other._get_pk_val()

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash(self._get_pk_val())

So it's no surprise that seemingly distinct objects are considered "equal": they have their PKs initialized to the same default value (e.g. None), so they compare equal and hash identically.
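
A minimal sketch of this effect, which needs neither Django nor a database. FakeModel is a hypothetical stand-in class, not part of Django; it only copies the three methods quoted above, and the pk values are made up for illustration:

```python
class FakeModel(object):
    """Hypothetical stand-in for a Django model (not a real Django class)."""

    def __init__(self, title, pk=None):
        self.title = title
        self.pk = pk  # None mimics a PK that was never set

    def _get_pk_val(self):
        return self.pk

    # Same logic as the Model methods quoted above.
    def __eq__(self, other):
        return isinstance(other, self.__class__) and self._get_pk_val() == other._get_pk_val()

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash(self._get_pk_val())


# Case 1: three distinct objects that all share the default PK (None).
t, v, x = FakeModel("Title1"), FakeModel("Title2"), FakeModel("Title3")
items = set([x, t])
v_in_items = v in items  # v was never added, yet it hashes and compares like x and t

# Case 2: the same objects with distinct (made-up) PKs.
t2, v2, x2 = FakeModel("Title1", pk=1), FakeModel("Title2", pk=2), FakeModel("Title3", pk=3)
items2 = set([x2, t2])
v2_in_items2 = v2 in items2  # membership now depends on the PK

print(v_in_items)
print(v2_in_items2)
```

If the real instances behave like the first case, printing each object's pk in the shell should show them all sharing one value.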

Pavel Repin