views:

366

answers:

4

The Python docs clearly state that x==y calls x.__eq__(y). However it seems that under many circumstances, the opposite is true. Where is it documented when or why this happens, and how can I work out for sure whether my object's __cmp__ or __eq__ methods are going to get called.

Edit: Just to clarify, I know that __eq__ is called in preferecne to __cmp__, but I'm not clear why y.__eq__(x) is called in preference to x.__eq__(y), when the latter is what the docs state will happen.

>>> class TestCmp(object):
...     def __cmp__(self, other):
...         print "__cmp__ got called"
...         return 0
... 
>>> class TestEq(object):
...     def __eq__(self, other):
...         print "__eq__ got called"
...         return True
... 
>>> tc = TestCmp()
>>> te = TestEq()
>>> 
>>> 1 == tc
__cmp__ got called
True
>>> tc == 1
__cmp__ got called
True
>>> 
>>> 1 == te
__eq__ got called
True
>>> te == 1
__eq__ got called
True
>>> 
>>> class TestStrCmp(str):
...     def __new__(cls, value):
...         return str.__new__(cls, value)
...     
...     def __cmp__(self, other):
...         print "__cmp__ got called"
...         return 0
... 
>>> class TestStrEq(str):
...     def __new__(cls, value):
...         return str.__new__(cls, value)
...     
...     def __eq__(self, other):
...         print "__eq__ got called"
...         return True
... 
>>> tsc = TestStrCmp("a")
>>> tse = TestStrEq("a")
>>> 
>>> "b" == tsc
False
>>> tsc == "b"
False
>>> 
>>> "b" == tse
__eq__ got called
True
>>> tse == "b"
__eq__ got called
True

Edit: From Mark Dickinson's answer and comment it would appear that:

  1. Rich comparison overrides __cmp__
  2. __eq__ is it's own __rop__ to it's __op__ (and similar for __lt__, __ge__, etc)
  3. If the left object is a builtin or new-style class, and the right is a subclass of it, the right object's __rop__ is tried before the left object's __op__

This explains the behaviour in theTestStrCmp examples. TestStrCmp is a subclass of str but doesn't implement its own __eq__ so the __eq__ of str takes precedence in both cases (ie tsc == "b" calls b.__eq__(tsc) as an __rop__ because of rule 1).

In the TestStrEq examples, tse.__eq__ is called in both instances because TestStrEq is a subclass of str and so it is called in preference.

In the TestEq examples, TestEq implements __eq__ and int doesn't so __eq__ gets called both times (rule 1).

But I still don't understand the very first example with TestCmp. tc is not a subclass on int so AFAICT 1.__cmp__(tc) should be called, but isn't.

+1  A: 

Is this not documented in the Language Reference? Just from a quick look there, it looks like __cmp__ is ignored when __eq__, __lt__, etc are defined. I'm understanding that to include the case where __eq__ is defined on a parent class. str.__eq__ is already defined so __cmp__ on its subclasses will be ignored. object.__eq__ etc are not defined so __cmp__ on its subclasses will be honored.

In response to the clarified question:

I know that __eq__ is called in preferecne to __cmp__, but I'm not clear why y.__eq__(x) is called in preference to x.__eq__(y), when the latter is what the docs state will happen.

Docs say x.__eq__(y) will be called first, but it has the option to return NotImplemented in which case y.__eq__(x) is called. I'm not sure why you're confident something different is going on here.

Which case are you specifically puzzled about? I'm understanding you just to be puzzled about the "b" == tsc and tsc == "b" cases, correct? In either case, str.__eq__(onething, otherthing) is being called. Since you don't override the __eq__ method in TestStrCmp, eventually you're just relying on the base string method and it's saying the objects aren't equal.

Without knowing the implementation details of str.__eq__, I don't know whether ("b").__eq__(tsc) will return NotImplemented and give tsc a chance to handle the equality test. But even if it did, the way you have TestStrCmp defined, you're still going to get a false result.

So it's not clear what you're seeing here that's unexpected.

Perhaps what's happening is that Python is preferring __eq__ to __cmp__ if it's defined on either of the objects being compared, whereas you were expecting __cmp__ on the leftmost object to have priority over __eq__ on the righthand object. Is that it?

profjim
Having played with this a bit more, I think you are right that `__eq__` is preferred on either object in these cases.
Singletoned
+1  A: 

As I know, __eq__() is a so-called “rich comparison” method, and is called for comparison operators in preference to __cmp__() below. __cmp__() is called if "rich comparison" is not defined.

So in A == B:
If __eq__() is defined in A it will be called
Else __cmp__() will be called

__eq__() defined in 'str' so your __cmp__() function was not called.

The same rule is for __ne__(), __gt__(), __ge__(), __lt__() and __le__() "rich comparison" methods.

Mihail
+6  A: 

Actually, in the docs, it states:

[__cmp__ is c]alled by comparison operations if rich comparison (see above) is not defined.

__eq__ is a rich comparison method and, in the case of TestCmp, is not defined, hence the calling of __cmp__

Dancrumb
But `str.__eq__` is defined, so presumably `TestStrCmp.__eq__` is defined (inherited).
profjim
You are correct. I've made the appropriate edit... thanks
Dancrumb
You are right that `__eq__` overrides `__cmp__`, but that was not the surprising behaviour. The surprise was that it calls it on the right object not the left one. (I've updated the question to clarify this a bit).
Singletoned
+6  A: 

You're missing a key exception to the usual behaviour: when the right-hand operand is an instance of a subclass of the class of the left-hand operand, the special method for the right-hand operand is called first.

See the documentation at:

http://docs.python.org/reference/datamodel.html#coercion-rules

and in particular, the following two paragraphs:

For objects x and y, first x.__op__(y) is tried. If this is not implemented or returns NotImplemented, y.__rop__(x) is tried. If this is also not implemented or returns NotImplemented, a TypeError exception is raised. But see the following exception:

Exception to the previous item: if the left operand is an instance of a built-in type or a new-style class, and the right operand is an instance of a proper subclass of that type or class and overrides the base’s __rop__() method, the right operand’s __rop__() method is tried before the left operand’s __op__() method.

Mark Dickinson
@Daniel Pryden: Thanks for the formatting fixes! I'll try to remember blockquote next time.
Mark Dickinson
Nice one, however I thought, (but am not sure), that all `__rop__` methods were deprecated. Also I'm not using any of them.
Singletoned
Agreed that you're not using any `__rop__` methods. The comparison methods are special in this respect: `__eq__` is its own reverse, so read `__eq__` for both `__op__` and `__rop__`. (Similarly, `__ne__` is its own reverse, `__le__` is the reverse of `__ge__`, etc.)Others have commented before (correctly, IMO) that the documentation could use some work here.I'm almost certain that the `__rop__` methods aren't deprecated!
Mark Dickinson
If you're right then that's the answer I would accept. Have you got any evidence that this is true?
Singletoned
See http://docs.python.org/reference/datamodel.html#object.__lt__. The fourth paragraph starts: "There are no swapped-argument versions of these methods [...] rather, `__lt__()` and `__gt__()` are each other’s reflection, `__le__()` and `__ge__()` are each other’s reflection, and `__eq__()` and `__ne__()` are their own reflection." Or were you asking for evidence that the `__rop__` methods aren't deprecated?
Mark Dickinson
You're right, and that definitely could do with clarifying. I've read that page several times now, and it hadn't sunk in that it's saying that `__lt__` is the `__rop__` of `__gt__`. Also isn't that wrong? Shouldn't `__lt__` be the `__rop__` of `__ge__`? IE `(2<2)==(not(2>=2))`
Singletoned
Also we still haven't quite explained the very first example: `1 == tc`. The only explanation I can see ATM for that is that a subclass of `object` counts as a subclass of any builtin for the purposes of operator overloading.
Singletoned
No, I think it's right: `__lt__` is the `__rop__` of `__gt__`. There's no logical negation going on; just a reversal of the arguments. The translation is: `x.__lt__(y)` <=> `x < y` <=> `y > x` <=> `y.__gt__(x)`.
Mark Dickinson
The rules for `__cmp__` (especially in combination with rich comparisons) are truly hairy, and I don't think they're even properly documented anywhere, except in the source. (PyObject_RichCompare in the Objects/object.c file is the main place to look, if you feel inclined.) In the `1 == tc` example, neither side implements `__eq__`, so we fall back on `__cmp__`. `int.__cmp__` can only handle comparisons with other integers, so it's ignored (not even called), and your `TestCmp.__cmp__` method gets called instead.Getting rid of `__cmp__` in py3k was a Very Good Thing. :)
Mark Dickinson
Well, I've awarded you the answer for sheer depth of answering. I guess it appears the real answer to my question is "no one really knows". Let's hope it works out a lot clearer in Py3k.
Singletoned