ansaurus

Question

Rules of thumb for when to use operator overloading in python

Answer 1

+4 A:

I've written software with significant amounts of overloading, and lately I regret that policy. I would say this:

Only overload operators if it's the natural, expected thing to do and doesn't have any side effects.

So if you make a new RomanNumeral class, it makes sense to overload addition and subtraction etc. But don't overload it unless it's natural: it makes no sense to define addition and subtraction for a Car or a Vehicle object.
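As a rough illustration of the RomanNumeral case, here is a minimal sketch. The class layout, the `value` attribute and the `_SYMBOLS` table are assumptions made up for this example, not taken from any library. Addition and subtraction return new objects and leave both operands untouched, so the operators have no side effects.

```python
class RomanNumeral:
    """Hypothetical value type: wraps a positive integer, prints as a Roman numeral."""

    _SYMBOLS = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]

    def __init__(self, value):
        if value <= 0:
            raise ValueError("Roman numerals have no zero or negative values")
        self.value = value

    def __add__(self, other):
        if not isinstance(other, RomanNumeral):
            return NotImplemented  # let Python try the other operand or raise TypeError
        return RomanNumeral(self.value + other.value)

    def __sub__(self, other):
        if not isinstance(other, RomanNumeral):
            return NotImplemented
        return RomanNumeral(self.value - other.value)

    def __str__(self):
        # Greedily peel off the largest symbol that still fits.
        remaining, parts = self.value, []
        for amount, symbol in self._SYMBOLS:
            count, remaining = divmod(remaining, amount)
            parts.append(symbol * count)
        return "".join(parts)


twelve = RomanNumeral(12)
five = RomanNumeral(5)
total = twelve + five        # a new RomanNumeral; twelve and five are unchanged
difference = twelve - five
```

Returning `NotImplemented` for foreign types keeps `RomanNumeral(1) + "x"` an ordinary `TypeError` instead of a surprising result.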

Another rule of thumb: don't overload ==. It makes it very hard (though not impossible) to actually test if two objects are the same. I made this mistake and paid for it for a long time.

As for when to overload +=, ++ etc, I'd actually say: only overload additional operators if you have a lot of demand for that functionality. It's easier to have one way to do something than five. Sure, it means sometimes you'll have to write x = x + 1 instead of x += 1, but more code is ok if it's clearer.

In general, like with many 'fancy' features, it's easy to think that you want something when you don't really, implement a bunch of stuff, not notice the side effects, and then figure it out later. Err on the conservative side.

EDIT: I wanted to add an explanatory note about overloading ==, because it seems various commenters misunderstand this, and it's caught me out. Yes, `is` exists, but it's a different operation. Say I have an object x, which is either an instance of my custom class or an integer. I want to see if x is the number 500. But if you set x = 500, then later test `x is 500`, you may get False, because CPython only caches small integers. With 50, it would return True. So you can't rely on `is` for this check: you need ==, and you probably want x == 500 to return True when x is an instance of your class as well. Confusing? Definitely. But this is the kind of detail you need to understand to successfully overload operators.
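The difference between equality and identity can be sketched with a hypothetical `Amount` class whose `__eq__` also accepts plain integers. The class is an assumption made up for illustration. Identity checks against integer literals are deliberately left out, because their result depends on the interpreter.

```python
class Amount:
    """Hypothetical wrapper around an integer, used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        # Compare by value against another Amount or a plain int.
        if isinstance(other, Amount):
            return self.value == other.value
        if isinstance(other, int):
            return self.value == other
        return NotImplemented

    def __hash__(self):
        # Defining __eq__ disables the inherited hash in Python 3,
        # so supply one that agrees with __eq__.
        return hash(self.value)


x = Amount(500)
y = Amount(500)

equal_to_int = (x == 500)    # value equality through Amount.__eq__
reflected = (500 == x)       # int gives up, Python then tries Amount.__eq__
equal_to_peer = (x == y)     # two distinct objects holding the same value
same_object = (x is y)       # identity: they are separate instances
```

Here `==` answers "do these represent the same value?", while `is` answers "are these the very same object?", which is why an overloaded `==` cannot stand in for an identity check.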

Peter 2009-10-12 01:04:38

Overloading `++` doesn't particularly apply, since Python doesn't have a `++` operator.

Chris Lutz 2009-10-12 01:07:11

sure, I'll change the example for Python. (though I meant to explain the general principle).

Peter 2009-10-12 01:08:25

Can't you test if two objects are the same by `if a is b: ...`, even if `==` is overloaded? Or am I misunderstanding the point you're making?

sth 2009-10-12 01:20:50

There is nothing wrong with overloading `==` / `__eq__`, it's probably one of the most-overloaded methods in fact. Python has `is` to check for object identity.

THC4k 2009-10-12 01:23:15

well only sort of. for example, in most implementations of Python, `x = 1<newline>x is 1` will be true, but `x = 500<newline> x is 500` will not. it's this sort of thing that gets very confusing, very fast

Peter 2009-10-12 01:23:44

you can just use object.__eq__(a,b) instead. but it makes sense to overload == with something that means equality for your class anyway

gnibbler 2009-10-12 01:26:03

actually, using `__eq__` is just the same as using `==`.

Peter 2009-10-12 01:31:29

`x is x` *should* return `False` for all numbers, but in cpython it doesn't, because the compiler optimizes some constants. But that is not part of the language. Code that tests for something that *should* always return `False` is just bad to begin with, so don't blame it on Python.

THC4k 2009-10-12 01:35:39

well, just as long as you're not trying to run code that *should* work but doesn't because of mere implementation issues.

Peter 2009-10-12 02:02:34

Note that in your updated text you've confused "==" with "is" which is confusing. x = 500; x == 500; should always be true :).

James Antill 2009-10-12 07:11:03

argh, you're right. thanks :) fixed now

Peter 2009-10-12 07:23:52

Answer 2

+2 A:

Python's overloading is "safer" in general than C++'s -- for example, the assignment operator can't be overloaded, and += has a sensible default implementation.
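The default behaviour of += can be sketched as follows, assuming a hypothetical immutable `Meters` type that defines only `__add__`. Because the class has no `__iadd__`, Python falls back to `a = a + b` and rebinds the name to a new object.

```python
class Meters:
    """Hypothetical immutable length type that defines only __add__."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if not isinstance(other, Meters):
            return NotImplemented
        return Meters(self.value + other.value)


a = Meters(2)
original = a          # keep a second reference to the first object
a += Meters(3)        # no __iadd__, so this is evaluated as a = a + Meters(3)
```

The object that `original` refers to is never mutated, which is usually what a reader expects from a value type.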

In some ways, though, overloading in Python is still as "broken" as in C++. Programmers should restrain the desire to "re-use" an operator for unrelated purposes, such as C++ re-using the bitshifts to perform string formatting and parsing. Don't give an operator semantics unrelated to its usual meaning just to get prettier syntax.

Modern Python style strongly discourages "rogue" overloading, but many aspects of the language and standard library retain poorly-named operators for backwards compatibility. For example:

%: modulus and string formatting
+: addition and sequence concatenation
*: multiplication and sequence repetition

So, rule of thumb? If your operator implementation will surprise people, don't do it.
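For reference, the dual uses listed above look like this with built-in types only. The variable names are arbitrary. The last two lines show one way the numeric and sequence meanings differ: addition of numbers commutes, concatenation does not.

```python
# Numeric meanings
remainder = 7 % 3            # modulus
total = 1 + 2                # addition
product = 3 * 4              # multiplication

# Non-numeric meanings of the same operators
formatted = "%d items" % 3   # string formatting
joined = [1, 2] + [3]        # sequence concatenation
repeated = "ab" * 3          # sequence repetition

# Addition of numbers commutes; concatenation does not.
numbers_commute = (1 + 2) == (2 + 1)
strings_commute = ("a" + "b") == ("b" + "a")
```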

John Millikin 2009-10-12 01:15:43

I think the dual uses of `+` and `*` are merited - both uses at least do the same conceptual thing, even if they do it differently.

Chris Lutz 2009-10-12 01:24:49

They're not at all the same. `(1+2)==(2+1)`, but `("a"+"b")!=("b"+"a")`. `(1+2-1)==2`, but `("a"+"b"-"a")` is nonsense. The same sort of issues exist for multiplication.

John Millikin 2009-10-12 01:35:33

I didn't mean to say they were the same, but I did word that badly. I meant that conceptually they perform similar actions. I think most people would regard joining two strings and adding two numbers as both "additive" operations, and multiplying numbers and repeating a string several times as both "multiplicative" operations.

Chris Lutz 2009-10-12 01:45:29

the only time I've encountered * overloaded in Python, it was surprising and ambiguous (someone was asking why multiplication of 500x500 matrices was faster in Python than Java; the answer was that the * operator was doing array multiplication, not matrix multiplication)

Pete Kirkham 2009-12-20 16:09:41

Answer 3

+1 A:

Here is an example that uses the bitwise or operator (|) to simulate a unix pipeline. This is intended as a counterexample to most of the rules of thumb.

I just found Lumberjack which uses this syntax in real code



class pipely(object):
    """Base pipeline stage; `iterable | stage` lazily maps and filters the items."""
    def __init__(self, *args, **kw):
        self._args = args
        # Keyword arguments (e.g. map=str) override the methods of the same name.
        self.__dict__.update(kw)

    def __ror__(self, other):
        # Invoked for `other | self` when `other` does not handle `|` itself
        # (ranges and generators do not).
        return ( self.map(x) for x in other if self.filter(x) )

    def map(self, x):
        return x

    def filter(self, x):
        return True

class sieve(pipely):
    def filter(self, x):
        # Keep n itself and everything that is not a multiple of n.
        n = self._args[0]
        return x==n or x%n

class strify(pipely):
    def map(self, x):
        return str(x)

class startswith(pipely):
    def filter(self, x):
        # Expects string items, so it must come after a str-converting stage.
        return x.startswith(str(self._args[0]))

print("*"*80)
for i in range(2,100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | strify() | startswith(5):
    print(i)

print("*"*80)
for i in range(2,100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | pipely(map=str) | startswith(5):
    print(i)

print("*"*80)
for i in range(2,100) | sieve(2) | sieve(3) | sieve(5) | sieve(7) | pipely(map=str) | pipely(filter=lambda x: x.startswith('5')):
    print(i)

gnibbler 2009-10-12 01:21:40

+1 because this is very interesting, even though I don't know if I approve of using this in real code.

Chris Lutz 2009-10-12 01:40:18

:) I admit I haven't used it in real code, but it's a handy way to chain generators together. You can do something similar with co-routines, but the syntax becomes more like `sieve(2,sieve(3,sieve(5,sieve(7))))` which I dislike more

gnibbler 2009-10-12 01:52:20

Answer 4

+8 A:

Operator overloading is mostly useful when you're making a new class that falls into an existing "Abstract Base Class" (ABC) -- indeed, many of the ABCs in standard library module collections (collections.abc in Python 3) rely on the presence of certain special methods (and special methods, the ones with names starting and ending with double underscores AKA "dunders", are exactly the way you perform operator overloading in Python). This provides good starting guidance.

For example, a Container class must override special method __contains__, i.e., the membership check operator item in container (as in, if item in container: -- don't confuse with the for statement, for item in container:, which relies on __iter__!-). Similarly, a Hashable must override __hash__, a Sized must override __len__, a Sequence or a Mapping must override __getitem__, and so forth. (Moreover, the ABCs can provide your class with mixin functionality -- e.g., both Sequence and Mapping can provide __contains__ on the basis of your supplied __getitem__ override, and thereby automatically make your class a Container).
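The mixin behaviour can be sketched with a hypothetical `Countdown` class, assuming Python 3, where the ABCs live in `collections.abc`. The class supplies only `__getitem__` and `__len__` (the two abstract methods of `Sequence`); iteration and the `in` operator come from the ABC.

```python
from collections.abc import Container, Sequence


class Countdown(Sequence):
    """Hypothetical read-only sequence: start, start-1, ..., 1."""

    def __init__(self, start):
        self._start = start

    def __len__(self):
        return self._start

    def __getitem__(self, index):
        if not 0 <= index < self._start:
            # The mixin __iter__ relies on IndexError to know where to stop.
            raise IndexError(index)
        return self._start - index


c = Countdown(3)
items = list(c)                          # __iter__ comes from the Sequence mixin
has_two = 2 in c                         # __contains__ comes from the Sequence mixin
has_nine = 9 in c
is_container = isinstance(c, Container)  # Sequence is itself a Container
```

Nothing in `Countdown` mentions `__contains__` or `__iter__`, yet `in` and `for` both work, which is the starting guidance the ABCs give you for free.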

Beyond the collections, you'll want to override special methods (i.e. provide for operator overloading) mostly if your new class "is a number". Other special cases exist, but resist the temptation of overloading operators "just for coolness", with no semantic connection to the "normal" meanings, as C++'s streams do for << and >> and Python strings (in Python 2.*, fortunately not in 3.* any more;-) do for % -- when such operators do not any more mean "bit-shifting" or "division remainder", you're just engendering confusion. A language's standard library can get away with it (though it shouldn't;-), but unless your library gets as widespread as the language's standard one, the confusion will hurt!-)

Alex Martelli 2009-10-12 01:23:07

BTW, for those despairing at the thought of not having % for string formatting: although the Python 3 documentation describes % as obsolete, it is still documented and there seems no chance that the feature will truly go away until Python 4, based on recent discussions in python-dev. That leaves plenty of time to learn and love the new string format method already available in 2.6.
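For comparison, a small sketch of the two formatting styles side by side. The values are made up, and explicit field numbers are used so the second line also works on 2.6.

```python
name, count = "spam", 3

old_style = "%s x %d" % (name, count)          # % operator formatting
new_style = "{0} x {1}".format(name, count)    # str.format method
```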

Ned Deily 2009-10-12 04:40:37

The format function is much better than % ever was

Casebash 2009-10-31 09:45:22
