Hi!

I am a Python newbie (coming from Java), and I am not sure why Python implemented len(obj), max(obj), and min(obj) as static-like functions rather than as obj.len(), obj.max(), and obj.min().

What are the advantages and disadvantages (other than the obvious inconsistency) of having len() and friends as functions instead of method calls?

Why did Guido choose this over method calls? (This could have been changed in Python 3 if needed, but it wasn't, so there must be good reasons... I hope.)

Thanks!

+2  A: 

It emphasizes the capabilities of an object, not its methods or type. Capabilities are declared by "helper" methods such as __iter__ and __len__, but these don't make up the interface. The interface is in the built-in functions, and also in the built-in operators like + and [] for indexing and slicing.

Sometimes, it is not a one-to-one correspondence. For example, iter(obj) returns an iterator for an object, and will work even if __iter__ is not defined. In that case, it checks whether the object defines __getitem__ and, if so, returns an iterator that accesses the object index-wise (like an array).
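To make the fallback concrete, here is a minimal sketch. The class name Squares and its contents are made up for illustration; the only thing being shown is that iter() accepts an object that defines __getitem__ but not __iter__.

```python
class Squares:
    """Hypothetical sequence-like class: defines __getitem__ but not __iter__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        # iter() calls this with 0, 1, 2, ... until IndexError is raised.
        if index >= self.n:
            raise IndexError(index)
        return index * index


squares = Squares(4)
has_iter = hasattr(squares, "__iter__")  # the class defines no __iter__
values = list(iter(squares))             # still iterable via __getitem__
print(has_iter, values)
```

The same fallback is what lets a plain for loop walk over such an object.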

This goes together with Python's duck typing: we care only about what we can do with an object, not whether it is of a particular type.

kaizer.se
+3  A: 

Actually, those aren't "static" methods in the way you are thinking about them. They are built-in functions that really just alias to certain methods on Python objects that implement them.

>>> class Foo(object):
...     def __len__(self):
...         return 42
... 
>>> f = Foo()
>>> len(f)
42

These are always available to be called, whether or not the object implements them. The point is to have some consistency. Instead of one class having a method called length() and another having one called size(), the convention is to implement __len__ and let callers always access it through the more readable len(obj) instead of obj.methodThatDoesSomethingCommon.

whaley
No, that's just len. There are no `__min__` or `__max__` special methods.
Glenn Maynard
The point is that the question treats them all alike, which is incorrect. This answer says -- correctly -- that this isn't a general pattern, but that there's a mixture of things with a common notation.
S.Lott
Even `f.__len__()` isn't quite an alias for `len(f)`. If your `__len__` special method returns anything non-integer (or even a large integer) then the `len` function will fail with a `TypeError`.
Scott Griffiths
+13  A: 

The big advantage is that built-in functions (and operators) can apply extra logic when appropriate, beyond simply calling the special methods. For example, min can look at several arguments and apply the appropriate inequality checks, or it can accept a single iterable argument and proceed similarly; abs, when called on an object without a special method __abs__, could try comparing said object with 0 and using the object's change-sign method if needed (though it currently doesn't); and so forth.
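As a small illustration of that extra logic, the sketch below calls min in a few of the ways it supports. The variable names and sample values are made up; the default keyword assumes Python 3.4 or later.

```python
scores = [7, 3, 9]

smallest_of_args = min(7, 3, 9)       # several positional arguments
smallest_of_iterable = min(scores)    # one iterable argument
shortest_word = min(["pear", "fig", "banana"], key=len)  # custom ordering
empty_default = min([], default=None)  # fallback for an empty iterable (3.4+)

print(smallest_of_args, smallest_of_iterable, shortest_word, empty_default)
```

None of this requires the objects being compared to know anything about min; they only need to support the < comparison.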

So, for consistency, all operations with wide applicability must always go through built-ins and/or operators, and it's those built-ins' responsibility to look up and apply the appropriate special methods (on one or more of the arguments), use alternate logic where applicable, and so forth.

An example where this principle wasn't correctly applied (but the inconsistency was fixed in Python 3) is "step an iterator forward": in 2.5 and earlier, you needed to define and call the non-specially-named next method on the iterator. In 2.6 and later you can do it the right way: the new next built-in steps the iterator and can apply extra logic, for example to supply a default value. In 3.* the iterator object defines the properly special-named __next__ for the built-in to call; in 2.6 the built-in still calls the old next method, which you can also still call directly in the bad old way, for backwards compatibility, though in 3.* you can't any more.
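Here is a minimal sketch of the built-in supplying a default, written in Python 3 syntax (where the special method is named __next__). The Countdown class is a made-up example, not anything from the standard library.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


it = Countdown(2)
first = next(it)          # calls it.__next__()
second = next(it)
third = next(it, "done")  # iterator is exhausted, so the default is returned
print(first, second, third)
```

The iterator itself knows nothing about defaults; that logic lives entirely in the built-in.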

Another example: consider the expression x + y. In a traditional object-oriented language (able to dispatch only on the type of the leftmost argument -- like Python, Ruby, Java, C++, C#, &c) if x is of some built-in type and y is of your own fancy new type, you're sadly out of luck if the language insists on delegating all the logic to the method of type(x) that implements addition (assuming the language allows operator overloading;-).

In Python, the + operator (and similarly of course the function operator.add, if that's what you prefer) tries x's type's __add__, and if that one doesn't know what to do with y, then tries y's type's __radd__. So you can define your types that know how to add themselves to integers, floats, complex, etc., as well as ones that know how to add such built-in numeric types to themselves (i.e., you can code it so that x + y and y + x both work fine, when y is an instance of your fancy new type and x is an instance of some built-in numeric type).
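A minimal sketch of that fallback, assuming a made-up Meters type that should be addable to plain numbers on either side:

```python
class Meters:
    """Hypothetical numeric wrapper used only for illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # Tell the + operator that this type cannot handle `other`.
        return NotImplemented

    def __radd__(self, other):
        # Reached for `other + self` after type(other).__add__ gave up.
        return self.__add__(other)


left = Meters(2) + 3    # uses Meters.__add__
right = 3 + Meters(2)   # int.__add__ returns NotImplemented, so Meters.__radd__ runs
print(left.value, right.value)
```

Returning NotImplemented (rather than raising) is what lets the operator move on to the other operand.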

"Generic functions" (as in PEAK) are a more elegant approach (allowing any overriding based on a combination of types, never with the crazy monomaniac focus on the leftmost arguments that OOP encourages!-), but (a) they were unfortunately not accepted for Python 3, and (b) they do of course require the generic function to be expressed as free-standing (it would be absolutely crazy to have to consider the function as "belonging" to any single type, where the whole POINT is that it can be differently overridden/overloaded based on an arbitrary combination of its several arguments' types!-). Anybody who's ever programmed in Common Lisp, Dylan, or PEAK, knows what I'm talking about;-).

So, free-standing functions and operators are just THE right, consistent way to go (even though the lack of generic functions, in bare-bones Python, does remove some fraction of the inherent elegance, it's still a reasonable mix of elegance and practicality!-).

Alex Martelli
+1 for a great explanation to the benefits of built-ins.
whaley
You say "able to dispatch only on the type of the leftmost argument -- like Python, C++, ..." and then show that python can also dispatch on the right argument with `__radd__`. Also, C++ can dispatch on both arguments if you declare `operator+` as a free function. This seems to be a little inconsistent...
sth
@sth, Python can't _dispatch_ on the RHS argument -- it can _use a builtin or operator_ to do the grunt work. In C++, if `operator+` is a free-standing function, it can be overloaded based on the _statically visible_ (aka compile-time visible) types of the operands, it just **CANNOT** **DISPATCH**, i.e., use strictly the **runtime** types of the operands to pick what code to execute (as a member function can, but based only on the **LEFT HAND SIDE** type). Nothing inconsistent, for readers who know what dispatch means, and the difference between compile-time and run-time types!-)
Alex Martelli
@whaley, thanks, glad you liked it!
Alex Martelli
Good explanation, but it's not applicable to `len()` - the most questionable design decision.
Denis Otkidach
@Denis, uniformity is a language design strength: it would be totally absurd if, for each operation applicable to a vast category of objects, you had to memorize "well let's see, is THIS one a builtin function like most, or is it one of the exceptions" (`next` in 2.5 and earlier was the wrong design decision, as I mentioned, definitely **NOT** `len`). "No exceptions to this design rule" is clearly _the_ design decision here, not dozens of disparate ones, one per each tidbit of functionality!
Alex Martelli
Built-in functions vs. inherited methods: for len(), I think it could have been better as an inherited method, but after reading Google's Go language FAQ, maybe the decision was based on ease of implementation. From golang.org: "We debated this issue but decided implementing len and friends as functions was fine in practice and didn't complicate questions about the interface (in the Go type sense) of basic types."
+1  A: 

I thought the reason was so these basic operations could be done on iterators with the same interface as containers. However, it actually doesn't work with len:

def foo():
    for i in range(10):
        yield i
print(len(foo()))

... fails with TypeError. len() won't consume and count an iterator; it only works with objects that have a __len__ method.

So, as far as I'm concerned, len() shouldn't exist. It's much more natural to say obj.len than len(obj), and much more consistent with the rest of the language and the standard library. We don't say append(lst, 1); we say lst.append(1). Having a separate global function for length is an odd, inconsistent special case, and eats a very obvious name in the global namespace, which is a very bad habit of Python.

This is unrelated to duck typing; you can say getattr(obj, "len") to decide whether you can use len on an object just as easily as, and much more consistently than, you can use getattr(obj, "__len__").

All that said, as language warts go--for those who consider this a wart--this is a very easy one to live with.

On the other hand, min and max do work on iterators, which gives them a use apart from any particular object. This is straightforward, so I'll just give an example:

import random
def foo():
    for i in range(10):
        yield random.randint(0, 100)
print(max(foo()))

However, there are no __min__ or __max__ methods to override their behavior, so there's no consistent way to provide efficient searching for sorted containers. If a container is sorted on the same key that you're searching, min/max are O(1) operations instead of O(n), and the only way to expose that is by a different, inconsistent method. (This could be fixed in the language relatively easily, of course.)
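A rough sketch of the situation described above. SortedBag, its steps counter, and the smallest method are all hypothetical names invented for this example; the counter is only there to show how many elements the built-in visits.

```python
import bisect


class SortedBag:
    """Hypothetical container that keeps its items in sorted order."""

    def __init__(self, items=()):
        self._items = sorted(items)
        self.steps = 0  # number of elements handed out during iteration

    def add(self, item):
        bisect.insort(self._items, item)

    def __iter__(self):
        for item in self._items:
            self.steps += 1
            yield item

    def smallest(self):
        # O(1): the container already knows where its minimum is.
        return self._items[0]


bag = SortedBag([5, 1, 4, 2, 3])
via_builtin = min(bag)        # walks every element
steps_used = bag.steps
via_method = bag.smallest()   # no iteration at all
print(via_builtin, steps_used, via_method)
```

Both calls give the same answer, but only the container-specific method can take advantage of the ordering.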

To follow up with another issue with this: it prevents use of Python's method binding. As a simple, contrived example, you can do this to supply a function to add values to a list:

def add(f):
    f(1)
    f(2)
    f(3)
lst = []
add(lst.append)
print(lst)

and this works on all member functions. You can't do that with min, max or len, though, since they're not methods of the object they operate on. Instead, you have to resort to functools.partial, a clumsy second-class workaround common in other languages.

Of course, this is an uncommon case; but it's the uncommon cases that tell us about a language's consistency.
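For comparison, here is a minimal sketch of the functools.partial workaround. The variable names are made up. It also shows that the special method itself can be bound like any other method, although calling lst.__len__ directly skips whatever checks the built-in applies.

```python
import functools

lst = [3, 1, 2]

# Zero-argument callables that remember which list they operate on.
length_of_lst = functools.partial(len, lst)
largest_in_lst = functools.partial(max, lst)

lst.append(10)
results = (length_of_lst(), largest_in_lst())  # reflects the appended item

# The special method is an ordinary bound method of the list object.
bound_len = lst.__len__
print(results, bound_len())
```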

Glenn Maynard