Being used to the old ways of duck typing in Python, I fail to understand the need for ABCs (abstract base classes). The documentation is good on how to use them.

I tried to read the rationale in the PEP, but it went over my head. If I were looking for a mutable sequence container, I would check for `__setitem__`, or more likely try to use it (EAFP). I haven't come across a real-life use for the numbers module, which does use ABCs, but that is the closest I have come to understanding.

Can anyone explain to me the rationale, please?

+2  A: 

ABCs make it much easier to determine whether an object supports a given protocol, without having to check for the presence of every method in the protocol and without triggering an exception deep in "enemy" territory because the object does not support it.
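A rough sketch of the difference, assuming Python 3.3+ (where the ABCs live in `collections.abc`). The helper names `has_setitem` and `is_mutable_sequence` are made up for illustration:

```python
from collections.abc import MutableSequence


def has_setitem(obj):
    # Hand-rolled check: looks for a single method of the protocol.
    return hasattr(obj, "__setitem__")


def is_mutable_sequence(obj):
    # ABC check: asks whether the object claims the whole protocol.
    return isinstance(obj, MutableSequence)


samples = {"list": [1, 2, 3], "tuple": (1, 2, 3), "dict": {"a": 1}}

for name, obj in samples.items():
    print(name, has_setitem(obj), is_mutable_sequence(obj))
```

A `dict` passes the single-method check because it has `__setitem__`, but it is not a mutable sequence, and the ABC check says so up front rather than letting a later `insert()` call fail.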

Ignacio Vazquez-Abrams
+7  A: 

Short version: ABCs offer a higher level of semantic contract between clients and the implemented classes.

Long version:

There is a contract between a class and its callers. The class promises to do certain things and have certain properties.

There are different levels to the contract.

At a very low level, the contract might include the name of a method or its number of parameters.

In a statically typed language, that contract would actually be enforced by the compiler. In Python, you can use EAFP or introspection to confirm that the unknown object meets this expected contract.
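For concreteness, here is a minimal sketch of those two low-level approaches. The function names are hypothetical, and both assume a non-empty container:

```python
def set_first_eafp(container, value):
    # EAFP: just try the assignment and handle the failure.
    try:
        container[0] = value
    except TypeError:
        return False
    return True


def set_first_introspect(container, value):
    # Introspection: look for the method before calling it.
    if not hasattr(container, "__setitem__"):
        return False
    container[0] = value
    return True


print(set_first_eafp([1, 2], 9), set_first_eafp((1, 2), 9))
print(set_first_introspect([1, 2], 9), set_first_introspect((1, 2), 9))
```

Both confirm only that the call can be made. Neither says anything about what the assignment is supposed to mean for that object.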

But there are also higher-level, semantic promises in the contract.

For example, if there is a __str__() method, it is expected to return a string representation of the object. It could delete all contents of the object, commit the transaction and spit a blank page out of the printer... but there is a common understanding of what it should do, described in the Python manual.

That's a special case, where the semantic contract is described in the manual. What should the print() method do? Should it write the object to a printer or a line to the screen, or something else? It depends - you need to read the comments to understand the full contract here. A piece of client code that simply checks that the print() method exists has confirmed part of the contract - that a method call can be made, but not that there is agreement on the higher level semantics of the call.

Defining an Abstract Base Class (ABC) is a way of producing a contract between the class implementers and the callers. It isn't just a list of method names, but a shared understanding of what those methods should do. If you inherit from this ABC, you are promising to follow all the rules described in the comments, including the semantics of the print() method.
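As a sketch of what that might look like, assuming Python 3.4+ for `abc.ABC`: the `Printable`, `Invoice` and `Unfinished` classes below are invented for this example, and the "rules in the comments" are simply the docstring.

```python
import io
from abc import ABC, abstractmethod


class Printable(ABC):
    """Hypothetical contract (not from the standard library).

    print(stream) must write exactly one line describing the object to
    the given text stream and must leave the object unchanged.
    """

    @abstractmethod
    def print(self, stream):
        """Write a one-line description of self to stream."""


class Invoice(Printable):
    def __init__(self, total):
        self.total = total

    def print(self, stream):
        # Honors the documented contract: one line, no side effects.
        stream.write(f"Invoice total: {self.total}\n")


class Unfinished(Printable):
    pass  # inherits the promise but never implements print()


buffer = io.StringIO()
Invoice(10).print(buffer)
output = buffer.getvalue()

try:
    Unfinished()
    instantiated = True
except TypeError:
    # Classes with unimplemented abstract methods cannot be instantiated.
    instantiated = False
```

Python only enforces that `print()` exists. The promise about what it does is carried by inheriting from `Printable`, which is something a caller can test with `isinstance()`.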

Python's duck typing has many advantages in flexibility over static typing, but it doesn't solve all the problems. ABCs offer an intermediate solution between the free-form style of Python and the bondage-and-discipline of a statically typed language.

Oddthinking
I think that you have a point there, but I can't follow you. So what is the difference, in terms of contract, between a class that implements `__contains__` and a class that inherits from `collections.Container`? In your example, in Python there was always a shared understanding of `__str__`. Implementing `__str__` makes the same promises as inheriting from some ABC and then implementing `__str__`. In both cases you can break the contract; there are no provable semantics such as the ones we have in static typing.
Muhammad Alkarouri
`collections.Container` is a degenerate case that only includes `__contains__`, and only to mean the predefined convention. Using an ABC doesn't add much value by itself, I agree. I suspect it was added to allow (e.g.) `Set` to inherit from it. By the time you get to `Set`, suddenly belonging to the ABC has considerable semantics. An item can't belong to the collection twice. That is NOT detectable by the existence of methods.
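A small sketch of that point, assuming Python 3.3+ (where these ABCs are in `collections.abc` rather than `collections`). The variable names are illustrative only:

```python
from collections.abc import Container, Set

# The three abstract methods that the Set ABC requires.
required = ("__contains__", "__iter__", "__len__")

list_has_methods = all(hasattr(list, name) for name in required)
set_has_methods = all(hasattr(set, name) for name in required)

# Container is recognized purely by the presence of __contains__.
list_is_container = isinstance([1, 1, 2], Container)

# Set is a claim about semantics, so having the methods is not enough.
list_is_set = isinstance([1, 1, 2], Set)
set_is_set = isinstance({1, 2}, Set)

print(list_has_methods, set_has_methods)
print(list_is_container, list_is_set, set_is_set)
```

A `list` has every method `Set` asks for, yet it allows duplicates, so it is not registered as a `Set`. The `isinstance()` check does not verify uniqueness either. It reports that the class has made that promise.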
Oddthinking
The `Set` example looks better to me than the `print()` function. In particular, I didn't understand the higher level contract of the `print()` method. It looks to me similar to the `__contains__` and `__str__` cases, where the semantics of these are documented in their respective help/comments. But when you have a complex object like `Set`, the semantics are not the semantics of any one function or property of the object. Am I missing something?
Muhammad Alkarouri
Yes, I think `Set` is a better example than `print()`. I was attempting to find a method name whose meaning was ambiguous, and couldn't be grokked by the name alone, so you couldn't be sure that it would do the right thing just by its name and the Python manual.
Oddthinking