I've been learning more about Python recently, and as I was going through the excellent Dive into Python, the author noted that the __init__ method is not technically a constructor, even though it generally functions like one.

I have two questions:

  1. What are the differences between how C++ constructs an object and how Python "constructs" an object?

  2. What makes a constructor a constructor, and how does the __init__ method fail to meet these criteria?

+2  A: 

A constructor in many other languages allocates space for the object being constructed; in Python this is the job of the allocator method, __new__(). __init__() is just an initializer method.
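
As a rough illustration of that split, here is a minimal sketch that records the order in which the two hooks run. The `Point` class and the `events` list are made up for the example.

```python
events = []  # records the order in which the two hooks run


class Point:  # hypothetical class, for illustration only
    def __new__(cls, x, y):
        events.append("__new__")
        # object.__new__ creates and returns the instance; it has no x or y yet
        return super().__new__(cls)

    def __init__(self, x, y):
        events.append("__init__")
        # the instance already exists here; we only fill in its attributes
        self.x = x
        self.y = y


p = Point(1, 2)
print(events)
```

Calling `Point(1, 2)` runs `__new__` first and then passes the object it returned to `__init__`.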

Ignacio Vazquez-Abrams
Constructors in C++ don't allocate space for the object, either. That's the job of `operator new`. Or in the degenerate case of placement new, it's in effect the job of the caller to tell that version of new where the memory is, so that it can echo that value back to you.
Steve Jessop
+13  A: 

The distinction that the author draws is that, as far as the Python language is concerned, you have a valid object of the specified type before you even enter __init__. Therefore it's not a "constructor", since in C++, and in theory, a constructor turns an invalid, pre-constructed object into a "proper" completed object of the type.

Basically __new__ in Python is defined to return "the new object instance", whereas C++ new operators just return some memory, which is not yet an instance of any class.

However, __init__ in Python is probably where you first establish some important class invariants (what attributes it has, just for starters). So as far as the users of your class are concerned, it might as well be a constructor. It's just that the Python runtime doesn't care about any of those invariants. If you like, it has very low standards for what constitutes a constructed object.
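
To make the "low standards" point concrete, here is a small sketch with a hypothetical `Account` class. Calling `__new__` directly skips `__init__`, and Python still treats the result as a perfectly good instance, even though the attribute that users of the class rely on is missing.

```python
class Account:  # hypothetical class, for illustration only
    def __init__(self, balance):
        # invariant from the user's point of view: an Account has a balance
        self.balance = balance


# Create an instance without ever running __init__.
raw = Account.__new__(Account)

is_instance = isinstance(raw, Account)   # the runtime accepts it as an Account
has_balance = hasattr(raw, "balance")    # but the invariant was never established

ready = Account(100)                     # the normal path runs __init__ as well
print(is_instance, has_balance, ready.balance)
```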

I think the author makes a fair point, and it's certainly an interesting remark on the way that Python creates objects. It's quite a fine distinction, though, and I doubt that calling __init__ a constructor will ever result in broken code.

Also, I note that the Python documentation refers to __init__ as a constructor (http://docs.python.org/release/2.5.2/ref/customization.html):

As a special constraint on constructors, no value may be returned

... so if there are any practical problems with thinking of __init__ as a constructor, then Python is in trouble!

The ways that Python and C++ construct objects have some similarities. Both call a function with a relatively simple responsibility (__new__ for an object instance vs some version of operator new for raw memory), then both call a function which has the opportunity to do more work to initialize the object into a useful state (__init__ vs a constructor).
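
The two-step sequence on the Python side can be approximated by hand. The `construct` helper and the `Temperature` class below are hypothetical, and this is only a sketch of what calling a class does, not the interpreter's actual code.

```python
def construct(cls, *args, **kwargs):
    """Hypothetical helper approximating what cls(*args, **kwargs) does."""
    # Step 1: __new__ returns an object instance (not raw memory).
    obj = cls.__new__(cls, *args, **kwargs)
    # Step 2: __init__ runs only if __new__ returned an instance of cls.
    if isinstance(obj, cls):
        obj.__init__(*args, **kwargs)
    return obj


class Temperature:  # hypothetical class, for illustration only
    def __init__(self, celsius):
        self.celsius = celsius


t = construct(Temperature, 21.5)
print(type(t).__name__, t.celsius)
```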

Practical differences include:

  • in C++, no-arg constructors for base classes are called automatically in the appropriate order if necessary, whereas for __init__ in Python, you have to explicitly init your base in your own __init__. Even in C++, you have to specify the base class constructor if it has arguments.

  • in C++, you have a whole mechanism for what happens when a constructor throws an exception, in terms of calling destructors for sub-objects that have already been constructed. In Python I think the runtime (at most) calls __del__.
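
The first difference is easy to see in a short sketch. The class names here are invented for the example; the only point is that a base `__init__` runs only when the subclass calls it.

```python
class Base:
    def __init__(self):
        self.base_ready = True


class Forgetful(Base):
    def __init__(self):
        # Base.__init__ is NOT called automatically
        self.own = 1


class Careful(Base):
    def __init__(self):
        super().__init__()  # explicit call to the base initializer
        self.own = 1


forgetful_has = hasattr(Forgetful(), "base_ready")
careful_has = hasattr(Careful(), "base_ready")
print(forgetful_has, careful_has)
```

`Forgetful` instances end up without `base_ready`, and Python raises no error at construction time; the problem only surfaces when something later reads the missing attribute.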

Then there's also the difference that __new__ doesn't just allocate memory, it has to return an actual object instance. Then again, raw memory isn't really a concept that applies to Python code.

Steve Jessop
Great explanation, thanks.
Zeke
The runtime doesn't really call `__del__`, it just propagates the exception upward, and after the last reference to the being-initialized object goes away, (eventually) the object is destroyed. `__del__` *may* be called right before that, but it isn't actually used to do any destructing; `__del__` is the counterpart to `__init__`, not to `__new__` :)
Thomas Wouters
@Thomas: if `__del__` is truly the counterpart to `__init__`, then I would expect it to be defined to be called if (and only if) `__init__` completes. Sure, it might free system resources allocated in `__init__`, or for that matter in any other method of the class. My limited understanding of CPython is that `__del__` is called when an object is destroyed regardless of whether `__init__` completed, or even whether it was called at all. And that an object which fails to `__init__` will be destroyed immediately, although that's an implementation detail of CPython's refcounting. Is that right?
Steve Jessop
... all of which is qualified with the caveat that `__del__` might not be called at all, if the object is still live at program exit. I don't really want to get too hung up on `__del__`, though. I've never used it, and as far as its *guaranteed* behaviour is concerned it appears to me about as useful as a Java finalizer, i.e. not.
Steve Jessop
`__del__` isn't a *strict* counterpart to `__init__`. It's not tied to `__init__` successfully executing. It's the counterpart to `__init__` in that it's an optional hook that's called right before object destruction, just like `__init__` is an optional hook called right after object construction. `__del__` isn't called to *do* object destruction, and a `__del__` method can actually prevent object destruction (by storing a reference to `self` somewhere). And yes, it's really not useful, and it can create un-cleanable reference cycles, which is very bad. Relying on Python to do cleanup is better.
Thomas Wouters
+2  A: 

In Python an object is created, by __new__, and that sort of generic default object is modified by __init__. And __init__ is just an ordinary method. In particular it can be called virtually, and calling methods from __init__ calls them virtually.
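
A small sketch of that virtual dispatch, using hypothetical `Shape` and `Circle` classes: the base `__init__` calls a method, and the call lands in the subclass override because the object already has its final type.

```python
class Shape:
    def __init__(self):
        # dispatches on the real type of self, even inside the base __init__
        self.label = self.describe()

    def describe(self):
        return "shape"


class Circle(Shape):
    def describe(self):
        return "circle"


label = Circle().label
print(label)
```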

In C++ raw memory for an object is allocated in some way, statically, or on a call stack, or dynamically via operator new, or as part of another object. Then the constructor for the type that you're instantiating initializes the raw memory to suitable values. A constructor for a given class automatically calls constructors of base classes and members, so construction is guaranteed to be "bottom up", making the parts first.

C++ adds language support for two specially important aspects of the idea of construction from parts:

  • If a constructor fails (by throwing an exception), then parts that have been successfully constructed are destroyed, automatically, and memory for the object is deallocated, automatically.
  • During execution of the body of a constructor of a type T the object is of type T, so calls to virtual methods will resolve as if the object is of type T (which it is, at this point), where T can be a base class of the class you instantiated.

The first point means that with a properly designed C++ class, when you have an object at hand it's guaranteed usable as-is. If the construction fails then you simply don't end up with an object at hand.

Also, the rules of C++ are designed to ensure that for every object of most derived class T there is one and only one T constructor call. I used to call it the single constructor call guarantee. It's not specified as such any place in the standard, and you can foil it by using very low level facilities of the language, but it's there, it's what the detailed rules of the standard are designed to accomplish (in much the same way, you won't find any single rule about semicolon-termination of statements, yet all the myriad syntax rules for various statements conspire to yield a simple high level rule).

The single constructor call guarantee, and the automatic cleanup guarantee, and the changing type of an object as constructors of base classes are executed, are perhaps the three most important differences from Python object construction.
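
For contrast with the single constructor call guarantee, here is a sketch on the Python side, with a hypothetical `Counter` class. Since `__init__` is an ordinary method, nothing stops it from running again on an object that already exists.

```python
class Counter:  # hypothetical class, for illustration only
    init_calls = 0  # class-level tally of how often __init__ has run

    def __init__(self, start):
        Counter.init_calls += 1
        self.value = start


c = Counter(1)   # __init__ runs once as part of normal creation
c.__init__(5)    # and again, on the same object, as a plain method call

print(Counter.init_calls, c.value)
```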

There's much much more to be said, but I think these are the most important ideas.

Cheers & hth.,

Alf P. Steinbach