views:

835

answers:

10

Hello everyone,

my question is rather a design question. In Python, if code in your "constructor" fails, the object ends up not being defined. Thus:

someInstance = MyClass("test123") #lets say that constructor throws an exception
someInstance.doSomething() # will fail, name someInstance not defined.

I do have a situation though, where a lot of code copying would occur if i remove the error-prone code from my constructor. Basically my constructor fills a few attributes (via IO, where a lot can go wrong) that can be accessed with various getters. If I remove the code from the contructor, i'd have 10 getters with copy paste code something like :

  1. is attribute really set?
  2. do some IO actions to fill the attribute
  3. return the contents of the variable in question

I dislike that, because all my getters would contain a lot of code. Instead of that I perform my IO operations in a central location, the constructor, and fill all my attributes.

Whats a proper way of doing this?

+4  A: 

I'm not a Python developer, but in general, it's best to avoid complex/error-prone operations in your constructor. One way around this would be to put a "LoadFromFile" or "Init" method in your class to populate the object from an external source. This load/init method must then be called separately after constructing the object.

Andy White
Alright, i guess you are right. I hoped there would be a nicer approad, because I guess then I have to go through all of these checks. guess my code will look like that then: 1. empty constructor2. some fileLoading method3. every getter making a call whether the file has been loaded before doing its job<br>ugly, but I guess thats the only possible way
Tom
The factory idea suggested by laalto might be a good idea too. The factory will hide the 2-phased construction, so your app code doesn't need to know about it. (I'd upvote his answer, but I'm out of votes today :). I still stand by my statement that it's better to not throw exceptions from a constructor.
Andy White
+2  A: 

One common pattern is two-phase construction, also suggested by Andy White.

First phase: Regular constructor.

Second phase: Operations that can fail.

Integration of the two: Add a factory method to do both phases and make the constructor protected/private to prevent instantation outside the factory method.

Oh, and I'm neither a Python developer.

laalto
+15  A: 

In C++ at least, there is nothing wrong with putting failure-prone code in the constructor - you simply throw an exception if an error occurs. If the code is needed to properly construct the object, there reallyb is no alternative (although you can abstract the code into subfunctions, or better into the constructors of subobjects). Worst practice is to half-construct the object and then expect the user to call other functions to complete the construction somehow.

anon
Objective-C uses this 2-phase construction pattern pretty much everywhere for creating objects, so it must have some merit.
Andy White
Throwing an exception in C++ constructor does not call the destructor (since the object is not yet constructed). You'll have to make sure you clean up whatever you did manage to construct before throwing. auto_ptrs are handy.
laalto
@Andy I can't speak for Objective C, but this approach generally means storing flags of some sort to indicate construction status - this is error prone and (in C++ at least) unecessary.
anon
@laalto As you say, use of aotu_ptr and (better still) non-pointer sub-objects solves this problem. The alternative approach has exactly the same problems, except you now need to know when NOT to destroy a sub object.
anon
Yeah, I agree that it's ugly. I like the idea of hiding this in a factory, maybe that makes it a little better...
Andy White
Tom
'Tom When people say "I heard ..." one would like to know where they heard it. There is absolutely nothing wrong with throwing exceptions from a constructor. C++ does not have garbage collection, so this is not a GC issue. And if you ever move to C, your whole programming style will have to change anyway.
anon
Yes thats true Neil. "I heard" was related to many different locations, as usually i dont give much on single statements. I just kept hearing this on forums, computer science lectures, many different places. My whole style will change indeed, but its better to get used to some common practises, because some things wont change if you have a solid understand of what you are doing. I have been coding for quite a while now, but some patterns are pretty much language independant, and the two phase thing sounds like a solid pattern to build on and to remember, thanks to all of you :)
Tom
The two phase thing is neither solid nor good.
anon
hm now I'm getting into a dilemma of not knowing whats the "correct" solution (if you can call it that, as there's always more than one).
Tom
One way of deciding is to see how much code you have to write to implement the two different approaches - choose the one that requires the least code.
anon
Why is this answer overrated?Two-phase construction might be ugly in C++ unless using factories but not freeing resources by half-constructing object is wrong.RAII is both correct and readable.
lispmachine
A: 

If the code to initialise the various values is really extensive enough that copying it is undesirable (which it sounds like it is in your case) I would personally opt for putting the required initialisation into a private method, adding a flag to indicate whether the initialisation has taken place, and making all accessors call the initialisation method if it has not initialised yet.

In threaded scenarios you may have to add extra protection in case initialisation is only allowed to occur once for valid semantics (which may or may not be the case since you are dealing with a file).

jerryjvl
A: 

Again, I've got little experience with Python, however in C# its better to try and avoid having a constructor that throws an exception. An example of why that springs to mind is if you want to place your constructor at a point where its not possible to surround it with a try {} catch {} block, for example initialisation of a field in a class:

class MyClass
{
    MySecondClass = new MySecondClass();
    // Rest of class
}

If the constructor of MySecondClass throws an exception that you wish to handle inside MyClass then you need to refactor the above - its certainly not the end of the world, but a nice-to-have.

In this case my approach would probably be to move the failure-prone initialisation logic into an initialisation method, and have the getters call that initialisation method before returning any values.

As an optimisation you should have the getter (or the initialisation method) set some sort of "IsInitialised" boolean to true, to indicate that the (potentially costly) initialisation does not need to be done again.

In pseudo-code (C# because I'll just mess up the syntax of Python):

class MyClass
{
    private bool IsInitialised = false;

    private string myString;

    public void Init()
    {
        // Put initialisation code here
        this.IsInitialised = true;
    }

    public string MyString
    {
        get
        {
            if (!this.IsInitialised)
            {
                this.Init();
            }

            return myString;
        }
    }
}

This is of course not thread-safe, but I don't think multithreading is used that commonly in python so this is probably a non-issue for you.

Kragen
+2  A: 

It is not bad practice per se.

But I think you may be after a something different here. In your example the doSomething() method will not be called when the MyClass constructor fails. Try the following code:

class MyClass:
def __init__(self, s):
    print s
    raise Exception("Exception")

def doSomething(self):
    print "doSomething"

try:
    someInstance = MyClass("test123")
    someInstance.doSomething()
except:
    print "except"

It should print:

test123
except

For your software design you could ask the following questions:

  • What should the scope of the someInstance variable be? Who are its users? What are their requirements?

  • Where and how should the error be handled for the case that one of your 10 values is not available?

  • Should all 10 values be cached at construction time or cached one-by-one when they are needed the first time?

  • Can the I/O code be refactored into a helper method, so that doing something similiar 10 times does not result in code repetition?

  • ...

Ralph
+9  A: 

There is a difference between constructor in C++ and __init__ method in Python. Task of the constructor is to construct object and if it fails no destructor is called. Therefore if any resources where acquired before exception was thrown the cleanup shall be done before exiting constructor. Thus some prefer two-phase construction with most of construction done outside constructor (ugh).

Python have much more clean two-phase construction (construct than initialize) but many people confuse __init__ method (initializer) with constructor. The actual constructor in Python is called __new__. Contrary to C++ it does not take an (half-constructed) instance but returns one. The __init__ task is to initialize the created instance. When exception is raised in __init__ the destructor (__del__) is called as expected because object was already created (not properly initialized but still) when __init__ was called.

Answering your question:

In Python, if code in your "constructor" fails, the object ends up not being defined.

That's not precise. If exception is raised __init__ the object is created but not initialized properly (e.g. some attributes are not assigned). But when it does you probably don't have any reference to this object. Only destructor (if any) needs to check whether attribute actualy exist.

Whats a proper way of doing this?

In Python initialize objects in __init__ and fear not about exceptions. In C++ use RAII.


Update [about resource management]:

In garbage collected languages if you are dealing with resources, especially limited ones such as database connections, it's better not to release them in destructor. This is because objects are destroyed in non deterministic way and if you happen to have a loop of references, which is not always easy to tell, and at least one of objects in loop have destructor defined they will never be destroyed. Garbage collected languages have other means of dealing with resources. In Python it's with statement.

lispmachine
Ahhh, at last, a anwser from somebody who actually codes in Python. Thks, I was going to say that. What's more, it's perfectly OK to surround your object create with a try catch...
e-satis
A: 

seems Neil had a good point: my friend just pointed me to this:

http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization

which is basically what Neil said...

Tom
But it's C++/Ada not Python related. Also in managed (garbage collected) languages resource acquisition/release is not performed by constructors/destructors.
lispmachine
A: 

This isn't an answer, it's a question to add to this. Is there a way after INIT is performed and before the object is returned to the user to run some kind of AFTER_INIT code? I have been working with Python for about two weeks now so I am not sure if there is.

For example, I am using a CacheFTPHandler from urllib2. Every call to urllib2 should use this cached handler. But currently each method call that uses urllib2 has to do the following:

    if not self.cached_ftp_handler_created:
        self.create_cached_ftp_handler()
    with closing(urllib2.urlopen(str)) as url:
         blah blah


    def create_cached_ftp_handler(self):
      #called each time a urllib2 is used, but should only create one handler...
      if not self.cached_ftp_handler_created:
        hdlr = urllib2.CacheFTPHandler()
        hdlr.setMaxConns(self.__max_number_of_ftp_connections)
        opener = urllib2.build_opener(hdlr)
        urllib2.install_opener(opener)
        self.cached_ftp_handler_created = True

Rather than constantly checking to see if the handler has been cached, I'd rather do it automatically after the INIT method completes. I can't do it IN the INIT method for the reasons mentioned above.

I could do it with a Factory as mentioned above, but wondered if there might be some other way?

Thanks,

Paul

TallPaul
Create new question (and refer to this one if you feel it's necessary) instead of asking in answers section.
PiotrLegnica
A: 

Thanks, jerryjvl. That's pretty much what I am doing!

TallPaul