Question:

What is considered to be "best practice" for handling errors in a constructor, and why?

"Best Practice" can be a quote from Schwartz, or 50% of CPAN modules use it, etc...; but I'm happy with well reasoned opinion from anyone even if it explains why the common best practice is not really the best approach.

As for my own view of the topic (informed by software development in Perl for many years), I have seen three main approaches to error handling in a Perl module (listed from best to worst in my opinion):

  1. Construct an object, set an invalid flag (usually "is_valid" method). Often coupled with setting error message via your class's error handling.

    Pros:

    • Allows error handling that is consistent with other method calls: you can use $obj->errors() type calls after a bad constructor just like after any other method call.

    • Allows for additional info to be passed (e.g. >1 error, warnings, etc...)

    • Allows for lightweight "redo"/"fixme" functionality. In other words, if the object that is constructed is very heavy, with many complex attributes that are 100% always OK, and the only reason it is not valid is that someone entered an incorrect date, you can simply do "$obj->setDate()" instead of incurring the overhead of re-executing the entire constructor. This pattern is not always needed, but can be enormously useful in the right design.

    Cons: None that I'm aware of.

  2. Return "undef".

    Cons: Cannot achieve any of the Pros of the first solution (per-object error messages outside of global variables and lightweight "fixme" capability for heavy objects).

  3. Die inside the constructor. Outside of some very narrow edge cases, I personally consider this an awful choice for too many reasons to list on the margins of this question.

  4. UPDATE: Just to be clear, I consider the (otherwise very worthy and great) design of having a very simple constructor that can't fail at all and a heavy initializer method where all the error checking occurs to be merely a subset of either case #1 (if the initializer sets error flags) or case #3 (if the initializer dies) for the purposes of this question. Obviously, by choosing such a design, you automatically reject option #2.
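
For readers who have not seen approach #1 in practice, here is a minimal sketch of it. The `Event` class, its `set_date` method and the date format are all made up for illustration; only the `is_valid` / `errors` naming comes from the question.

```perl
use strict;
use warnings;

# Hypothetical class illustrating approach #1: new() always returns an
# object and records problems instead of failing.
package Event;

sub new {
    my ($class, %args) = @_;
    my $self = bless { errors => [] }, $class;
    $self->set_date($args{date});
    return $self;
}

# Validate and store the date; record an error on bad input.
sub set_date {
    my ($self, $date) = @_;
    # Drop any earlier date error so that a retry starts clean.
    $self->{errors} = [ grep { !/date/ } @{ $self->{errors} } ];
    if (defined $date && $date =~ /\A\d{4}-\d{2}-\d{2}\z/) {
        $self->{date} = $date;
    }
    else {
        push @{ $self->{errors} }, 'invalid date';
    }
    return $self;
}

sub is_valid { return !@{ $_[0]{errors} } }
sub errors   { return @{ $_[0]{errors} } }

package main;

my $event = Event->new(date => 'tomorrow');
print join(', ', $event->errors), "\n" unless $event->is_valid;

# The lightweight "fixme": repair one attribute, keep everything else.
$event->set_date('2010-01-15');
print "valid now\n" if $event->is_valid;
```

Note that nothing forces the caller to look at `is_valid`, which is the objection several answers raise.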

+5  A: 

I prefer:

  1. Do as little initialization as possible in the constructor.
  2. croak with an informative message when something goes wrong.
  3. Use appropriate initialization methods to provide per object error messages etc

In addition, returning undef (instead of croaking) is fine when the users of the class may not care why exactly the failure occurred, only whether they got a valid object or not.

I despise easy to forget is_valid methods or adding extra checks to ensure methods are not called when the internal state of the object is not well defined.

I say these from a very subjective perspective without making any statements about best practices.
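
A rough sketch of what those three preferences might look like together. The `Report` class, its `title` argument and the `load_rows` method are hypothetical names chosen only for this example.

```perl
use strict;
use warnings;

package Report;
use Carp qw(croak);

# The constructor does almost nothing, so it has almost nothing to get wrong.
sub new {
    my ($class, %args) = @_;
    croak 'Report->new requires a "title" argument'
        unless defined $args{title};
    return bless { title => $args{title} }, $class;
}

# Heavier work lives in its own method and reports failure per object.
sub load_rows {
    my ($self, $rows) = @_;
    unless (ref($rows) eq 'ARRAY') {
        $self->{error} = 'rows must be an array reference';
        return;
    }
    $self->{rows} = $rows;
    return 1;
}

sub error { return $_[0]{error} }

package main;

my $report = Report->new(title => 'Q1');
$report->load_rows('oops')
    or print 'load failed: ', $report->error, "\n";
```

Because `croak` reports the error from the caller's perspective, a missing `title` points at the line that called `Report->new`, not at the module.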

Sinan Ünür
Sinan - I debated including the "light constructor and heavy initializer" approach in this question but figured it is not different enough to mention... but I do agree it's a very worthy design that we frequently use.
DVK
@DVK Note that I am not advocating a colossal `sub initialize`. I think breaking down the steps into multiple methods is perfectly acceptable.
Sinan Ünür
+4  A: 

It depends on how you want your constructors to behave.

The rest of this response goes into my personal observations, but as with most things Perl, Best Practices really boils down to "Here's one way to do it, which you can take or leave depending on your needs." Your preferences as you described them are totally valid and consistent, and nobody should tell you otherwise.

I actually prefer to die if construction fails, because we set it up so that the only types of errors that can occur during object construction really are big, obvious errors that should halt execution.

On the other hand, if you prefer that not to happen, I think I'd prefer 2 over 1, because it's just as easy to check for an undefined object as it is to check for some flag variable. This isn't C, so we don't have a strong typing constraint telling us that our constructor MUST return an object of this type. So returning undef, and checking for that to establish success or failure, is a great choice.

The 'overhead' of construction failure is a consideration in certain edge cases (where you can't quickly fail before incurring overhead), so for those you might prefer method 1. So again, it depends on what semantics you've defined for object construction. For example, I prefer to do heavyweight initialization outside of construction. As to standardization, I think that checking whether a constructor returns a defined object is as good a standard as checking a flag variable.

EDIT: In response to your edit about initializers rejecting case #2, I don't see why an initializer can't simply return a value that indicates success or failure rather than setting a flag variable. Actually, you may want to use both, depending on how much detail you want about the error that occurred. But it would be perfectly valid for an initializer to return true on success and undef on failure.

Adam Bellaire
Should have said C++, as we are talking about object construction, but the point was the typing. ;)
Adam Bellaire
Adam - I plan to have another question about "die" in Perl modules, but suffice it to say that one of my objections is the bandwidth of "die" - you can only pass one string to the caller, as opposed to having an unlimited amount of info available from the object you constructed (list of errors, list of warnings, etc...). The reason I mention it is because it's the same reason I don't like returning undefs - no bandwidth (you don't even return an error string).
DVK
@DVK: Huh? Die can be given an object. That's a standard exception-handling metaphor in perl: `die someException->new()`. Then your calling context checks `$@` for the object and its contents.
Adam Bellaire
@DVK: In particular, we use that method for our constructors because we can die and analyze the exception object to see what was going on to cause the failure. While your methods are perfectly sound (using flag data/methods), avoiding `die` because you can only get a string isn't correct: you can get more.
Adam Bellaire
Check out `Exception::Class` as a handy way of throwing objects as exceptions. (http://search.cpan.org/perldoc?Exception::Class)
cjm
+1  A: 

First the pompous general observations:

  1. A constructor's job should be: Given valid construction parameters, return a valid object.
  2. A constructor that does not construct a valid object cannot perform its job and is therefore a perfect candidate for exception generation.
  3. Making sure the constructed object is valid is part of the constructor's job. Handing out a known-to-be-bad object and relying on the client to check that the object is valid is a surefire way to wind up with invalid objects that explode in remote places for non-obvious reasons.
  4. Checking that all the correct arguments are in place before the constructor call is the client's job.
  5. Exceptions provide a fine-grained way of propagating the particular error that occurred without needing to have a broken object in hand.
  6. return undef; is always bad[1]
  7. bIlujDI' yIchegh()Qo'; yIHegh()!

Now to the actual question, which I will construe to mean "what do you, darch, consider the best practice and why". First, I'll note that returning a false value on failure has a long Perl history (most of the core works that way, for example), and a lot of modules follow this convention. However, it turns out this convention produces inferior client code and newer modules are moving away from it.[2]

[The supporting argument and code samples for this turn out to be the more general case for exceptions that prompted the creation of autodie, and so I will resist the temptation to make that case here. Instead:]

Having to check for successful creation is actually more onerous than checking for an exception at an appropriate exception-handling level. The other solutions require the immediate client to do more work than it should have to just to obtain an object, work that is not required when the constructor fails by throwing an exception.[3] An exception is vastly more expressive than undef and equally expressive as passing back a broken object for purposes of documenting errors and annotating them at various levels in the call stack.

You can even get the partially-constructed object if you pass it back in the exception. I think this is a bad practice per my belief about what a constructor's contract with its clients ought to be, but the behavior is supported. Awkwardly.

So: A constructor that cannot create a valid object should throw an exception as early as possible. The exceptions a constructor can throw should be documented parts of its interface. Only the calling levels that can meaningfully act on the exception should even look for it; very often, the behavior of "if this construction fails, don't do anything" is exactly correct.
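
To make that concrete, here is a small sketch using a hand-rolled exception class. `Account`, `Account::Error::BadBalance` and `open_accounts` are invented for the example; in real code a module such as `Exception::Class` could replace the hand-written class.

```perl
use strict;
use warnings;

# Hypothetical exception class; a documented part of Account's interface.
package Account::Error::BadBalance;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub balance { return $_[0]{balance} }
sub throw   { my $class = shift; die $class->new(@_) }

package Account;

# Either returns a valid Account or throws Account::Error::BadBalance.
sub new {
    my ($class, %args) = @_;
    Account::Error::BadBalance->throw(balance => $args{balance})
        if $args{balance} < 0;
    return bless { balance => $args{balance} }, $class;
}

package main;

# Intermediate code just constructs; it has no error handling at all.
sub open_accounts { return map { Account->new(balance => $_) } @_ }

# Only this level can act on the failure, so only it looks for it.
my @accounts;
my $ok = eval { @accounts = open_accounts(10, -5, 20); 1 };
if (!$ok) {
    my $err = $@;
    if (ref($err) && $err->isa('Account::Error::BadBalance')) {
        print "rejected balance: ", $err->balance, "\n";
    }
    else {
        die $err;    # not ours to handle; rethrow
    }
}
```

No partially valid `Account` ever escapes: the caller either gets every object or gets an exception naming the offending balance.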

[1]: By which I mean, I am not aware of any use cases where return; is not strictly superior. If someone calls me on this I might have to actually open a question. So please don't. ;)
[2]: Per my extremely unscientific recollection of the module interfaces I've read in the last two years, subject to both selection and confirmation biases.
[3]: Note that throwing an exception does still require error-handling, as would the other proposed solutions. This does not mean wrapping every instantiation in an eval unless you actually want to do complex error-handling around every construction (and if you think you do, you're probably wrong). It means wrapping the call which is able to meaningfully act on the exception in an eval.
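
Regarding footnote [1], a short illustration of how the two forms differ once the caller uses list context (the sub names are made up):

```perl
use strict;
use warnings;

sub fail_undef { return undef }   # always a one-element list containing undef
sub fail_bare  { return }         # empty list in list context, undef in scalar

my @from_undef = fail_undef();
my @from_bare  = fail_bare();

print "return undef gave ", scalar(@from_undef), " element(s)\n";
print "bare return gave ",  scalar(@from_bare),  " element(s)\n";

# A list-context check is fooled by return undef: the array is non-empty.
print "looks like success\n" if @from_undef;
print "looks like failure\n" unless @from_bare;
```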

darch
I haven't analyzed this response in detail yet but I must very majorly disagree with: "4. Checking that all the correct arguments are in place before the constructor call is the client's job". This 100% completely violates any and all OO principles as far as I'm concerned. The only code with the knowledge of and the functionality to check the arguments to a constructor should be the class where the constructor lives.
DVK
The client's job is to make sure that it only uses the class' interface in supported ways, including passing in all the arguments the interface claims it requires to function. Checking the *validity* of the arguments is an appropriate job for the invokee, but the invokee is under no obligation to do anything if his interface is violated, and especially not to pass back an object that violates his class invariants because the caller could not be bothered to make the call correctly.
darch
+2  A: 

I would recommend against #1 simply because it leads to more error handling code which will not be written. For example, if you just return false then this works fine.

my $obj = Class->new or die "Construction failed...";

But if you return an object which is invalid...

my $obj = Class->new;
die "Construction failed @{[ $obj->error_message ]}" unless $obj->is_valid;

And as the quantity of error handling code increases, the probability of it being written decreases. And it's not linear. By increasing the complexity of your error handling system you actually decrease the number of errors it will catch in practical use.

You also have to be careful that your invalid object in question dies when any method is called (aside from is_valid and error_message) leading to yet more code and opportunities for mistakes.

But I agree there is value in being able to get information about the failure, which makes returning false (just `return`, not `return undef`) inferior. Traditionally this is done by calling a class method or reading a global variable, as in DBI.

my $dbh = DBI->connect($data_source, $username, $password) or die $DBI::errstr;

But it suffers from A) you still have to write error handling code and B) it's only valid for the last operation.
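
Point B is easy to see in a toy version of the pattern. This is not DBI's implementation, just a sketch with a hypothetical `Connection` class that keeps its error in a package variable:

```perl
use strict;
use warnings;

package Connection;

our $errstr;    # class-wide slot, overwritten by every failed call

sub new {
    my ($class, %args) = @_;
    for my $key (qw(host port)) {
        next if defined $args{$key};
        $errstr = "missing $key";
        return;    # plain return: false in scalar and in list context
    }
    return bless {%args}, $class;
}

package main;

my $no_host = Connection->new(port => 5432);
my $no_port = Connection->new(host => 'db.example');

# Two failures, one slot: the reason for the first failure is gone.
print "last error: $Connection::errstr\n";
```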

The best thing to do, in general, is throw an exception with croak. Now in the normal case the user writes no special code, the error occurs at the point of the problem, and they get a good error message by default.

my $obj = Class->new;

Perl's traditional recommendation against throwing exceptions in library code, on the grounds that it is impolite, is outdated. Perl programmers are (finally) embracing exceptions. Instead of you writing error handling code over and over again, badly, and often forgetting it, exceptions DWIM. If you're not convinced, just start using autodie (watch pjf's video about it) and you'll never go back.

Exceptions align Huffman encoding with actual use. The common case of expecting the constructor to just work and wanting an error if it doesn't is now the least code. The uncommon case of wanting to handle that error requires writing special code. And the special code is pretty small.

my $obj = eval { Class->new } or do { something else };

If you find yourself wrapping every call in an eval you are doing it wrong. Exceptions are called that because they are exceptional. If, as in your comment above, you want graceful error handling for the user's sake, then take advantage of the fact that errors bubble up the stack. For example, if you want to provide a nice user error page and also log the error you can do this:

eval {
    run_the_main_web_code();
} or do {
    log_the_error($@);
    print_the_pretty_error_page();
};

You only need it in one place, at the top of your call stack, rather than scattered everywhere. You can take advantage of this at smaller increments, for example...

my $users = eval { Users->search({ name => $name }) } or do {
    # ...handle an error while finding a user...
};

There are two things going on. 1) Users->search always returns a true value, in this case an array ref. That makes the simple my $obj = eval { Class->method } or do work. That's optional. But more importantly 2) you only need to put special error handling around Users->search. All the methods called inside Users->search and all the methods they call... they just throw exceptions. And they're all caught at one point and handled the same. Handling the exception at the point which cares about it makes for much neater, more compact and more flexible error handling code.
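
When the method can legitimately return a false value, a common variation is to end the eval block with a literal 1 and assign the result inside it. A small sketch, with a hypothetical `count_matches` function standing in for the real call:

```perl
use strict;
use warnings;

# Hypothetical function that can legitimately return a false value (0).
sub count_matches {
    my ($name) = @_;
    die "name is required\n" unless defined $name;
    return 0;
}

# The block's value is the trailing 1, so only a die makes it false.
my $count;
my $ok = eval { $count = count_matches('alice'); 1 };
print $ok ? "count is $count\n" : "failed: $@";

# Without the trailing 1, the false result is mistaken for a failure.
my $naive = eval { count_matches('alice') }
    or print "wrongly treated as an error\n";
```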

You can pack more information into the exception by croaking with a string overloaded object rather than just a string.

my $obj = eval { Class->new }
  or die "Construction failed: $@ and there were @{[ $@->num_frobnitz ]} frobnitzes";
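
A minimal sketch of such an object follows. `Widget` and `Widget::Error` are invented for the example, and it throws with plain `die`, which passes the object through untouched.

```perl
use strict;
use warnings;

# Hypothetical exception class that stringifies to its message.
package Widget::Error;
use overload '""' => sub { $_[0]{message} }, fallback => 1;

sub new          { my ($class, %args) = @_; return bless {%args}, $class }
sub num_frobnitz { return $_[0]{num_frobnitz} }

package Widget;

sub new {
    my ($class, %args) = @_;
    my $count = $args{frobnitzes} // 0;
    die Widget::Error->new(
        message      => 'too many frobnitzes',
        num_frobnitz => $count,
    ) if $count > 3;
    return bless { frobnitzes => $count }, $class;
}

package main;

my $widget = eval { Widget->new(frobnitzes => 7) } or do {
    my $err = $@;
    # Interpolates like a plain string, yet still answers method calls.
    print "Construction failed: $err (", $err->num_frobnitz, " frobnitzes)\n";
};
```

Callers that only print `$@` keep working unchanged, while callers that want detail can call methods on it.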

Exceptions:

  • Do the right thing without any thought by the caller
  • Require the least code for the most common case
  • Provide the most flexibility and information about the failure to the caller

Modules such as Try::Tiny fix most of the hanging issues surrounding using eval as an exception handler.

As for your use case where you might have a very expensive object and want to try to continue with it partially built... smells like YAGNI to me. Do you really need it? Or do you have a bloated object design which is doing too much work too early? If you do need it, you can put the information necessary to continue the construction in the exception object.

Schwern
@Schwern - The answers to this Q convinced me to re-evaluate my outlook on using "die"... My major remaining concern is - are there any performance implications to using evals around so much code? (I'll probably post that as a separate SO question, so please consider this a rhetorical comment as opposed to request for info :) )
DVK
@DVK By "evals around so much code" do you mean the number of times you write `eval` or the amount of code run inside a single `eval`? If it's the former, you're doing it wrong; you should not be writing eval so many times that it's a performance concern. If it's the latter, it should not matter. Remember `eval BLOCK` and `eval STRING` are completely different beasties.
Schwern
@Schwern - the latter. I'm not familiar with the details of eval BLOCK other than how to use it, thus not certain. I'll post the Q after some research.
DVK