views:

493

answers:

5

Hi,

In general, let's say you have a method like the below.

def intersect_two_lists(self, list1, list2):
    if not list1:
        self.trap_error("union_two_lists: list1 must not be empty.")
        return False
    if not list2:
        self.trap_error("union_two_lists: list2 must not be empty.")
        return False
    #http://bytes.com/topic/python/answers/19083-standard
    return filter(lambda x:x in list1,list2)

In this particular method when errors are found, I would not want to return the empty list in this case because that could have been the real answer to this specific method call, I want to return something to indicate the parameters were incorrect. So I returned False on error in this case, and a list otherwise (empty or not).

My question is, what is the best practice in areas like this, and not just for lists?Return whatever the heck I want and make sure I document it for a user to read? :-) What do most of you folks do:

  1. If on success you were supposed to return True or False and you catch an error?
  2. If on success you were supposed to return a list and you catch an error?
  3. If on success you were supposed to return a file handle and you catch an error?
  4. et cetera

Thanks,

Paul

+3  A: 

I like to return a tuple:

(True, some_result)

(False, some_useful_response)

The *some_useful_response* object can be used for handling the return condition or can serve to display debugging info.

NOTE: this technique applies for return values of any sort. It should not be mistaken with exception cases.

On the receiving end, you just have to unpack:

Code, Response = some_function(...)

This technique applies for the "normal" control flow: one must use the exception functionality when some unexpected inputs / processing occur.

Also worth noting: this technique helps normalizing function returns. Both the programmer and the user of the functions know what to expect.

DISCLAIMER: I come from an Erlang background :-)

jldupont
Tuple, you mean. A set is something a bit different. :-)
kwatford
Yep Tuple... thanks!
jldupont
+1: interesting thought. I don't know why it never occurs to me to return `(status, result)` tuples.
D.Shawley
+1: I was about to answer the same :-D Exceptions are, as their name indicates, for exceptional situations, for validating inputs that you except to be faulty on a normal basis (e.g. input from keyboard), it's usually better to return an error code... remember the Zen: explicit is better than implicit! :-)
fortran
I can't believe this is being encouraged. It is a design nightmare to use error codes. Check Ned's article for more details: http://nedbatchelder.com/text/exceptions-vs-status.html
Nadia Alramli
@nadia: I believe you are probably mistaking *return value* with *exception*. I am proposing a *return value* technique.
jldupont
@nadia: returning tuples is another method and quite a valid one. Many of the "exception vs. status" arguments are specific to languages that make it difficult to return multiple things and capture the result - i.e., where you can't do `status, result = f()` and you are forced to do something silly like `result = f(` to achieve the same thing. Tuples are first class types in Python so we might as well use them. If this is done consistently throughout an API, it is quite elegant and efficient.
D.Shawley
+9  A: 

It'd be better to raise an exception than return a special value. This is exactly what exceptions were designed for, to replace error codes with a more robust and structured error-handling mechanism.

class IntersectException(Exception):
    def __init__(self, msg):
        self.msg = msg
    def __str__(self):
        return self.msg

def intersect_two_lists(self, list1, list2):
    if not list1: raise IntersectException("list1 must not be empty.")
    if not list2: raise IntersectException("list2 must not be empty.")

    #http://bytes.com/topic/python/answers/19083-standard
    return filter(lambda x:x in list1,list2)

In this specific case though I'd probably just drop the tests. There's nothing wrong with intersecting empty lists, really. Also lambda is sort of discouraged these days in preference to list comprehensions. See http://stackoverflow.com/questions/642763/python-intersection-of-two-lists for a couple of ways to write this without using lambda.

John Kugelman
+1: Exceptions rule.
S.Lott
This doesn't really test against empty lists only though. Consider calls like `intersect_two_lists(0, 0)` or `intersect_two_lists('', None)`. They are both incorrect but the exception would state that the `list1` was empty when it wasn't even the correct type.
D.Shawley
You are right this doesn't check specifically for empty list, but neither did the original code. Python style is more to go ahead and use an object rather than fret beforehand whether it is a specific type.
Ned Batchelder
+18  A: 

First, whatever you do don't return a result and an error message. That's a really bad way to handle errors and will cause you endless headaches. If you need to indicate an error always raise an exception.

I usually tend to avoid raising errors unless it is necessary. In your example throwing an error is not really needed. Intersecting an empty list with a non empty one is not an error. The result is just empty list and that is correct. But let's say you want to handle other cases. For example if the method got a non-list type. In this case it is better to raise an exception. Exception are nothing to be afraid of.

My advice for you is to look at the Python library for similar functions and see how Python handles those special cases. For example have a look at the intersection method in set, it tends to be forgiving. Here I'm trying to intersect an empty set with an empty list:

>>> b = []
>>> a = set()
>>> a.intersection(b)
set([])

>>> b = [1, 2]
>>> a = set([1, 3])
>>> a.intersection(b)
set([1])

Errors are only thrown when needed:

>>> b = 1
>>> a.intersection(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable

Sure, there are cases where returning True or False on success or failure can be good. But it is very important to be consistent. The function should always return the same type or structure. It is very confusing to have a function that could return a list or a boolean. Or return the same type but the meaning of this value can be different in case of an error.

EDIT:

The OP says:

I want to return something to indicate the parameters were incorrect.

Nothing says there is an error better than an exception. If you want to indicate the parameters are incorrect then use exceptions and put a helpful error message. Returning a result in this case is just confusing. There might other cases where you want to indicate that nothing has happened but it is not an error. For example if you have a method that deletes entries from a table and the entry requested for deletion does not exist. In this case it might be fine to just return True or False on success or failure. It depends on the application and the intended behaviour

Nadia Alramli
@S.Lott, I said nothing that contradicts with this. But it is very important to look at the library for guidance to be consistent. There is nothing wrong with that and it should be encouraged.
Nadia Alramli
@S.Lott, I edited the answer to clear the exception point further.
Nadia Alramli
+1: returning a consistent result is definitely a requirement from the usability standpoint.
D.Shawley
I just looked at sets.py in Python 2.5.4. It seems to be returning an inconsistent return type in it's code in the __or__ method (and others). For example, if not isinstance(other, BaseSet): return NotImplemented return self.union(other)This will either return a constant (NotImplemented) or the union of the two sets. Am I reading this incorrectly?
TallPaul
Nadia Alramli
@TallPaul, another good explanation: http://jcalderone.livejournal.com/32837.html
Nadia Alramli
Nadia, thanks for your time and your engery in all of your explanations to a Python rookie!
TallPaul
+3  A: 

Exceptions are definitely better (and more Pythonic) than status returns. For much more on this: Exceptions vs. status returns

Ned Batchelder
+1, nice article.
Nadia Alramli
A: 

The general case is to throw exceptions for exceptional circumstances. I wish that I could remember the exact quote (or who said it), but you should strive for functions that accept as many values and types as is reasonable and maintain a very narrowly defined behavior. This is a variant of what Nadia was talking about. Consider the following usages of your function:

  1. intersect_two_lists(None, None)
  2. intersect_two_lists([], ())
  3. intersect_two_lists('12', '23')
  4. intersect_two_lists([1, 2], {1: 'one', 2: 'two'})
  5. intersect_two_lists(False, [1])
  6. intersect_two_lists(None, [1])

I would expect that (5) throws an exception since passing False is a type error. The rest of them, however, make some sort of sense but it really depends on the contract that the function states. If intersect_two_lists were defined as returning the intersection of two iterables, then everything other than (5) should work as long as you make None a valid representation of the empty set. The implementation would be something like:

def intersect_two_lists(seq1, seq2):
    if seq1 is None: seq1 = []
    if seq2 is None: seq2 = []
    if not isinstance(seq1, collections.Iterable):
        raise TypeError("seq1 is not Iterable")
    if not isinstance(seq2, collections.Iterable):
        raise TypeError("seq1 is not Iterable")
    return filter(...)

I usually write helper functions that enforce whatever the contract is and then call them to check all of the preconditions. Something like:

def require_iterable(name, arg):
    """Returns an iterable representation of arg or raises an exception."""
    if arg is not None:
        if not isinstance(arg, collections.Iterable):
            raise TypeError(name + " is not Iterable")
        return arg
    return []

def intersect_two_lists(seq1, seq2):
    list1 = require_iterable("seq1", seq1)
    list2 = require_iterable("seq2", seq2)
    return filter(...)

You can also extend this concept and pass in the "policy" as an optional argument. I would not advise doing this unless you want to embrace Policy Based Design. I did want to mention it just in case you haven't looked into this option before.

If the contract for intersect_two_lists is that it only accepts two non-empty list parameters, then be explicit and throw exceptions if the contract is breached:

def require_non_empty_list(name, var):
    if not isinstance(var, list):
        raise TypeError(name + " is not a list")
    if var == []:
        raise ValueError(name + " is empty")

def intersect_two_lists(list1, list2):
    require_non_empty_list('list1', list1)
    require_non_empty_list('list2', list2)
    return filter(...)

I think that the moral of the story is whatever you do, do it consistently and be explicit. Personally, I usually favor raising exceptions whenever a contract is breached or I am given a value that I really cannot use. If the values that I am given are reasonable, then I try to do something reasonable in return. You might also want to read the C++ FAQ Lite entry on exceptions. This particular entry gives you some more food for thought about exceptions.

D.Shawley
Very nice, thanks. D, how do you document what all exceptions anyone calling your methods would have to handle? What I mean is, if you put in the __doc__ section of the method 'raises exceptions TypeError and ValueError (via calls to require_non_empty_list, etc)' and then later you change/add/edit/delete the exceptions require_non_empty_list raises... how do you ensure that all methods using require_non_empty_list have been properly re-documented? Sorry for the wordy comment! :-)
TallPaul
Generally, the caller should expect any exception and not specific types. The key is that exceptions should have common meaning regardless of their context. Making the meaning of an exception depend on where it is generated is a recipe for disaster. This is one of the primary reasons that I reserve exceptions for exceptional circumstances. If there is a problem that can be _dealt with_ in the function, then you shouldn't be raising an exception you should be defining how the function behaves in the identified edge case.
D.Shawley
... oh... as for documentation. That is where you document the behavior in edge cases and the pre-conditions of the function. _"intersect_two_lists(Iterable,Iterable) -> list"_ states that you had better be sending two iterables in. If you don't `intersect_two_lists` is going to fail.
D.Shawley