views:

286

answers:

11

I have seen conflicting advice on whether the following code is better

def function():
    ret_val = 0
    if some_condition():
        ret_val = 2
    else:
        ret_val = 3
    return ret_val

or whether this is better:

def function():
    if some_condition():
        return 2
    else:
        return 3

This is a simple example, and I've written it in python-style, but what I'm looking for is a general principle as to when to use some "accumulator" variable to keep track of the return value, or whether to use multiple exit points. I know different languages might have different reasons for using one style over the other, so I'd appreciate differing viewpoints for why a particular language might stick with a particular style. (Specifically in the past I've heard that structured programming in C avoids having multiple exit points for a function.)

+1  A: 

It depends on the language to a large extent, however I would go with the second method returning the value directly rather than imposing another superfluous variable. The second method is cleaner, more precise and therefore more maintainable in my opinion.

ennuikiller
+5  A: 

In Python, it is quite common to have a return statement in the middle of the function - in particular, if it is an early exit. Your example often is rewritten as

def function():
    if some_condition():
        return 2
    return 3

I.e. you drop the else case when the if ends with a return.

Martin v. Löwis
A: 

In a language with function prototypes like C++ or Java, the compiler enforces that you return something of the correct type, even if execution would otherwise fall off the end of the function. In Python, since there are no function prototypes, falling off the end of the function will return the special value None. For this reason, you may want to use an accumulator variable and an explicit return ret_val at the end when coding in Python. Or use another style that ensures that execution cannot fall off the end without returning a value.

Greg Hewgill
A: 

Returning values directly is not terrible for small functions like your example. However, if you have a large or complex function then multiple return points can be more difficult to debug. If you have a coding standard I'd refer to it (here the variable is preferred according to our company coding standard).

John D.
Multiple return points are usually easier to debug *especially* for large functions, because trying to avoid them typically leads to unnecessary nesting. Only stick your return value in a variable if it actually makes sense for the code.
Glenn Maynard
The point of having only one return point is to make it easy to cleanup any memory that was allocated within the function. It has nothing to do with looking pretty. If you have multiple return points you'll likely need the extra nesting to free allocated memory anyway.
John D.
+1  A: 

I suppose it's more a question of style and coding conventions. Generally, theory tells us that multiple exit points are bad. In practice it can be easier to follow to simply return inside each condition. The code is likely to be compiled down to very similar if not identical instructions, so it has little to no functional impact.

My rule of thumb is this: If the function is longer than one page (25 lines) avoid multiple exit points. If you can see it all at once, do whatever seems best at the time you write it.

Sorpigal
+5  A: 

Don't use an accumulator unless it's absolutely unavoidable. It introduces unnecessary statefulness and branching into your procedures, which you then have to track manually. By returning early, you can reduce the state and branch count of your code.

Specifically in the past I've heard that structured programming in C avoids having multiple exit points for a function.

Precisely the opposite -- structured programming discourages multiple points of entry, but multiple points of exit are acceptable and even encouraged (eg "guard clauses").

John Millikin
+1. I dislike it strongly when people still stick to the adage of "single exit point". This is a technique that never particularly made sense to me. Since methods should be short, you should be able to see all exit points on the screen anyways. Hear, Hear, to this.
snicker
@snicker: the "single exit point" philosophy is a cargo cult mis-representation of various good ideas, including Dijkstra's warning against the GOTO statement. If you ever meet somebody who encourages the use of accumulators, it is safe to ignore their opinions.
John Millikin
Noted. In the spirit of controversial analogies, this is similar to when people interpret religious texts as fuel to incite war.
snicker
Despite what's said in the comments earlier, mutiple exit points have a major drawback: if you allocate a resource in a function that must be deallocated on exit from this function, you'll have to add code to deallocate it before each and every exit point, and if you have many of them, you will forget to do this properly, sooner or later. This is especially important for code that is developed by more than one person
dmityugov
@dmityugov: Every modern language has the ability to automatically manage resource lifetimes, typically based on static scope. If you're using assembly or C, structured programming practices are more gentle recommendations then rules.
John Millikin
the question wasn't about modern languages only I am afraid
dmityugov
@dmityugov: if you're using a legacy language, then just use GOTO -- there's absolutely nothing wrong with using GOTO in a language like C.
John Millikin
+1  A: 

A further alternative in recent versions of Python (since 2.6?) is a ternary operator statement like this:

def function():
    return (2 if some_condition() else 3)

Just in case you like that better.

hughdbrown
you should drop the parentheses around the return expression though.
Martin v. Löwis
Such an absurd syntax structure; whoever came up with this out-of-order mess was smoking *something*.
Glenn Maynard
@Glenn Maynard: It reads pretty well. @Martin v. Löwis: yeah, you can remove it. I end up getting pushed into using parentheses most times I use this, though, so I've fallen into using them all the time.
hughdbrown
@Glenn: it probably originated with Perl (though I don't know if Perl stole it from somewhere else). They have the "unless" construct: return 5 unless someCondition;
rmeador
+1  A: 

For primitives, it doesn't matter. In a language like C++ (& presumably with structs in C it's the compiler will do something similar), the compiler is able to optimize the copy constructor out if you ensure all code paths return the same variable. For example:

Foo someFunction()
{
    Foo result(5);
    if (someConditionA())  return result;
    else if (someConditionB()) result.doSomething();
    result.doSomethingElse();
    return result;
}

becomes more efficient than (unless your compiler is very very good):

Foo someFunction()
{
   if (someConditionA()) return Foo(5);
   if (someConditionB()) { Foo result(5); result.doSomething(); result.doSomethingElse(); return result; }
   Foo result(5);
   result.doSomethingElse();
   return result;
}

In all other cases, it's more style-preference & readability. In the end, choose the format that's more readable for that particular case.

Vitali
+1  A: 

Although people advocate single exit strategy, I find it useful to return early. That way you don't have to keep track when you are adding code later.

+2  A: 

Stylistics aside, let's take a look at the disassembly for the two approaches:

>>> def foo():
...     r = 0
...     if bar():
...             r = 2
...     else:
...             r = 3
...     return r
... 
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (0)
              3 STORE_FAST               0 (r)

  3           6 LOAD_GLOBAL              0 (bar)
              9 CALL_FUNCTION            0
             12 JUMP_IF_FALSE           10 (to 25)
             15 POP_TOP             

  4          16 LOAD_CONST               2 (2)
             19 STORE_FAST               0 (r)
             22 JUMP_FORWARD             7 (to 32)
        >>   25 POP_TOP             

  6          26 LOAD_CONST               3 (3)
             29 STORE_FAST               0 (r)

  7     >>   32 LOAD_FAST                0 (r)
             35 RETURN_VALUE

14 bytecode instructions in the first approach...

>>> def quux():
...     if bar():
...             return 2
...     else:
...             return 3
... 
>>> dis.dis(quux)
  2           0 LOAD_GLOBAL              0 (bar)
              3 CALL_FUNCTION            0
              6 JUMP_IF_FALSE            5 (to 14)
              9 POP_TOP             

  3          10 LOAD_CONST               1 (2)
             13 RETURN_VALUE        
        >>   14 POP_TOP             

  5          15 LOAD_CONST               2 (3)
             18 RETURN_VALUE        
             19 LOAD_CONST               0 (None)
             22 RETURN_VALUE

11 in the second approach...

And a third approach, slightly shorter than the second:

>>> def baz():
...     if bar():
...             return 2
...     return 3
... 
>>> dis.dis(baz)
  2           0 LOAD_GLOBAL              0 (bar)
              3 CALL_FUNCTION            0
              6 JUMP_IF_FALSE            5 (to 14)
              9 POP_TOP             

  3          10 LOAD_CONST               1 (2)
             13 RETURN_VALUE        
        >>   14 POP_TOP             

  4          15 LOAD_CONST               2 (3)
             18 RETURN_VALUE

Has just nine instructions. The differences may not seem like much, but it actually makes a bit of a difference over a million runs with timeit, with bar defined to return alternating zeros and ones:

$ sudo nice -n -19 python b.py
('foo', 1.3846859931945801)
('quux', 1.282526969909668)
('baz', 1.2973799705505371)
$ sudo nice -n -19 python b.py
('foo', 1.354640007019043)
('quux', 1.2609632015228271)
('baz', 1.2767179012298584)

$ sudo nice -n -19 python3 b.py
foo 1.72521305084
quux 1.62322306633
baz 1.62547206879
$ sudo nice -n -19 python3 b.py
foo 1.73264288902
quux 1.67029309273
baz 1.62204194069

quux and baz tended to be close to the same time, both of which were consistently faster than foo.

If you're still on the fence about which one is better, hopefully this illustrates another advantage of the accumulator-less approach that nobody else mentioned so far.

Mark Rushakoff
+7  A: 

Did we forget why "multiple exit points" was considered harmful in the first place? Back in the day (before widespread access to good exception handling and finally constructs, or managing objects like auto_ptr that do cleanup when they leave scope), this was the problem that haunted many multi-exit functions:

int function blah(arg1, arg2)
    allocate resource

    if early failure detection
        return failure_status

    ... much later...

    release resource // oh rats! resource didn't release
    return success_status

If the resource is memory, this creates a memory leak. If it's a database transaction, we are heading for bad database contention or deadlock. For that matter, with the advent of more exception support, we implicitly add many potential exits from a method (by virtue of an unhandled exception). In my C++ days, I developed the habit of never calling delete, but instead using auto_ptr, so that allocated memory was cleaned up when the auto_ptr exited its scope, even if some unexpected exception reared its head.

In our garbage collected Python world, we can still have this issue, even though many of our objects, such as files, or locks, have improved self-cleaning behavior. But in implementations other than CPython (jython and IronPython to name two), there is no guarantee just when a destructor will get called, so something more proactive needs to be built into your method. The first mechanism for this purpose was try/finally:

int function blah(arg1, arg2)
    allocate resource
    try:

        if early failure detection
            return failure_status

        ... much later...
        return success_status

    finally:
        release resource // always releases no matter what

But now Python has context managers, in conjunction with the new 'with' syntax:

int function blah(arg1, arg2)
    allocate resource
    with context_manager(resource): // releases on exit from 'with'

        if early failure detection
            return failure_status

        ... much later...
        return success_status

So let's be sure that we tell the whole story, that the reason we can discard this old chestnut is that newer coding practices make it unnecessary.

Paul McGuire