I've seen numerous arguments that using a return value is preferable to out parameters. I am convinced by the reasons for avoiding them, but I'm unsure whether I'm running into cases where they are unavoidable.

Part One of my question is: What are some of your favorite/common ways of getting around using an out parameter? Stuff along the lines of: Man, in peer reviews I always see other programmers do this when they could have easily done it this way.

Part Two of my question deals with some specific cases I've encountered where I would like to avoid an out parameter but cannot think of a clean way to do so.

Example 1: I have a class with an expensive copy that I would like to avoid. Work can be done on the object and this builds up the object to be expensive to copy. The work to build up the data is not exactly trivial either. Currently, I will pass this object into a function that will modify the state of the object. This to me is preferable to new'ing the object internal to the worker function and returning it back, as it allows me to keep things on the stack.

class ExpensiveCopy // Defines some interface I can't change.
{
public:
    ExpensiveCopy(); // Needed for "ExpensiveCopy toModify;" below.
    ExpensiveCopy(const ExpensiveCopy& toCopy) { /* Ouch! This hurts. */ }
    ExpensiveCopy& operator=(const ExpensiveCopy& toCopy) { /* Ouch! This hurts. */ return *this; }

    void addToData(SomeData);
    SomeData getData();
};

class B
{
public:
    static void doWork(ExpensiveCopy& ec_out, int someParam);
    //or
    // Your Function Here.
};

Using my function, I get calling code like this:

const int SOME_PARAM = 5;
ExpensiveCopy toModify;
B::doWork(toModify, SOME_PARAM);

I'd like to have something like this:

ExpensiveCopy theResult = B::doWork(SOME_PARAM);

But I don't know if this is possible.

Example 2: I have an array of objects. The objects in the array are a complex type, and I need to do work on each element, work that I'd like to keep separated from the main loop that accesses each element. The code currently looks like this:

void doWork(ComplexType& ct_out)
{
    // Do work on the individual element.
}

std::vector<ComplexType> theCollection;
for (std::size_t index = 0; index < theCollection.size(); ++index)
{
    doWork(theCollection[index]);
}

Any suggestions on how to deal with some of these situations? I work primarily in C++, but I'm interested to see if other languages facilitate an easier setup. I have encountered RVO (return value optimization) as a possible solution, but I need to read up more on it and it sounds like a compiler-specific feature.

+6  A: 

I'm not sure why you're trying to avoid passing references here. It's pretty much for these situations that pass-by-reference semantics exist.

The code

static void doWork(ExpensiveCopy& ec_out, int someParam);

looks perfectly fine to me.

If you really want to change it, then you've got a couple of options:

  1. Move doWork so that it's a member of ExpensiveCopy (which you say you can't do, so that's out).
  2. Return a (smart) pointer from doWork instead of copying the object (which you don't want to do, as you want to keep things on the stack).
  3. Rely on RVO (which others have pointed out is supported by pretty much all modern compilers).
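For what it's worth, options 2 and 3 could look roughly like the sketch below. This is only an illustration: the ExpensiveCopy here is a made-up stand-in (the real interface isn't shown in full), doWorkOnHeap and doWorkByValue are hypothetical names, and std::unique_ptr assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the OP's class; the real one is not shown in full.
class ExpensiveCopy {
public:
    void addToData(int value) { data_.push_back(value); }
    std::size_t size() const { return data_.size(); }
private:
    std::vector<int> data_;
};

// Option 2 (assumed name): build the object on the heap and hand back
// ownership. Only the pointer travels; the object itself is never copied.
std::unique_ptr<ExpensiveCopy> doWorkOnHeap(int someParam) {
    std::unique_ptr<ExpensiveCopy> result(new ExpensiveCopy);
    for (int i = 0; i < someParam; ++i) {
        result->addToData(i);
    }
    return result;
}

// Option 3 (assumed name): return by value and rely on the compiler
// eliding the copy of the named local where it can.
ExpensiveCopy doWorkByValue(int someParam) {
    ExpensiveCopy result;
    for (int i = 0; i < someParam; ++i) {
        result.addToData(i);
    }
    return result;
}
```

Calling code would then read `ExpensiveCopy theResult = doWorkByValue(SOME_PARAM);`, which is the shape the question asks for.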
Glen
Agreed. Only one thing, option 1 that you mention is not available, since the OP says that `ExpensiveCopy` cannot be modified.
stakx
@stakx, you're right, I missed that. Will fix that
Glen
@stakx: I know the OP mentioned that he can't modify the class. But why can't he subclass it and add the doWork() method there?
slebetman
+1  A: 

Unless you are going down the "everything is immutable" route, which doesn't sit too well with C++, you cannot easily avoid out parameters. The C++ Standard Library uses them, and what's good enough for it is good enough for me.

anon
I totally disagree with the last statement: unfortunately the C++ Standard Library is nowhere near ideal (just off the top of my head: std::string members madness, over-elaborated streams, etc.). I don't want to start a holy war, but I guess it's not the best idea to justify something just because a well-known example does it.
Alexander Poluektov
Of course it has its warts, but in general I find it extremely powerful and easy to use - I only wish my own code was as well designed.
anon
A: 

As to your first example: return value optimization will often allow the returned object to be created directly in-place, instead of having to copy the object around. All modern compilers do this.

Thomas
+3  A: 

Every useful compiler does RVO (return value optimization) if optimizations are enabled, thus the following effectively doesn't result in copying:

Expensive work() {
    // ... no branched returns here
    return Expensive(foo);
}

Expensive e = work();

In some cases compilers can apply NRVO, named return value optimization, as well:

Expensive work() {
    Expensive e; // named object
    // ... no branched returns here
    return e; // return named object
}

This however isn't exactly reliable, only works in more trivial cases and would have to be tested. If you're not up to testing every case, just use out-parameters with references in the second case.
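One way to do that testing is a small probe type that counts its own copies. The sketch below is only an illustration: CopyProbe, makeUnnamed, makeNamed and copiesDuring are made-up names, and the counts you see depend on your compiler, language standard and flags.

```cpp
// Hypothetical probe type: every copy construction or copy assignment
// bumps a counter, so elided copies show up as a count of zero.
struct CopyProbe {
    static int copies;
    CopyProbe() {}
    CopyProbe(const CopyProbe&) { ++copies; }
    CopyProbe& operator=(const CopyProbe&) { ++copies; return *this; }
};
int CopyProbe::copies = 0;

// RVO candidate: returns an unnamed temporary.
CopyProbe makeUnnamed() {
    return CopyProbe();
}

// NRVO candidate: returns a named local object.
CopyProbe makeNamed() {
    CopyProbe p;
    return p;
}

// Resets the counter, calls the factory once, and reports how many
// copies were actually made while initializing the result.
int copiesDuring(CopyProbe (*factory)()) {
    CopyProbe::copies = 0;
    CopyProbe result = factory();
    (void)result;  // silence unused-variable warnings
    return CopyProbe::copies;
}
```

If `copiesDuring(makeNamed)` comes back non-zero for the build configuration you care about, NRVO did not kick in for that case.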

Georg Fritzsche
+2  A: 

IMO the first thing you should ask yourself is whether copying ExpensiveCopy really is so prohibitively expensive. And to answer that, you will usually need a profiler. Unless a profiler tells you that the copying really is a bottleneck, simply write the code that's easier to read: ExpensiveCopy obj = doWork(param);.

Of course, there are indeed cases where objects cannot be copied for performance or other reasons. Then Neil's answer applies.

sbi
A: 

What platform are you working on?

The reason I ask is that many people have suggested Return Value Optimization, which is a very handy compiler optimization present in almost every compiler. Additionally Microsoft and Intel implement what they call Named Return Value Optimization which is even more handy.

In standard Return Value Optimization your return statement is a call to an object's constructor, which tells the compiler to eliminate the temporary values (not necessarily the copy operation).

In Named Return Value Optimization you can return a value by its name and the compiler will do the same thing. The advantage to NRVO is that you can do more complex operations on the created value (like calling functions on it) before returning it.

While neither of these really eliminates an expensive copy if your returned data is very large, they do help.

In terms of avoiding the copy the only real way to do that is with pointers or references because your function needs to be modifying the data in the place you want it to end up in. That means you probably want to have a pass-by-reference parameter.

Also I figure I should point out that pass-by-reference is very common in high-performance code for specifically this reason. Copying data can be incredibly expensive, and it is often something people overlook when optimizing their code.

Chris
+2  A: 

In addition to all the comments here, I'd mention that in C++0x you'd rarely use an output parameter for optimization purposes, because of move constructors.
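A minimal sketch of the idea, assuming a compiler with rvalue-reference support. Buffer and pick are made-up names for illustration, and the counters are only there so the effect can be observed.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical type with an expensive copy and a cheap move.
class Buffer {
public:
    static int copies;
    static int moves;

    explicit Buffer(std::size_t n) : data_(n, 0) {}
    // Copying duplicates the whole vector.
    Buffer(const Buffer& other) : data_(other.data_) { ++copies; }
    // Moving just steals the vector's storage.
    Buffer(Buffer&& other) noexcept : data_(std::move(other.data_)) { ++moves; }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<int> data_;
};
int Buffer::copies = 0;
int Buffer::moves = 0;

// Returning one of two named locals from different branches is a case
// where compilers generally cannot apply NRVO. The returned local is
// still treated as an rvalue, so the move constructor is used instead
// of the copy constructor.
Buffer pick(bool big) {
    Buffer small(10);
    Buffer large(1000000);
    if (big) {
        return large;
    }
    return small;
}
```

With this, `Buffer b = pick(true);` leaves `Buffer::copies` at zero even when the copy cannot be elided, which is why the out parameter stops being necessary as an optimization.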

Alexander Poluektov
I actually recently tried to set up a system for supporting file reads that used move semantics once. Worked great, but I got burned in the end due to incompatibilities with different compilers. Had to change it all to out parameters which was a real bummer.
FP
A: 

As far as I can see, the reasons to prefer return values to out parameters are that it's clearer, and it works with pure functional programming (you can get some nice guarantees if a function depends only on input parameters, returns a value, and has no side effects). The first reason is stylistic, and in my opinion not all that important. The second isn't a good fit with C++. Therefore, I wouldn't try to distort anything to avoid out parameters.

The simple fact is that some functions have to return multiple things, and in most languages this suggests out parameters. Common Lisp has multiple-value-bind and values: the bind supplies a list of symbols, and values returns the values that get bound to them. In some cases, a function can return a composite value, such as a list of values which will then get deconstructed, and it isn't a big deal for a C++ function to return a std::pair. Returning more than two values this way in C++ gets awkward. It's always possible to define a struct, but defining and creating it will often be messier than out parameters.
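To make the comparison concrete, here is a rough sketch of both approaches. The names divide, Summary and summarize are invented for illustration and are not from any library.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Two results fit comfortably in a std::pair: first is the quotient,
// second is the remainder. The caller must pass a non-zero divisor.
std::pair<int, int> divide(int dividend, int divisor) {
    return std::make_pair(dividend / divisor, dividend % divisor);
}

// Three results need a struct (or three out parameters).
struct Summary {
    int lowest;
    int highest;
    double mean;
};

// Summarizes a non-empty vector; the caller must not pass an empty one.
Summary summarize(const std::vector<int>& values) {
    Summary s = { values[0], values[0], 0.0 };
    long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < s.lowest) s.lowest = values[i];
        if (values[i] > s.highest) s.highest = values[i];
        total += values[i];
    }
    s.mean = static_cast<double>(total) / values.size();
    return s;
}
```

The pair is quick to write but the caller has to remember what first and second mean; the struct names its fields at the cost of an extra type definition.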

In some cases, the return value gets overloaded. In C, getchar() returns an int, with the idea being that there are more int values than char (true in all implementations I know of, false in some I can easily imagine), so one of the values can be used to denote end-of-file. atoi() returns an integer, either the integer represented by the string it's passed or zero if there is none, so it returns the same thing for "0" and "frog". (If you want to know whether there was an int value or not, use strtol(), which does have an out parameter.)
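As a sketch of how strtol()'s out parameter separates "0" from "frog", something like the following would do. parseLong is a made-up helper name; std::strtol, errno and ERANGE are the standard facilities.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: returns true and stores the number in value_out
// only if the whole string is a valid base-10 integer that fits in a long.
bool parseLong(const char* text, long& value_out) {
    char* end = 0;  // strtol's out parameter: where parsing stopped
    errno = 0;
    const long parsed = std::strtol(text, &end, 10);

    if (end == text) return false;       // no digits consumed at all
    if (*end != '\0') return false;      // trailing junk after the number
    if (errno == ERANGE) return false;   // value out of range for long

    value_out = parsed;
    return true;
}
```

Here `parseLong("0", v)` succeeds with v set to 0, while `parseLong("frog", v)` fails and leaves v untouched, whereas atoi() gives 0 for both inputs.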

There's always the technique of throwing an exception in case of an error, but not all multiple return values are errors, and not all errors are exceptional.

So, overloaded return values cause problems, multiple value returns aren't easy to use in all languages, and single returns don't always exist. Throwing an exception is often inappropriate. Using out parameters is very often the cleanest solution.

David Thornley
A: 

Ask yourself why you have some method that performs work on this expensive-to-copy object in the first place. Say you have a tree: would you send the tree off into some building method, or would you give the tree its own building method? Situations like this come up constantly when your design is a little bit off, but they tend to fold into themselves once you have it down pat.

I know that in practice we don't always get to change every object, but passing in out parameters is a side-effect operation, it makes it much harder to figure out what's going on, and you never really have to do it (except as forced by working within others' code frameworks).

Sometimes it is easier, but it's definitely not desirable to use it for no reason (if you've suffered through a few large projects where there's always half a dozen out parameters you'll know what I mean).

Charles Eli Cheese