views:

235

answers:

6

I have general question regarding the use of pointers vs. references in this particular scenario.

Let's say that I have a function that is going to do some computation and store the value inside an object for later use by the caller. I can implement this by using either pointers or references.

Although, I would prefer using references because I trying avoiding pointers as much as possible, are there any pros/cons of one approach over the other.

The code using Pointers would be as follows:

Node*& computeNode() {  
  // Do some computation before creating a node object.  
  Node* newNode = new Node;  
  newNode->member1 = xyz;  
  newNode->member2 = abc;  
  // and so on ...  
  return newNode;  
}

The code using references could do something like this:

void computeNode(Node& newNode) {  
   // Do some computation before assigning values to the node object.  
   newNode.member1 = xyz;  
   newNode.member2 = abc;  
   // and so on.  
}

The differences that I can see are as follows:

  1. When using the pointer method, the newNode object is allocated on the Heap. So, unless I call delete on it, it is not going to get deleted. However, in the reference method, whether newNode is allocated on the Heap/Stack depends on what the caller did to create the newNode object.

  2. Whenever we use references, the number of arguments needed to pass to the function increases by at least 1. This is fine, only I find it a bit counter-intuitive to pass the return object also to a function call unless I name the function in such a way that it becomes obvious to the API user.

  3. By using references, I can simulate the return of multiple objects. In the pointer method, I think I will have to wrap all the objects in another structure (like a pair class) and then return it. That increases the overhead.

However, I do not know if usually one is preferred over the other. And if there are any function naming conventions in C++ that let the developer know that he is supposed to pass the return object also as an argument.

A: 

The second approach is probably preferable because there is no possibility of a memory leak, in the event you forget to delete the returned pointer.

It's usually good practice to code in such a way that each function or object which allocates heap memory also deallocates that memory. Your first example violates that practice, making it the function caller's responsibility to deallocate the memory. This makes memory leaks more likely, because now every time the function is called there is another opportunity to forget to delete the returned pointer.

You may also want to consider returning the object by value (which will return a copy of the object) in cases where the size of the object is not that large. Even though this will require a copy to be created, if the object is not so large it won't impact performance. (This method will become a lot more attractive in the future with C++0x move semantics.)

Charles Salvia
Returning an `auto_ptr` would not cause any memory leak either. The only problem with allocating memory on the heap is to clearly define the owner of that memory... and enforce it using the language rather than relying on (unread) comments.
Matthieu M.
A: 

I think your first option should be returning by value (or perhaps make the constructor compute the members?):

Node computeNode()
{
    Node n;
    n.x = abc;
    n.y = xyz;
    return n;
}

This may look inefficient, but it is quite possible that copying is elided with NRVO.

If the Node needs to be dynamically allocated anyway, you should return the pointer by value (a copy of the pointer):

Node* computeNode();

Otherwise you will be returning a reference to a local variable (pointer).

UncleBens
But does not return by value call the copy constructor?Unless I override my copy constructor to do deep copying, the return value might be wrong? Isn't it?
ajay
@ajay: yes, but you should make proper copy constructors anyway
Charles Salvia
@ajay: If the return value is wrong because of copying, then your class (more specifically, its copy constructor) is broken and you must fix it.
Konrad Rudolph
Or disable it, in which case this won't be an option.
UncleBens
+1  A: 

I prefer using the second approach to send back information (as you said, allows for multiple "returns" without using an extra structure) and generally return an error or a success code.

Also, I set the purely input arguments as const & to distinguish between the input and output variables.

Jacob
Output arguments are always ugly. Why use them needlessly? Error codes should usually be signalled using an exception, or, in some occasions, by returning a pair (such as in `std::map::insert`).
Konrad Rudolph
True, but it's just more consistent when you have several functions with some of them requiring more than one return. Then, it's kinda haphazard - I like enforcing some sort of consistency so that the user can always expect an error code and the required data as an argument.
Jacob
+5  A: 

You could try returning an auto_ptr or shared_ptr. That would eliminate the issues with delete.

KitsuneYMG
`auto_ptr` first, only ever use a `shared_ptr` if you need a shared resource.
Matthieu M.
+1  A: 

If the alternatives are really as given, it’s not clear why you need a reference/pointer at all; you could also just return by value:

Node computeNode() {
    // Do some computation before creating a node object.
    Node newNode;
    newNode.member1 = xyz;
    newNode.member2 = abc;
    return newNode;
}

Despite what many people think, this isn’t actually very inefficient because the compiler can (and will!) elide most of the unnecessary copies.

Semantically, this is the solution that you want, unless the node gets stored somewhere else as well and you need to preserve reference identity.

Konrad Rudolph
A: 

You can return by value and avoid copies in some situations, by using const references like this :

    Node computeNode() {
// Do some computation before creating a node object.
Node newNode;
newNode.member1 = xyz;
newNode.member2 = abc;
return newNode;
}

const Node &n = computeNode();

The lifetime of the temporary object in computeNode is extended upto the scope of the reference n

rep_movsd