tags:

views:

2302

answers:

11

Whilst refactoring some code I came across some getter methods that returns a std::string. Something like this for example:

class foo
{
private:
    std::string name_;
public:
    std::string name()
    {
        return name_;
    }
};

Surely the getter would be better returning a const std::string&? The current method is returning a copy which isn't as efficient. Would returning a const reference instead cause any problems?

+2  A: 

I'd change it to return const std::string&. The caller will probably make a copy of the result anyway if you don't change all the calling code, but it won't introduce any problems.

One potential wrinkle arises if you have multiple threads calling name(). If you return a reference, but then later change the underlying value, then the caller's value will change. But the existing code doesn't look thread-safe anyway.

Take a look at Dima's answer for a related potential-but-unlikely problem.

Kristopher Johnson
A: 

Depends what you need to do. Maybe you want to all the caller to change the returned value without changing the class. If you return the const reference that won't fly.

Of course, the next argument is that the caller could then make their own copy. But if you know how the function will be used and know that happens anyway, then maybe doing this saves you a step later in code.

Joel Coehoorn
I don't think that's a problem. A caller can just do "std::string s = aFoo.name()", and s will be a mutable copy.
Kristopher Johnson
+1  A: 

It is conceivable that you could break something if the caller really wanted a copy, because they were about to alter the original and wanted to preserve a copy of it. However it is far more likely that it should, indeed, just be returning a const reference.

The easiest thing to do is try it and then test it to see if it still works, provided that you have some sort of test you can run. If not, I'd focus on writing the test first, before continuing with refactoring.

Airsource Ltd
+12  A: 
Dima
Wouldn't break existing code, but wouldn't gain _any_ performance advantages. If problems were as easy to spot as in your example, that'd be fine, but it can often be much more tricky.. for example, if you have vector<foo>, and push_back, causing a resize, etc.. premature optimization, etc, etc.
tfinniga
Of course, you get a performance advantage. If you return the object, you pay the cost of creating an extra copy of the object, the temporary that is returned. You do not pay that cost if you return a reference.
Dima
@Dima: It is likely that most compilers will perform Named Value Return Value Optimisation in this case.
Richard Corden
@Richard: yes, I've read about that recently. But this only covers the case when you assign the returned value to something immediately, right? What if the returned value is passed to a function? Will the construction of the temporary still be optimized away?
Dima
+1  A: 

Odds are pretty good that typical usage of that function won't break if you change to a const reference.

If all of the code calling that function is under your control, just make the change and see if the compiler complains.

17 of 26
A: 

Since an std::string has copy-on-write semantics, there is not much to gain performance wise here. Returning const & is something I mostly try to avoid, especially when there is interface/implementation separation :

class IMyInterface
{
  public:
    virtual const string & GetDescription() const = 0 ;
};

I have now forced every derived class to have an instance of a string available.

What if I had an implementation like this :

string CreateGUIDAsString() ; // returns a GUID as a string

class CMyTestImpl : public IMyInterface
{
  private:
    virtual const string & GetDescription() const { return CreateGUIDAsString() ; }
};

This could not work because I'm returning a reference to a stack object.

QBziZ
Not all std::string has copy-on-write semantics, quite the contrary. It works quite badly on multithreaded code, and makes the implementation of operator[] and equivalent quite non-trivial, and source of bugs.
paercebal
Well, ok, I thought the standard defined the copy-on-write, apparently not. As for the rest of your comment : Works quite badly in the context of concurrency would then actually be a bug because it should, and non-trival implementation is not a problem a client of string needs to be worried about.
QBziZ
A: 

I normally return const& unless I can't. QBziZ gives an example of where this is the case. Of course QBziZ also claims that std::string has copy-on-write semantics which is rarely true today since COW involves a lot of overhead in a multi-threaded environment. By returning const & you put the onus on the caller to do the right thing with the string on their end. But since you are dealing with code that is already in use you probably shouldn't change it unless profiling shows that the copying of this string is causing massive performance problems. Then if you decide to change it you will need to test thouroughly to make sure you didn't break anything. Hopefully the other developers you work with don't do sketchy stuff like in Dima's answer.

Brett Hall
+1  A: 

Does it matter? As soon as you use a modern optimizing compiler, functions that return by value will not involve a copy unless they are semantically required to.

See the C++ lite FAQ on this.

christopher_f
Is return-by-value optimization so sophisticated that it would use a pointer/reference to the original member value i.o. of a copy? I don't think this is the case. Rbv optimization skips a few steps in returning a value but it would still call the copy ctor of cstring, albeit the in-place version.
QBziZ
+3  A: 

One problem for the const reference return would be if the user coded something like:

const std::string & str = myObject.getSomeString() ;

With a std::string return, the temporary object would remain alive and attached to str until str goes out of scope.

But what happens with a const std::string &? My guess is that we would have a const reference to an object that could die when its parent object deallocates it:

MyObject * myObject = new MyObject("My String") ;
const std::string & str = myObject->getSomeString() ;
delete myObject ;
// Use str... which references a destroyed object.

So my preference goes to the const reference return (because, anyway, I'm just more confortable with sending a reference than hoping the compiler will optimize the extra temporary), as long as the following contract is respected: "if you want it beyond my object's existence, they copy it before my object's destruction"

paercebal
+6  A: 

Some implementations of std::string share memory with copy-on-write semantics, so return-by-value can be almost as efficient as return-by-reference and you don't have to worry about the lifetime issues (the runtime does it for you).

If you're worried about performance, then benchmark it (<= can't stress that enough) !!! Try both approaches and measure the gain (or lack thereof). If one is better and you really care, then use it. If not, then prefer by-value for the protection it offers agains lifetime issues mentioned by other people.

You know what they say about making assumptions...

rlerallut
Instead you have to worry about thread safety when modifying unrelated strings :)
bk1e
Heh. What the runtime giveth, the multithreading taketh away. :)
rlerallut
Out of curiosity -- does g++ implement std::string with copy-on-write semantics?
+1  A: 

Okay, so the differences between returning a copy and returning the reference are:

  • Performance: Returning the reference may or may not be faster; it depends on how std::string is implemented by your compiler implementation (as others have pointed out). But even if you return the reference the assignment after the function call usually involves a copy, as in std::string name = obj.name();

  • Safety: Returning the reference may or may not cause problems (dangling reference). If the users of your function don't know what they are doing, storing the reference as reference and using it after the providing object goes out of scope then there's a problem.

If you want it fast and safe use boost::shared_ptr. Your object can internally store the string as shared_ptr and return a shared_ptr. That way, there will be no copying of the object going and and it's always safe (unless your users pull out the raw pointer with get() and do stuff with it after your object goes out of scope).