views:

446

answers:

6

I recently read (and unfortunately forgot where), that the best way to write operator= is like this:

foo &operator=(foo other)
{
    swap(*this, other);
    return *this;
}

instead of this:

foo &operator=(const foo &other)
{
    foo copy(other);
    swap(*this, copy);
    return *this;
}

The idea is that if operator= is called with an rvalue, the first version can optimize away construction of a copy. So when called with a rvalue, the first version is faster and when called with an lvalue the two are equivalent.

I'm curious as to what other people think about this? Would people avoid the first version because of lack of explicitness? Am I correct that the first version can be better and can never be worse?

A: 

I would in general prefer the second, as it is more common and doesn't require a double-take. You are correct that the first version is never worse, but there is no reason that it must be better. Specifically, they may well perform exactly the same - only a single copy is made in either case, and the copy in the first case cannot be optimised away easily (or at all IMHO).

anon
Um, wouldn't a compiler have more opportunities to optimize the first one when the thing passed in is an rvalue? In that case it could actually eliminate the cctor call. It might be harder for it to do this in the second version.
sbi
My point is that the logic of swap seems to me to require a copy (unless swap does nothing, of course).
anon
These downvotes seem a bit over the top - we are discussing matters of opinion on code clarity, and on what compilers MIGHT do.
anon
@Neil - you are right that it doesn't have to be better. I wrote that accidentally when I should have written "it can be better." Updated the wording on my question.
R Samuel Klatchko
@Martin: While I certainly don't think Neil's answer is worthy of a downvote there are situations where the number of calls to copy constructors are different between the two. While the second explicitly performs a copy in the function body, the first merely requires a value parameter. Where the value is manufactured by the calling code, purely for the passing to the copy-assignment operator it can be 'manufactured' in the space for the parameter without a copy. It's certainly a legal optimization whereas optimizing out the copy for the other version may not be legal.
Charles Bailey
+1: I am with Neil. The code clarity of the second version wins it for me.
Martin York
@Martin - this is not about RVO. This is about parameter copy-elision (when the parameter is pass-by-value and is called with an rvalue).
R Samuel Klatchko
@Neil - both call swap, so there is no difference there (I'm discussing the copy-elision of the parameter). Anyway, you can specialize std::swap to avoid copies.
R Samuel Klatchko
I know both call swap. The swap idiom for assignment works by discarding the values of one of the things it swaps (otherwise operator = would perform a swap, rather than an assignment). Ergo, one of the things being swapped must be a copy.
anon
... or something that was going to be discarded anyway.
Charles Bailey
@Charles In other words, a temporary copy.
anon
@Neil - consider the case of calling operator= with an rvalue. The compile can use it's knowledge that this is an rvalue and reuse the rvalue rather then copy constructing it. This is not just theoretical as I've confirmed that gcc 4.2.1 does this optimization by default even with optimization turned off (-O0).
R Samuel Klatchko
@Neil. A temporary, not necessarily a copy. E.g. in `a = b + c`, `b + c` is a temporary, it need not be copied to be validly "swapped" with in a *cough* and swap idiom assignment operator.
Charles Bailey
A: 

I think that you might be confusing the difference between:

foo &operator=(const foo &other); and
const foo &operator=(const foo &other);

The first form should be used to allow for: (a = b) = c;

doron
I don't think he is confusing those - his question makes perfect sense.
anon
A: 

These two are actually the same. The only difference is where you press "Step In" in a debugger. And you should know in advance where to do it.

alemjerus
+2  A: 

I generally prefer the second one from readability and 'least surprise' point of view, however I do acknowledge that the first one can be more efficient when the parameter is a temporary.

The first one really can lead to no copies, not just the single copy and it's conceivable that this may be a genuine concern in extreme situations.

E.g. Take this test program. gcc -O3 -S (gcc version 4.4.2 20091222 (Red Hat 4.4.2-20) (GCC)) generates one call to B's copy constructor but no calls to A's copy constructor for the function f (the assignment operator is inlined for both A and B). A and B can both be taken to be very basic string classes. Allocation and copying for data would occur in the constructors and deallocation in the destructor.

#include <algorithm>

class A
{
public:
    explicit A(const char*);
    A& operator=(A val)      { swap(val); return *this; }
    void swap(A& other)      { std::swap(data, other.data); }
    A(const A&);
    ~A();

private:
    const char* data;
};

class B
{
public:
    explicit B(const char*);
    B& operator=(const B& val)  { B tmp(val); swap(tmp); return *this; }
    void swap(B& other)         { std::swap(data, other.data); }
    B(const B&);
    ~B();

private:
    const char* data;
};

void f(A& a, B& b)
{
    a = A("Hello");
    b = B("World");
}
Charles Bailey
It's noticeable that neither class has any data members. Also, could you edit your example to minimise vertical white space somewhat?
anon
Charles Bailey
Charles Bailey
Change your class into a simple string class (constructor, copy constructor, op= and destructor only) that uses to a dynamic array of char (only about 15 lines of code). I would do this myself, but frankly have had one too many glasses of wine. I think you will find find that copies cannot be so easily elided.
anon
@Neil: I've updated the example so that A and B can be taken to be very basic string classes. The results are unchanged in terms of the copy constructor counts but I think it makes the point a bit better.
Charles Bailey
A: 

Given this

foo &foo::operator=(foo other) {/*...*/ return *this;}
foo f();

in code like this

foo bar;
bar = f();

it might be easier for a compiler to eliminate the call to the copy constructor. With RVO, it could use the address of the operator's other parameter as the place for f() to construct its return value at.

It seems that this optimization is also possible for the second case, although I believe it might be harder. (Especially when the operator isn't inlined.)

sbi
+4  A: 

You probably read it from: http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

I don't have much to say since I think the link explains the rationale pretty well. Anecdotally I can confirm that the first form results in fewer copies in my builds with MSVC, which makes sense since compilers might not be able to do copy-elision on the second form. I agree that the first form is a strict improvement and is never worse than the second.

Edit: The first form might be a bit less idiomatic, but I don't think it's much less clear. (IMO, it's not any more surprising than seeing the copy-and-swap implementation of the assignment operator for the first time.)

Edit #2: Oops, I meant to write copy-elision, not RVO.

jamesdlin
I disagree with the first being idiomatic. This is the less common version. I think if you work for a big company that does lots of code review you will find yourself being sent back to do in the normal way. Normal is better for maintenance as people don't need to do a double take to work out what is happening. I think the first version is just a nice party trick to show how cool you are.
Martin York
I didn't quite say that the first form was idiomatic (and I did say that the second form was more so). But idioms have to start somewhere, and I know the second form made me do a double take the first time I saw it.
jamesdlin
@Martin: as Herb Sutter has said, when convention conflicts with good practice, we have to ask ourselves whether we should discard convention in favour of good practice, or discard good practice in favour of convention.
Steve Jessop
The first version isn't *strictly* better. As the article explains, in cases where operator= is not inlined, it's likely to result in the copy code (or the call to it) being emitted at each operator= call site, rather than being emitted once in the code for operator=. For a given class that's unlikely to result in significant code bloat, but it's possible.
Steve Jessop