This is a follow-on question to http://stackoverflow.com/questions/2748866/c0x-rvalue-references-and-temporaries

In the previous question, I asked how this code should work:

void f(const std::string &); //less efficient
void f(std::string &&); //more efficient

void g(const char * arg)
{
    f(arg);
}

It seems that the move overload should probably be called because of the implicit temporary, and this happens in GCC but not MSVC (or the EDG front-end used in MSVC's Intellisense).

What about this code?

void f(std::string &&); //NB: No const string & overload supplied

void g1(const char * arg)
{
    f(arg);
}
void g2(const std::string & arg)
{
    f(arg);
}

Based on the answers to my previous question, it seems that function g1 is legal (it is accepted by GCC 4.3-4.5, but not by MSVC). However, GCC and MSVC both reject g2 because of clause 13.3.3.1.4/3, which prohibits lvalues from binding to rvalue reference parameters. I understand the rationale behind this; it is explained in N2831 "Fixing a safety problem with rvalue references". I also think that GCC is probably implementing this clause as intended by the authors of that paper, because the original patch to GCC was written by one of the authors (Doug Gregor).

However, I don't think this is quite intuitive. To me, (a) a const string & is conceptually closer to a string && than a const char * is, and (b) the compiler could create a temporary string in g2, as if it were written like this:

void g2(const std::string & arg)
{
    f(std::string(arg));
}

Indeed, sometimes the copy constructor is considered to be an implicit conversion operator. Syntactically, this is suggested by the form of a copy constructor, and the standard even mentions this specifically in clause 13.3.3.1.2/4, where the copy constructor for derived-base conversions is given a higher conversion rank than other user-defined conversions:

A conversion of an expression of class type to the same class type is given Exact Match rank, and a conversion of an expression of class type to a base class of that type is given Conversion rank, in spite of the fact that a copy/move constructor (i.e., a user-defined conversion function) is called for those cases.

(I assume this is used when passing a derived class to a function like void h(Base), which takes a base class by value.)
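
To illustrate how that ranking plays out in overload resolution, here is a minimal sketch. The names Base, Derived, Wrapper and h are hypothetical and exist only for this illustration; Wrapper stands in for any type that is reachable from Derived through an ordinary user-defined conversion.

```cpp
// Hypothetical types, for illustration only.
struct Base {};
struct Derived : Base {};

// Reachable from Derived only through a user-defined converting constructor.
struct Wrapper {
    constexpr Wrapper(const Derived&) {}
};

// Two by-value overloads; each returns a tag identifying itself.
constexpr int h(Base)    { return 1; }
constexpr int h(Wrapper) { return 2; }

// Derived -> Base is given Conversion rank (a standard conversion sequence),
// so it is preferred over the user-defined conversion Derived -> Wrapper,
// even though a Base copy/move constructor is what actually runs.
static_assert(h(Derived{}) == 1, "h(Base) is selected for a Derived argument");
```

If the derived-to-base copy were ranked as an ordinary user-defined conversion, the call would be ambiguous rather than resolving to h(Base).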

Motivation

My motivation for asking this is something like the question asked in http://stackoverflow.com/questions/2696156/how-to-reduce-redundant-code-when-adding-new-c0x-rvalue-reference-operator-over ("How to reduce redundant code when adding new c++0x rvalue reference operator overloads").

If you have a function that accepts a number of potentially-moveable arguments, and would move them if it can (e.g. a factory function/constructor: Object create_object(string, vector<string>, string) or the like), and want to move or copy each argument as appropriate, you quickly start writing a lot of code.

If the argument types are movable, then one could just write one version that accepts the arguments by value, as above. But if the arguments are (legacy) non-movable-but-swappable classes a la C++03, and you can't change them, then writing rvalue reference overloads is more efficient.

So if lvalues did bind to rvalues via an implicit copy, then you could write just one overload like create_object(legacy_string &&, legacy_vector<legacy_string> &&, legacy_string &&) and it would more or less work like providing all the combinations of rvalue/lvalue reference overloads - actual arguments that were lvalues would get copied and then bound to the arguments, actual arguments that were rvalues would get directly bound.

Clarification/edit: I realize this is virtually identical to accepting arguments by value for movable types, like C++0x std::string and std::vector (save for the number of times the move constructor is conceptually invoked). However, it is not identical for copyable, but non-movable types, which includes all C++03 classes with explicitly-defined copy constructors. Consider this example:

class legacy_string { public: legacy_string(const legacy_string &); }; //defined in a header somewhere; not modifiable.

void f(legacy_string s1, legacy_string s2); //A *new* (C++0x) function that wants to move from its arguments where possible, and avoid copying
void g() //A C++0x function as well
{
    legacy_string x(/*initialization*/);
    legacy_string y(/*initialization*/);

    f(std::move(x), std::move(y));
}

If g calls f, then x and y would be copied - I don't see how the compiler can move them. If f were instead declared as taking legacy_string && arguments, it could avoid those copies where the caller explicitly invoked std::move on the arguments. I don't see how these are equivalent.
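
A small sketch of the difference I mean, using a copy counter. Everything here is hypothetical: this legacy_string is a stand-in with a std::string member and a swap, and sink_by_value / sink_by_rref are made-up names for the two ways f could be declared.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a C++03 class: copyable and swappable, but with no
// move constructor (declaring the copy constructor suppresses the implicit one).
struct legacy_string {
    static int copies;                       // counts copy-constructor calls
    std::string data;
    explicit legacy_string(const char* s) : data(s) {}
    legacy_string(const legacy_string& other) : data(other.data) { ++copies; }
    void swap(legacy_string& other) { data.swap(other.data); }
};
int legacy_string::copies = 0;

// By value: even an xvalue argument has to go through the copy constructor.
void sink_by_value(legacy_string& dest, legacy_string s) { dest.swap(s); }

// By rvalue reference: binds directly to the caller's object, so no copy.
void sink_by_rref(legacy_string& dest, legacy_string&& s) { dest.swap(s); }

int copies_by_value() {
    legacy_string dest(""), x("payload");
    legacy_string::copies = 0;
    sink_by_value(dest, std::move(x));
    assert(dest.data == "payload");
    return legacy_string::copies;
}

int copies_by_rref() {
    legacy_string dest(""), x("payload");
    legacy_string::copies = 0;
    sink_by_rref(dest, std::move(x));
    assert(dest.data == "payload");
    return legacy_string::copies;
}
```

With this stand-in, copies_by_value() reports one copy and copies_by_rref() reports none, which is the gap I am trying to close without writing every overload combination.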

Questions

My questions are then:

  1. Is this a valid interpretation of the standard? It seems that it's not the conventional or intended one, at any rate.
  2. Does it make intuitive sense?
  3. Is there a problem with this idea that I'm not seeing? It seems like you could get copies being quietly created when that's not exactly expected, but that's the status quo in places in C++03 anyway. Also, it would make some overloads viable when they're currently not, but I don't see it being a problem in practice.
  4. Is this a significant enough improvement that it would be worth making e.g. an experimental patch for GCC?
+1  A: 

Note that the call with an lvalue of char const* to the std::string && candidate is ill-formed. See your other question for an answer.

I don't quite see your point in this question. If you have a class that is movable, then you just need a T version:

struct A {
  T t;
  A(T t):t(move(t)) { }
};

And if the class is traditional but has a swap you write the swap version

struct A {
  T t;
  A(T t) { swap(this->t, t); }
};

But for the latter case, I would rather go with copying from a const T& instead of that swap. The main advantage of the swap technique is exception safety. But what do you have to save if you are just constructing the object anyway? It's safe to throw here, so I would do

struct A {
  T t;
  A(T const& t):t(t) { }
};

To me, it seems disgusting to automatically convert a string lvalue to an rvalue copy of itself just to bind to an rvalue reference. An rvalue reference says it binds to rvalues. If you try binding it to an lvalue, it should just be ill-formed. After all, references should ideally be aliases. Introducing hidden copies to allow that doesn't sound right to me. Better to fail straight away so the user can fix the code.

Johannes Schaub - litb
Doug
Also, I see what you mean about references being aliases - but C++ already introduces hidden copies when binding to const lvalue ref arguments. I suppose I see a non-const rvalue reference argument as being similar to const lvalue reference arguments, which exhibit this "hidden copy" behavior, rather than non-const lvalue reference arguments.
Doug
The hidden copy when binding to a const lvalue reference is not done anymore, i.e. `istream const&` is well-formed in C++0x. And the hidden copy for `string const&` is essential for all the operator overloading to work. There is no similar pressing need to support `string&&`. If that would introduce a temporary string, it would go against the principle of least surprise to me.
Johannes Schaub - litb
Firstly, thank you very much for your answers, they're really helpful. Secondly, I gave an example which violates least surprise for me on your answer to my previous question: that is, if `vector<T>::push_back` were declared with rvalue- and const lvalue-reference overloads (though I note this is no longer required), calling `vector<string>::push_back` with a `const char *` lvalue makes a conceptually-superfluous copy, because it would bind to the version accepting a const lvalue reference. (Maybe this simply means that `push_back` should accept its argument by value.)
Doug
@Doug I agree with you that ideally, `push_back` would take by value. It's not done for `vector<T>`, because that would do two copies for all the non-movable lvalues. But for `vector<T>` the solution is to use `emplace_back` which does zero copies and zero moves.
Johannes Schaub - litb
Thanks - I'm rapidly coming to the conclusion that rvalue ref args are a lot less useful than I'd hoped. I hoped that they were like an "optimized" const lvalue ref argument where the caller gives permission for the callee to steal/move resources from the argument. That's apparently not what happens, though, but I can't really believe that this behavior was completely intentional.
Doug
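
The `push_back` versus `emplace_back` point from the comments above can be sketched with a counting element type. The type `counted` and the two helper functions are hypothetical, and the numbers assume a standard library that implements the `push_back(const T&)` / `push_back(T&&)` pair as published in C++11, which may differ from the drafts discussed in this thread.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type that counts how often it is copied or moved.
struct counted {
    static int copies;
    static int moves;
    std::string s;
    counted(const char* p) : s(p) {}                        // implicit on purpose
    counted(const counted& o) : s(o.s) { ++copies; }
    counted(counted&& o) noexcept : s(std::move(o.s)) { ++moves; }
};
int counted::copies = 0;
int counted::moves = 0;

// Copies plus moves performed by push_back when given a const char* lvalue.
int ops_for_push_back(const char* arg) {
    std::vector<counted> v;
    v.reserve(1);                       // rule out reallocation noise
    counted::copies = counted::moves = 0;
    v.push_back(arg);                   // temporary counted, then transferred into the vector
    assert(v.size() == 1 && v[0].s == arg);
    return counted::copies + counted::moves;
}

// Copies plus moves performed by emplace_back for the same argument.
int ops_for_emplace_back(const char* arg) {
    std::vector<counted> v;
    v.reserve(1);
    counted::copies = counted::moves = 0;
    v.emplace_back(arg);                // element constructed in place from arg
    assert(v.size() == 1 && v[0].s == arg);
    return counted::copies + counted::moves;
}
```

Under those assumptions `ops_for_push_back("abc")` performs a single move and no copy, while `ops_for_emplace_back("abc")` performs neither.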
+1  A: 

What about this code?

void f(std::string &&); //NB: No const string & overload supplied

void g2(const std::string & arg)
{
    f(arg);
}

...However, GCC and MSVC both reject g2 because of clause 13.3.3.1.4/3, which prohibits lvalues from binding to rvalue ref arguments. I understand the rationale behind this - it is explained in N2831 "Fixing a safety problem with rvalue references". I also think that GCC is probably implementing this clause as intended by the authors of that paper, because the original patch to GCC was written by one of the authors (Doug Gregor)....

No, that's only half of the reason why both compilers reject your code. The other reason is that you can't initialize a reference to non-const with an expression referring to a const object. So, even before N2831 this didn't work. There is simply no need for a conversion because a string is already a string. It seems you want to use string&& like string. Then, simply write your function f so that it takes a string by value. If you want the compiler to create a temporary copy of a const string lvalue just so you can invoke a function taking a string&&, there wouldn't be a difference between taking the string by value or by rref, would it?

N2831 has little to do with this scenario.
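
As a rough compile-time check of which argument kinds can initialize a string&& parameter, one can lean on std::is_convertible. This is only a sketch: binds_to_string_rref is a made-up helper name, and the results reflect the rules as published in C++11, which may not match every draft or compiler version discussed here.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: can an argument of type From initialize a
// parameter of type std::string&& through implicit conversion?
template<class From>
struct binds_to_string_rref : std::is_convertible<From, std::string&&> {};

// An rvalue string binds directly.
static_assert( binds_to_string_rref<std::string>::value,
              "rvalue string binds");

// A const char* lvalue is converted to a temporary string, which then binds.
static_assert( binds_to_string_rref<const char*&>::value,
              "const char* lvalue binds via a temporary");

// String lvalues are rejected: no temporary is created for a related type.
static_assert(!binds_to_string_rref<std::string&>::value,
              "non-const string lvalue is rejected");
static_assert(!binds_to_string_rref<const std::string&>::value,
              "const string lvalue is rejected");
```

The last assertion fails on two counts at once: the argument is an lvalue, and it is const while the reference is to non-const.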

If you have a function that accepts a number of potentially-moveable arguments, and would move them if it can (e.g. a factory function/constructor: Object create_object(string, vector<string>, string) or the like), and want to move or copy each argument as appropriate, you quickly start writing a lot of code.

Not really. Why would you want to write a lot of code? There is little reason to clutter all your code with const&/&& overloads. You can still use a single function with a mix of pass-by-value and pass-by-ref-to-const -- depending on what you want to do with the parameters. As for factories, the idea is to use perfect forwarding:

template<class T, class... Args>
unique_ptr<T> make_unique(Args&&... args)
{
    T* ptr = new T(std::forward<Args>(args)...);
    return unique_ptr<T>(ptr);
}

...and all is well. A special template argument deduction rule helps differentiate between lvalue and rvalue arguments, and std::forward allows you to create expressions with the same "value-ness" as the actual arguments had. So, if you write something like this:

string foo();

int main() {
   auto ups = make_unique<string>(foo());
}

the string that foo returned is automatically moved to the heap.
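
To see both sides of that, here is a small sketch with a type that records how it was constructed. The names tracked and make_unique_sketch are made up for this example; the factory has the same shape as the make_unique above and is only renamed so it cannot clash with a library-provided std::make_unique.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical type that remembers which constructor created it.
struct tracked {
    enum origin { fresh, copied, moved };
    origin how;
    tracked() : how(fresh) {}
    tracked(const tracked&) : how(copied) {}
    tracked(tracked&&) noexcept : how(moved) {}
};

// Same shape as the factory in the answer, under a different name.
template<class T, class... Args>
std::unique_ptr<T> make_unique_sketch(Args&&... args) {
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

// Lvalue argument: Args is deduced as tracked&, so the copy constructor runs.
tracked::origin from_lvalue() {
    tracked t;
    return make_unique_sketch<tracked>(t)->how;
}

// Rvalue argument: Args is deduced as tracked, so the move constructor runs.
tracked::origin from_rvalue() {
    return make_unique_sketch<tracked>(tracked())->how;
}

void demo() {
    assert(from_lvalue() == tracked::copied);
    assert(from_rvalue() == tracked::moved);
}
```

So a single forwarding template covers both cases without separate const& and && overloads.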

So if lvalues did bind to rvalues via an implicit copy, then you could write just one overload like create_object(legacy_string &&, legacy_vector<legacy_string> &&, legacy_string &&) and it would more or less work like providing all the combinations of rvalue/lvalue reference overloads...

Well, and it would be pretty much equivalent to a function taking the parameters by value. No kidding.

Is this a significant enough improvement that it would be worth making e.g. an experimental patch for GCC?

There's no improvement.

sellibitze
Ultimately, though, the factory function just forwards to e.g. a constructor, which still faces the same problem - to write lots of overloads, or to accept unnecessary copies. The constructor itself could be written as a template, but IMO changing every constructor that might want to do this into a template has other problems (including overloading ambiguities). Also, I agree that this would make rvalue refs similar to passing by value, *except* in the case where the type is a legacy type with no move constructor, and where you can't change that.
Doug
@Doug What "problem"? I have yet to see a real example.
sellibitze
My problem is writing a constructor of an object that takes e.g. 3 strings, that are copyable, not movable, but swappable. The constructor will assign args to member vars. The string class is written by a third party that isn't updating their libraries to C++0x - think Qt 3.x. I want to be able to write this constructor in a way that takes advantage of the rvalue-ness of its arguments to minimize copies (by swapping them in) where feasible. I don't want to make the constructor a template, and I don't want to write 8 overloads for what should be simple code. That is not currently possible.
Doug
@Doug Interesting. I think I had a similar train of thought while learning what rvalue references are about. But I came to the conclusion that in those cases pass-by-value should suffice. I just didn't anticipate the existence of "legacy types" with optimized swap but no move constructors. Actually, an older GCC version treated pass-by-value arguments like you want rrefs to behave. But someone filed a bug report ( http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36744 ) and this "feature" has been removed.
sellibitze
@Doug Also, good compilers do copy elision. Then, the only difference between pass-by-value and your proposed approach with rrefs is that with your approach you can avoid an unnecessary copy when the argument was an "xvalue" (see N3055). In all other cases, there won't be a difference due to copy elision.
sellibitze
@sellibitze: That's an interesting bug. I see that bug as essentially GCC trying to apply copy elision to a std::move'd but non-movable argument when it shouldn't. If `f` were instead declared with an rvalue ref argument, and the caller specifically moved `y` into `f`, then yes, I think `f` should modify `y`. In that case, the caller/callee have essentially agreed that the caller doesn't want `y` any more and so the callee can do what it pleases with it. I would not expect this when passing by value (which is what happens in the bug), though, for compatibility reasons.
Doug