Hi, I'm a TA for an intro C++ class. The following question was asked on a test last week:

What is the output from the following program:

#include <iostream>
using namespace std;

int myFunc(int &x) {
   int temp = x * x * x;
   x += 1;
   return temp;
}

int main() {
   int x = 2;
   cout << myFunc(x) << endl << myFunc(x) << endl << myFunc(x) << endl;
}

The answer, to me and all my colleagues, is obviously:

8
27
64

But now several students have pointed out that when they run this in certain environments they actually get the opposite:

64
27
8

When I run it in my Linux environment using GCC, I get what I would expect. Using MinGW on my Windows machine, I get what they're describing. It seems to evaluate the last call to myFunc first, then the second call and then the first; once it has all the results, it outputs them in the normal order, starting with the first. Because the calls were made in reverse order, the numbers come out reversed.

It seems to me to be a compiler optimization, choosing to evaluate the function calls in the opposite order, but I don't really know why. My question is: are my assumptions correct? Is that what's going on in the background? Or is there something totally different? Also, I don't really understand why there would be a benefit to evaluating the functions backwards and then evaluating output forward. Output would have to be forward because of the way ostream works, but it seems like evaluation of the functions should be forward as well.

Thanks for your help!

+2  A: 

Yeah, the order of evaluation of function arguments is "unspecified" according to the Standard.

Hence the output differs between platforms.
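
To see this for yourself, here is a minimal sketch (the names `first`, `second`, `add` and `observe_order` are hypothetical, made up purely for illustration) that records the order in which the two arguments of a single call get evaluated:

```cpp
#include <string>
#include <vector>

// Hypothetical log of which argument expression ran first.
std::vector<std::string> calls;

int first()  { calls.push_back("first");  return 1; }
int second() { calls.push_back("second"); return 2; }

int add(int a, int b) { return a + b; }

// Calls add(first(), second()) and reports the evaluation order that
// this particular compiler happened to choose.
std::vector<std::string> observe_order() {
    calls.clear();
    int sum = add(first(), second());
    (void)sum;  // the sum is 3 either way; only the log differs
    return calls;
}
```

One compiler may report first, second and another second, first. Both are conforming, and both arguments are always evaluated before `add` itself runs.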

Prasoon Saurav
Similarly, std::cout<<x<<++x<<x++; is Unspecified Behaviour. Note: it does not invoke Undefined Behaviour.
Prasoon Saurav
@Prasoon, no, in that case it *does* invoke undefined behavior, because each change of `x` is no longer separated from every other change of `x` by a sequence point.
Johannes Schaub - litb
@litb: Nice observation!
ltcmelo
It is not UB because "<<" is an overloaded operator (function call) and x, x++, ++x are the arguments. In that case, to be precise, it is Unspecified Behaviour because you can't predict which argument will be evaluated first. A function call (return) is indeed a sequence point.
Prasoon Saurav
It is undefined. Looking at the subexpressions, we have `x++`, `++x`, `std::cout<<x`, `(std::cout<<x)<<++x`, `that_thing<<x++`. So a valid order of execution is first to execute ++x, then x++, then the various calls to `operator<<`, which must be in order. In this valid order of execution there is no sequence point between x++ and ++x, therefore the whole expression has undefined behavior.
Steve Jessop
@Prasoon, it's both unspecified behavior and undefined behavior. But if you have both, then obviously you have undefined behavior. The order of evaluation of arguments to a function call is unspecified. But it's undefined to modify something twice without an intervening sequence point. (between two function calls in an expression, there is no sequence point either, but the standard notes that function calls may not interleave, which effectively sequences function calls, even if the order in the sequence is unspecified).
Johannes Schaub - litb
Sequence points have the downside that they don't inherently specify an order - they just split an evaluation into two non-interleaved pieces. In C++0x, sequence points are gone, and instead we have "sequenced before" and "indeterminately sequenced" (which just means that they don't interleave, I think). Function call executions, for example, are "indeterminately sequenced", while evaluation of the left operand of the comma operator is "sequenced before" the right operand.
Johannes Schaub - litb
+13  A: 

The C++ standard does not define what order the subexpressions of a full expression are evaluated, except for certain operators which introduce an order (the comma operator, ternary operator, short-circuiting logical operators), and the fact that the expressions which make up the arguments/operands of a function/operator are all evaluated before the function/operator itself.

GCC is not obliged to explain to you (or me) why it wants to order them as it does. It might be a performance optimisation, it might be because the compiler code came out a few lines shorter and simpler that way, it might be because one of the MinGW coders personally hates you, and wants to ensure that if you make assumptions that aren't guaranteed by the standard, your code goes wrong. Welcome to the world of open standards :-)

Edit to add: litb makes a point below about (un)defined behavior. The standard says that if you modify a variable multiple times in an expression, and if there exists a valid order of evaluation for that expression, such that the variable is modified multiple times without a sequence point in between, then the expression has undefined behavior. That doesn't apply here, because the variable is modified in the call to the function, and there's a sequence point at the start of any function call (even if the compiler inlines it). However, if you'd manually inlined the code:

std::cout << pow(x++,3) << endl << pow(x++,3) << endl << pow(x++,3) << endl;

Then that would be undefined behavior. In this code, it is valid for the compiler to evaluate all three "x++" subexpressions, then the three calls to pow, then start on the various calls to operator<<. Because this order is valid and has no sequence points separating the modification of x, the results are completely undefined. In your code snippet, only the order of execution is unspecified.
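
For comparison, here is a sketch of a rewrite that is fully defined, assuming you put each increment in its own statement. `cube` and `three_cubes` are hypothetical helpers, and `cube` uses integer multiplication rather than `pow`:

```cpp
#include <vector>

// Hypothetical helper: integer cube, standing in for pow(n, 3).
int cube(int n) { return n * n * n; }

// Every read and every increment of x sits in its own statement, so the
// end of each statement separates one modification of x from the next.
std::vector<int> three_cubes(int& x) {
    std::vector<int> out;
    out.push_back(cube(x));
    ++x;
    out.push_back(cube(x));
    ++x;
    out.push_back(cube(x));
    ++x;
    return out;
}
```

Starting from x = 2, this yields 8, 27, 64 and leaves x at 5 on any conforming compiler, because there is no longer any choice of evaluation order left to the implementation.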

Steve Jessop
Thanks for the explanation, I figured it was an instance of an undefined spec, but now I can be sure of it. And it can stand as an example to the students of why side effects and overly complicated nesting are bad :)
Eric Lifka
+1. And in particular, note that the program doesn't exhibit undefined behavior. `x` is still only changed at most once between two sequence points, because function call executions cannot interleave with each other. Its behavior is unspecified, and after the cout expression statement the value of `x` must be 5.
Johannes Schaub - litb
@onebyone: Very funny the "hate" part... :)
ltcmelo
+1  A: 

The order in which function call arguments are evaluated is unspecified. In short, you shouldn't use arguments that have side-effects that affect the meaning and result of the statement.

UncleBens
You'll also need to explain why (and how) `<<` is actually a function call - otherwise the answer may be hard to understand :)
Pavel Minaev
A: 

As has already been stated, you've wandered into the haunted forest of undefined behavior. To get what is expected every time you can either remove the side effects:

int myFunc(int &x) {
   int temp = x * x * x;
   return temp;
}

int main() {
   int x = 2;
   cout << myFunc(x) << endl << myFunc(x+1) << endl << myFunc(x+2) << endl;
   //Note that you can't use the increment operator (++) here.  It has
   //side-effects so it will have the same problem
}

or break the function calls up into separate statements:

int myFunc(int &x) {
   int temp = x * x * x;
   x += 1;
   return temp;
}

int main() {
   int x = 2;
   cout << myFunc(x) << endl;
   cout << myFunc(x) << endl;
   cout << myFunc(x) << endl;
}

The second version is probably better for a test, since it forces them to consider the side effects.
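
A sketch of a side-effect-free variant of the first approach, assuming the argument is taken by value so that expressions like `x+1` can be passed directly. `cubeOf` and `cubes_from` are hypothetical names, not part of the original question:

```cpp
#include <vector>

// Assumption: pass by value, so the function cannot modify the caller's x.
int cubeOf(int x) { return x * x * x; }

// The three calls do not affect each other, so the result is the same
// no matter which order they are evaluated in.
std::vector<int> cubes_from(int x) {
    std::vector<int> out;
    out.push_back(cubeOf(x));
    out.push_back(cubeOf(x + 1));
    out.push_back(cubeOf(x + 2));
    return out;
}
```

With x = 2 the results are 8, 27 and 64, and x itself is left unchanged.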

Graphics Noob
Yeah, I already mentioned to them that they could see the expected behavior by splitting up the output statement like that. I'd agree with not having a function with side effects, but as you noted the question is very specifically meant to test their knowledge of such side effects. That and as the TA I don't get to change the questions, just explain them :)
Eric Lifka
The first code "fix" won't lead to the expected behavior at all. The order of evaluation of the individual sub-expressions is still unspecified, and you have introduced an additional problem: you pass temporaries to a non-const reference.
Johannes Schaub - litb
Just a nit that has been mentioned in other answers: technically this is unspecified behavior, not undefined behavior. Undefined behavior means that anything can happen, including formatting your hard drive or starting up nethack. Unspecified means that it can evaluate the arguments in any order it likes, but they all must be evaluated before the function is called.
KeithB
@litb, the code most definitely will fix it because it makes the evaluation order irrelevant. The "fix" is making sure that the numbers are printed in the expected order, not controlling how the code gets compiled.
Graphics Noob
+7  A: 

Exactly why does this have unspecified behaviour?

When I first looked at this example I felt that the behaviour was well defined, because this expression is actually shorthand for a set of function calls.

Consider this more basic example:

cout << f1() << f2();

This is expanded to a sequence of function calls, where the kind of call depends on the operators being members or non-members:

// Option 1:  Both are members
cout.operator<<(f1 ()).operator<< (f2 ());

// Option 2: Both are non members
operator<< ( operator<<(cout, f1 ()), f2 () );

// Option 3: First is a member, second non-member
operator<< ( cout.operator<<(f1 ()), f2 () );

// Option 4: First is a non-member, second is a member
operator<<(cout, f1 ()).operator<< (f2 ());

At the lowest level these will generate almost identical code so I will refer only to the first option from now.

There is a guarantee in the standard that the compiler must evaluate the arguments to each function call before the body of the function is entered. In this case, cout.operator<<(f1()) must be evaluated before operator<<(f2()) is, since the result of cout.operator<<(f1()) is required to call the other operator.

The unspecified behaviour kicks in because although the calls to the operators must be ordered there is no such requirement on their arguments. Therefore, the resulting order can be one of:

f2()
f1()
cout.operator<<(f1())
cout.operator<<(f1()).operator<<(f2());

Or:

f1()
f2()
cout.operator<<(f1())
cout.operator<<(f1()).operator<<(f2());

Or finally:

f1()
cout.operator<<(f1())
f2()
cout.operator<<(f1()).operator<<(f2());
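
The orderings above can be observed with a small trace. This is only a sketch under my own assumptions: `Sink` is a hypothetical stand-in for the stream with a member `operator<<` (as in Option 1), and `trace`, `f1`, `f2` and `run_chain` are illustrative names:

```cpp
#include <string>
#include <vector>

// Hypothetical log of the order in which things actually happen.
std::vector<std::string> trace;

int f1() { trace.push_back("f1"); return 1; }
int f2() { trace.push_back("f2"); return 2; }

// Stand-in for std::ostream: a member operator<< that logs each call.
struct Sink {
    Sink& operator<<(int v) {
        trace.push_back("<<" + std::to_string(v));
        return *this;
    }
};

// Evaluates `sink << f1() << f2()` and returns the recorded order.
std::vector<std::string> run_chain() {
    trace.clear();
    Sink sink;
    sink << f1() << f2();
    return trace;
}
```

Whichever of the three orders your compiler picks, the entries `<<1` and `<<2` always appear in that order, and each one comes after the function that produced its argument. As far as I know, C++17 and later sequence the left operand of `<<` before the right one, which leaves only the last of the three orders.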
Richard Corden
Nice :) I was about to write something similar, but it took me a while to check what exactly guarantees that the various `operator<<` calls are done in sequence. As you stated it, it makes sense to me, since the implied object argument is then just seen as a normal function argument which is evaluated before the function-entry sequence point. I thought that argument just existed for the purpose of overloading, but it seems that it matters for these side-effect games too. +1 :)
Johannes Schaub - litb
A: 

And this is why, every time you write a function with a side-effect, God kills a kitten!

alex tingle