Today I came across an article by Eric Lippert in which he tries to clear up the confusion between operator precedence and order of evaluation. At the end there were two code snippets that confused me. Here is the first snippet:

      int[] arr = {0};
      int value = arr[arr[0]++];

Now when I think about the value of the variable value, I simply calculate it to be zero. Here's how I thought it works:

  1. First, declare arr as an array of int with one item in it; this item's value is 0.
  2. Second, get the value of arr[0] --0 in this case.
  3. Third, get the value of arr[the value from step 2] (which is still 0) --this reads arr[0] again --still 0.
  4. Fourth, assign the value from step 3 (0) to the variable value --value = 0 now.
  5. Add 1 to the value from step 2 --now arr[0] = 1.

Apparently this is wrong. I searched the C# spec for an explicit statement about when the increment actually happens, but didn't find one.
The second snippet is from a comment of Eric's blog post on the topic:

 int[] data = { 11, 22, 33 }; 
 int i = 1;
 data[i++] = data[i] + 5;

Now here's how I think this program will execute after declaring the array and assigning 1 to i (please bear with me):

  1. Get data[i], i.e. data[1] --22
  2. Add 5 to the value from step 1 --27
  3. Assign the value from step 2 to data[i] (i is still 1) --data[1] = 27
  4. Increment i --i = 2

According to my understanding, this array now should contain the values {11, 27, 33}. However, when I looped to print the array values I got: {11, 38, 33}. This means that the post increment happened before dereferencing the array!
How come? Isn't this post increment supposed to be "post", i.e. happen after everything else?
What am I missing?

+10  A: 

The postincrement operation occurs as part of evaluating the overall expression. It's a side effect which occurs after the value is evaluated but before any other expressions are evaluated.

In other words, for any expression E, E++ (if legal) represents something like (pseudo-code):

T tmp = E;
E += 1;
return tmp;

That's all part of evaluating E++, before anything else is evaluated.

See section 7.5.9 of the C# 3.0 spec for more details.
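
The pseudo-code above can be turned into a small runnable sketch. `PostIncrement` is a hypothetical helper written only for illustration (it is not something the compiler or the spec defines), and it only handles `int` variables. It is used here in place of `arr[0]++` from the first snippet in the question:

```csharp
using System;

static class PostIncrementSketch
{
    // Hypothetical helper mimicking E++ for an int variable:
    // remember the old value, store old value + 1, return the old value.
    static int PostIncrement(ref int e)
    {
        int tmp = e;
        e += 1;
        return tmp;
    }

    // Same shape as the first snippet, with arr[0]++ replaced by the helper.
    public static int FirstSnippet()
    {
        int[] arr = { 0 };
        // The helper returns 0 and leaves arr[0] == 1,
        // so the outer access reads arr[0] after the increment.
        return arr[PostIncrement(ref arr[0])];
    }

    static void Main()
    {
        Console.WriteLine(FirstSnippet()); // prints 1
    }
}
```

The side effect is finished by the time the helper returns, which is why the outer array access already sees the incremented element.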


Additionally, for assignment operations where the LHS is classified as a variable (as in this case), the LHS is evaluated before the RHS is evaluated.

So in your example:

int[] data = { 11, 22, 33 }; 
int i = 1;
data[i++] = data[i] + 5;

is equivalent to:

int[] data = { 11, 22, 33 }; 
int i = 1;
// Work out what the LHS is going to mean...
int index = i;
i++;
// We're going to assign to data[index], i.e. data[1]. Now i=2.

// Now evaluate the RHS
int rhs = data[i] + 5; // rhs = data[2] + 5 == 38

// Now assign:
data[index] = rhs;

The relevant bit of the specification for this is section 7.16.1 (C# 3.0 spec).

Jon Skeet
That, and expressions (including assignments) are evaluated left to right except for when operator precedence dictates otherwise. So data[i++] (the left hand side of the assignment) is evaluated before data[i] on the right hand side.
LBushkin
@LBushkin: I think I was editing for exactly that purpose while you were commenting :)
Jon Skeet
@LBushkin Wrong. Microsoft explicitly says that both the assignment (=) and the ternary (?:) operators are resolved from right to left.
Leahn Novash
"The postincrement operation occurs immediately after the expression it postincrements is evaluated" I'd like to put it this way: The postincrement expression has a value like all expressions. It also has a side effect (unlike most trivial expressions). You can think of it as something like a method call. The value a method returns is totally independent from the side effects it might have. Basically, at the time of evaluation, a postincrement operation performs a side effect while returning the original value of the operand as its value.
Mehrdad Afshari
@Novash: How could ternary operators be resolved from right to left? The left-most expression has to be solved before deciding which of the other two expressions will be resolved (the other expression is not even executed).
jpbochi
Section 7.16.1 deals with boolean types. Didn't you mean 7.13.1, that deals with assignment? Also, section 7.5.9 that deals with post ++ specifies that the value is only changed after the set call occurs.
Leahn Novash
@jpbochi It is considered right associative when it is stacked.
Leahn Novash
@Leahn: I think you must be looking at a different version of the spec to me. I'm looking at the C# 3.0 spec, where 7.16.1 is very definitely assignment... and the LHS of the assignment is evaluated first.
Jon Skeet
@Mehrdad: I'll edit a little, but not quite to that extent. @Leahn: The version of the C# spec you linked to is definitely old - it doesn't cover query expressions, for example.
Jon Skeet
@Jon Skeet: I blame Microsoft. :) However, as I said in another post, I stand corrected. I misunderstood what I read. The correct resolution order is: i = 1, find the address of data[1], add 1 to i, find the address of data[2], add 5 to the value of data[2], assign the result to data[1].
Leahn Novash
@Jon Skeet so for the first snippet, it will dereference arr[0] then store its value somewhere, then increment arr[0]. It would then use the stored temp value for evaluating the expression (the outer arr[0]) which will retrieve the incremented value. right, or am I missing something else?
Galilyou
@7alwagy: Yes, that's right (assuming I followed you correctly!)
Jon Skeet
Thanks for the awesome explanation Jon. Answer accepted
Galilyou
@Jon: There is no doubt that you are right. But why in the world would someone write code like this? Where is the idea of readability, or should I say understandability?
shahkalpesh
@shahkalpesh: Obviously these examples themselves are hideously unreadable - but I suspect there are other cases which look far more readable, but require the same details to be *fully* understood. These examples are good for demonstrating the order of evaluation.
Jon Skeet
@Jon: Thanks :)
shahkalpesh
+2  A: 
data[i++] // => data[1], then i is incremented to 2

data[1] = data[2] + 5 // => 33 + 5
Max
I disagree with you. The implementation says the assignment operator has the lowest priority and is resolved from right to left. The guy is correct in his assumptions, according to the language specs. http://msdn.microsoft.com/en-us/library/aa691323(VS.71).aspx
Leahn Novash
@Leahn Novash: You are again confusing associativity with evaluation order. "a = b = c" is right associative "a = (b = c)", but that says NOTHING about the evaluation order. The evaluation order in C# is ALWAYS left to right.
Daniel
I've read the specifications again more carefully. I stand corrected.
Leahn Novash
A: 

The cause might be that some compilers optimize i++ to be ++i. Most of the time, the end result is the same, but it seems to me to be one of those rare occasions when the compiler is wrong.

I have no access to Visual Studio right now to confirm this, but try disabling code optimization and see if the results will stay the same.

Leahn Novash
i++ and ++i are different and are used in different ways. Any compiler which converted i++ to ++i would invite the anger of many a developer.
NickAldwin
Fortunately, C# defines its behavior rather better than this.
Jon Skeet
I agree that compiler, jitter, or processor optimizations might take place, but only if the results of the optimizations are indistinguishable from the desired result in a single-threaded application: http://blogs.msdn.com/ericlippert/archive/2009/08/10/precedence-vs-order-redux.aspx
Galilyou
A: 

I would expect the post-increment operator to increment the variable right after its value is read. In this case, that means the variable is incremented before the second reference to it.

If that were not so, consider a statement like

data[i++] = data[i++] + data[i++] + data[i++] + 5

If the increments only took effect after everything else, as you describe, every data[i++] in that statement would refer to the same element, and the increment operators would have no effect within the statement.

+4  A: 

For the first snippet, the sequence is:

  1. Declare arr as you described.
  2. Retrieve the value of arr[0], which is 0
  3. Increment the value of arr[0] to 1.
  4. Retrieve the value of arr[(result of #2)] which is arr[0], which (per #3) is 1.
  5. Store that result in value.
  6. value = 1

For the second snippet, the evaluation is still left-to-right.

  1. Where are we storing the result? In data[i++], which is data[1], but now i = 2
  2. What are we adding? data[i] + 5, which is now data[2] + 5, which is 38.

The missing piece is that "post" doesn't mean "after EVERYTHING else." It just means "immediately after I retrieve the current value of that variable." A post increment happening "in the middle of" a line of code is completely normal.
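
Both sequences can be checked with a short program. This is only a sketch: the class and method names are made up for the example, and the two snippets from the question are wrapped in methods so their results can be inspected.

```csharp
using System;

static class EvaluationOrderDemo
{
    public static int FirstSnippet()
    {
        int[] arr = { 0 };
        // The index expression yields 0 and bumps arr[0] to 1
        // before the outer arr[...] is read.
        int value = arr[arr[0]++];
        return value;
    }

    public static int[] SecondSnippet()
    {
        int[] data = { 11, 22, 33 };
        int i = 1;
        // Left side picks data[1] and sets i to 2; right side then reads data[2].
        data[i++] = data[i] + 5;
        return data;
    }

    static void Main()
    {
        Console.WriteLine(FirstSnippet());                     // prints 1
        Console.WriteLine(string.Join(", ", SecondSnippet())); // prints 11, 38, 33
    }
}
```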

VoteyDisciple
A: 

You have to think of assignments in three steps:

  1. Evaluate the left hand side (i.e. work out the location where the value should be stored)
  2. Evaluate right hand side
  3. Assign the value from step 2 to the memory location from step 1.

If you have something like

A().B = C()

Then A() will run first, then C() will run, and then the property setter B will run.

Essentially, you have to think of your statement as

StoreInArray(data, i++, data[i] + 5);
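
A minimal sketch of both points, under some assumptions: `StoreInArray`, `Holder`, `A`, `C` and the log list are all hypothetical names invented for this example. The log records the order in which `A()`, `C()` and the setter of `B` run, and `StoreInArray` shows that ordinary left-to-right argument evaluation gives the same result as the original assignment.

```csharp
using System;
using System.Collections.Generic;

static class AssignmentOrderSketch
{
    static readonly List<string> Log = new List<string>();

    // Hypothetical type with a property whose setter records when it runs.
    class Holder
    {
        int b;
        public int B
        {
            get { return b; }
            set { Log.Add("set B"); b = value; }
        }
    }

    static Holder A() { Log.Add("A"); return new Holder(); }
    static int C() { Log.Add("C"); return 42; }

    // Hypothetical helper standing in for the array element assignment.
    static void StoreInArray(int[] array, int index, int value)
    {
        array[index] = value;
    }

    public static string PropertyOrder()
    {
        Log.Clear();
        A().B = C();
        return string.Join(", ", Log);
    }

    public static int[] ArrayStore()
    {
        int[] data = { 11, 22, 33 };
        int i = 1;
        // Arguments are evaluated left to right: index is 1, then data[2] + 5.
        StoreInArray(data, i++, data[i] + 5);
        return data;
    }

    static void Main()
    {
        Console.WriteLine(PropertyOrder());                 // prints A, C, set B
        Console.WriteLine(string.Join(", ", ArrayStore())); // prints 11, 38, 33
    }
}
```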
Daniel