Why does this code behave differently in C++ and C#?

[C++ Example]

int arr[2];
int index = 0;
arr[index] = ++index;

The result of which will be arr[1] = 1;

[C# Example]

int[] arr = new int[2];
int index = 0;
arr[index] = ++index;

The result of which will be arr[0] = 1;

I find this very strange. Surely there must be some rationale for both languages to implement it differently? I wonder what C++/CLI would output?

+3  A: 

The behaviour of using index and ++index inside the same assignment is unspecified in C++. You just can't do that: write arr[index] = index + 1 and increment your variable after that. For that matter, with my C++ compiler on my machine I see arr[0] = 1, and arr[1] is untouched.
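
A minimal sketch of that rewrite, assuming the intent is to store the incremented value in the slot selected by the old index. The function name store_then_increment and the use of std::array are illustrative choices, not part of the original question.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: writes index + 1 into the element selected by the
// current index, then increments index in a separate statement.
std::array<int, 2> store_then_increment() {
    std::array<int, 2> arr{};  // value-initialized: both elements start at 0
    int index = 0;

    assert(index >= 0 && index < 2);  // guard the array access
    arr[index] = index + 1;  // index is only read here, never modified
    ++index;                 // the modification gets its own statement

    return arr;
}
```

Because the read and the write of index sit in separate statements, the result does not depend on the compiler: the returned array holds 1 in element 0 and 0 in element 1.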

Arthur Reutenauer
It's worse than unspecified, it's actually undefined behaviour.
Charles Bailey
Of course, that's what I should have written (it's *specified* as *undefined* ;-) Can't keep a straight mind when you spend a whole morning debugging... (I even found a bug in gdb!)
Arthur Reutenauer
+4  A: 

Your C++ code could, in fact, do anything. arr[index] = ++index; invokes undefined behaviour.

Paul Baker
A: 

The result of the C++ version will not always be what you describe, because you are invoking undefined behaviour. In C++ you get undefined behaviour if you use the value of a variable in an expression that also modifies that variable, unless reading the value is part of determining the value to be written, or the expression contains a sequence point between the read and the write.

In your expression, you are reading the value of index to determine where to assign the result of the right hand side of the =, but the right hand sub-expression also modifies index.
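
To make the conflict concrete, here is a sketch that writes the two orderings reported in this thread as separate, well-defined statements. The function names are hypothetical. Under the rules discussed here the original one-line expression is not limited to these two outcomes, since undefined behaviour permits anything; the sketch only shows why arr[0] = 1 and arr[1] = 1 are both plausible observations.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper: the index is read before the increment happens.
std::array<int, 2> read_index_first() {
    std::array<int, 2> arr{};  // both elements start at 0
    int index = 0;
    int target = index;   // read index for the subscript: 0
    ++index;              // index becomes 1
    arr[target] = index;  // stores 1 in arr[0]
    return arr;
}

// Hypothetical helper: the increment happens before the index is read.
std::array<int, 2> increment_first() {
    std::array<int, 2> arr{};  // both elements start at 0
    int index = 0;
    ++index;              // index becomes 1
    assert(index >= 0 && index < 2);  // guard the array access
    arr[index] = index;   // stores 1 in arr[1]
    return arr;
}
```

The first function matches the arr[0] = 1 result reported for C#, and the second matches the arr[1] = 1 result the question reports for C++.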

Charles Bailey
+1  A: 

In the case of C++, at least, you're invoking undefined behavior by preincrementing and using index without a sequence point in between. If you feed that code to GCC with warnings enabled it will say:

preinc.cpp:6: warning: operation on ‘index’ may be undefined

I'm guessing that it's undefined as well in C#, but I don't know the language. For C and C++ at the very least though, the answer is that the compiler can do anything it wants without being wrong because your code is erroneous. There's no obligation for different compilers (or even the same compiler) to produce consistent results, either.

hobbs
Just as a curiosity, G++ 4.4.1 behaves as C#, setting array[0] = 1.
hobbs
_Undefined behavior_ means it could set `array[1]=1` or set `array[4711]=42`, wipe your hard disk, or turn you bald, depending on whether a debugger is present, it's run the 42nd time, depending on the phase of the moon, how much your girlfriend likes you, or whether you have been nasty to your mum. It makes no sense to say "compiler x, version y, does it _this_ way", because you never know. (http://stackoverflow.com/questions/1553382/1553407#1553407)
sbi
It is *STRICTLY DEFINED* in C#. We do not go in for this crazy business of having simple expressions that have no meaning but the compiler takes them anyways and does some crazy thing. Give us some credit here!
Eric Lippert
'Crazy business'? Eric will probably revisit this when the meaning of 'undefined behaviour' clicks. The problem is that you solved one issue, and introduced another one, and by simply calling 30y backward compatibility as the culprit of 'crazy' act; this makes it even worse than adopting C-style family syntax in a 'new' language..C# and its compiler are so powerful and new, it is starting to look like C++ without pointers all the way through, and with each iteration. Please look for the credit where it is due: Compiler that compiles your C-style Java copycat VM and C# compiler.
rama-jka toti
+1  A: 

Note: According to @Eric Lippert's answer, the behavior is strictly defined for C#, so let me reword my answer on this.

This code:

arr[index] = ++index;

Is hard to read even if the C# compiler knows exactly how to evaluate it and in which order. For this reason alone it should be avoided.

The MSDN page on C# Operators goes so far as to say that this behaviour might be undefined, even though Eric points out it is not. The fact that different sources of documentation disagree (I'll trust Eric on this, however) is also a sign that this is something best left alone.

Lasse V. Karlsen
It is NOT UNDEFINED in C#, it is strictly defined. See above.
Eric Lippert
Perhaps someone should go and update the MSDN documentation then?
Lasse V. Karlsen
Holy goodness, that page is CHOCK FULL of errors. I'll have a talk with the documentation manager immediately. Thanks for bringing that to my attention.
Eric Lippert
I'd suggest a C# language design meeting where 'undefined behaviour' is cleared up to be a well understood term.
rama-jka toti
Undefined behaviour is actually a very well defined term. It doesn't mean "random" or "different every time", rather it means that whatever you observe the code to be doing is not something you should rely on it to do in the future. For instance, moving the code around or changing optimization parameters might change the behaviour.
Lasse V. Karlsen
A: 

index in C# is a value type, which means you return a new instance of the value when you perform operations on it.

If you imagine it as a procedure instead of an operator, the procedure would look like this:

public int Increment(int value)
{
   int returnValue=value+1;
   return returnValue;
}

C++, however, works on the reference of the object, so the procedure would look like:

int Increment(int &value)
{
   value=value+1;
   return value;
}

Note: if you had been applying the operator on an object (say overloaded the ++ operator) then C# would behave like C++, since object types are passed as references.

Law Metzler
Are you saying in C# ++index does not modify index?
Henrik
No, I was just sleep deprived while trying to explain, and not thinking straight.
Law Metzler
+9  A: 

As others have noted, the behaviour of this code is undefined in C/C++. You can get any result whatsoever.

The behaviour of your C# code is strictly defined by the C# standard.

Surely there must be some rationale for both languages to implement it differently?

Well, suppose you were designing C#, and wished to make the language easy for C++ programmers to learn. Would you choose to copy C++'s approach to this problem, namely, leave it undefined? Do you really want to make it easy for perfectly intelligent developers to accidentally write code that the compiler can just make up any meaning for that it wants?

The designers of C# do not believe that undefined behaviour of simple expressions is a good thing, and therefore we have strictly defined what expressions like this mean. We cannot possibly agree with what every C++ compiler does because different C++ compilers give you different results for this sort of code, and so we cannot agree with all of them.

As for why the designers of C++ believe that it is better to leave simple expressions like this to have undefined behaviour, well, you'll have to ask one of them. I could certainly make some conjectures, but those would just be educated guesses.

I've written a number of blog articles about this sort of issue; my most recent one was about almost exactly the code you mention here. Some articles you might want to read:

How the design of C# encourages elimination of subtle bugs:

http://blogs.msdn.com/ericlippert/archive/2007/08/14/c-and-the-pit-of-despair.aspx

Exactly what is the relationship between precedence, associativity, and order of execution in C#?

http://blogs.msdn.com/ericlippert/archive/2008/05/23/precedence-vs-associativity-vs-order.aspx

In what order do the side effects of indexing, assignment and increment happen?

http://blogs.msdn.com/ericlippert/archive/2009/08/10/precedence-vs-order-redux.aspx

Eric Lippert
Thanks. I'll read the articles.
Ivan Zlatanov
It is undefined in C++ because it is also undefined in C. Apparently, different C compilers did different things before the standard was adopted and leaving it undefined was the only sane option, IMHO.
Nemanja Trifunovic