Note: I noticed some errors in my posted example and have edited to fix them.

The official C# compiler does some interesting things if you don't enable optimization.

For example, a simple if statement:

int x;
// ... //
if (x == 10)
   // do something

becomes something like the following if optimized:

ldloc.0
ldc.i4.s 10
bne.un.s do_not_do_something
// do something
do_not_do_something:

but if we disable optimization, it becomes something like this:

ldloc.0
ldc.i4.s 10
ceq
ldc.i4.0
ceq
stloc.1
ldloc.1
brtrue.s do_not_do_something
// do something
do_not_do_something:

I can't quite get my head around this. Why all that extra code, which is seemingly not present in the source? In C#, the unoptimized version would be roughly equivalent to:

int x;
bool y;
// ... //
y = (x == 10) == false;
if (y)
   goto do_not_do_something;
// do something
do_not_do_something: ;

Does anyone know why it does this?
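
For reference, this is roughly the kind of complete program I am looking at (the value of x just has to be something the compiler cannot fold away); compiling it with csc /optimize- and csc /optimize+ and disassembling with ildasm shows the same pattern as the two IL sequences above.

using System;

class Program
{
    static void Main(string[] args)
    {
        // Read x at run time so the comparison cannot be folded away.
        int x = args.Length > 0 ? int.Parse(args[0]) : 0;

        if (x == 10)
            Console.WriteLine("do something");
    }
}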

+1  A: 

For the specific answer you will need to wait for someone on the C# compiler team, or someone close to that group, to give a detailed explanation of this case.

Generally, though, this is just an artifact of the code generation, where common routines are written to handle many different cases of a particular statement, such as the `if` in your case.

This generalization results in functional but often less than optimal code. That is why the optimization passes exist: to perform various transformations on the generated code, such as removing redundant code, loop unrolling, peephole optimization, code sharing and so on.

Another reason for seeing less optimal code when compiling in debug mode is to support the debugger. For example, a NOP instruction might be inserted into the code to make it possible to set a breakpoint there when running in the debugger, but it is removed for release builds.
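
As a sketch (not taken from any specific compiler version), a debug build of even a trivial method tends to carry these extras, noted in the comments:

static class DebugCodegenSketch
{
    static int Square(int n)
    {                        // debug: a nop is emitted for the opening brace so
                             // a breakpoint can bind to it
        int result = n * n;  // debug: the product goes into a real local slot so
                             // the debugger can display it
        return result;       // debug: the return value is stored to a temporary,
                             // then a branch jumps to a single ret at the end
    }                        // release: the nops, the temporary and the branch
                             // are typically optimized away
}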

Chris Taylor
Well, I can understand that the compiler is a lot less discriminating when it does not optimize, and thus generates sub-optimal code. But what I'm fishing for is what causes it to do this. The condition in an if-statement must evaluate to a boolean for the code to even compile, so why compare it with 0? I can't really think of a case where that comparison wouldn't be redundant.
CPX
@CPX, Of course, I can only provide scenarios here, since I am not familiar with the code gen backend of the C# compiler. But take a simple thing like generating code to evaluate an expression and store the result: that routine can be reused in the code generation of the `if` and `while` statements, for example. Albeit less optimal, the code is reused, and in fact this can help the optimizer, since there is more consistency in the patterns and fewer minor variations due to copy/paste/evolve, so the optimizer can find and optimize these patterns. But it will be interesting to hear a view from MS on this.
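
For illustration only (this shows the pattern the compiler produces, not its actual implementation), both statements in the sketch below evaluate their condition into a temporary bool local and then branch on that local, rather than branching on the comparison directly:

static class ConditionPatternSketch
{
    static void DoSomething() { }

    static void Demo(int x)
    {
        // Unoptimized builds spill the condition into a temporary
        // (stloc/ldloc) for both statements before branching on it.
        if (x == 10)
        {
            DoSomething();
        }

        while (x == 10)
        {
            DoSomething();
            x++; // terminate the loop so the sketch runs to completion
        }
    }
}
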
Chris Taylor
+2  A: 

I don't really see the issue; all the optimized code did was optimize away a singly-referenced local (the stloc/ldloc combo).

The reason it is present in the debug version is so you can see the value of the assignment to the local before using it.

Edit: I now see the other extra ceq.

Update 2:

I see what is happening. Because booleans are represented as 0 and !0, the debug version does the second comparison to invert the result. OTOH, the optimizer can presumably prove that dropping the extra comparison and the temporary local is safe, so it branches on the original comparison directly.

The unoptimized code would actually be like:

int x, _local; // _local is really bool

_local = (x == 10) == 0;  // ceq is ==, not <, not sure why you see that
if (_local)  // as in C, iow _local != 0 implied
{
  ...
}
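
A small illustration of the 0/!0 representation (nothing to do with the compiler itself, just showing how a bool is stored):

using System;

class BoolBits
{
    static void Main()
    {
        // A bool occupies a single byte at runtime: 0 means false and a
        // non-zero value (normally 1) means true, which is why the IL
        // ends up comparing against 0.
        Console.WriteLine(BitConverter.GetBytes(true)[0]);   // prints 1
        Console.WriteLine(BitConverter.GetBytes(false)[0]);  // prints 0
    }
}
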
leppie
*Ahem* Yes, of course, I meant x == 10, not x < 10.
CPX
I noticed some errors in the example I posted; sorry about that. It seems the difference is the following. Not optimized: ceq, ldc.i4.0, ceq, stloc.1, ldloc.1, brtrue.s do_not_do_something. Optimized: bne.un.s do_not_do_something. This *sort of* makes sense. The debug version goes the long way around: it inverts the value returned from *ceq* and then *branches* away if the inverted value is true. The optimized version simply uses *bne.un.s* to branch away directly if the two operands are not equal.
CPX
+13  A: 

I don't fully understand the point of the question. It sounds like you're asking "why does the compiler produce unoptimized code when the optimization switch is off?" which kinda answers itself.

However, I'll take a stab at it. I think the question is actually something like "what design decision causes the compiler to emit a declaration, store and load of local #1, which can be optimized away?"

The answer is because the unoptimized codegen is designed to be clear, unambiguous, easy to debug, and to encourage the jitter to generate code that does not aggressively collect garbage. One of the ways we achieve all those goals is to generate locals for most values that go on the stack, even temporary values. Let's take a look at a more complicated example. Suppose you have:

Foo(Bar(123), 456)

We could generate this as:

push 123
call Bar - this pops the 123 and pushes the result of Bar
push 456
call Foo

That is nice and efficient and small, but it does not meet our goals. It is clear and unambiguous, but it is not easy to debug because the garbage collector could get aggressive. If Foo for some reason does not actually do anything with its first argument then the GC is allowed to reclaim the return value of Bar before Foo runs.
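
To make that concrete (Foo and Bar here are hypothetical placeholders with the same shape as the example):

using System;

class LifetimeSketch
{
    static object Bar(int n) { return new object(); }

    static void Foo(object ignored, int n)
    {
        // Foo never touches its first argument, so in an optimized build
        // nothing keeps Bar's return value rooted and the GC is allowed
        // to reclaim it.
    }

    static void Main()
    {
        // The call from the example; whether Bar's result is kept alive
        // longer depends on whether it was spilled into a local first,
        // which is exactly what the unoptimized build below does.
        Foo(Bar(123), 456);
    }
}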

In the unoptimized build we would generate something more like

push 123
call Bar - this pops the 123 and pushes the result of Bar
store the top of the stack in a temporary location - this pops the stack, and we need it back, so
push the value in the temporary location back onto the stack
push 456
call Foo

Now the jitter has a big hint that says "hey jitter, keep that alive in the local for a while even if Foo doesn't use it"

The general rule here is "make local variables out of all temporary values in the unoptimized build". And so there you go; in order to evaluate the "if" statement we need to evaluate a condition and convert it to bool. (Of course the condition need not be of type bool; it could be of a type implicitly convertible to bool, or a type that implements an operator true/operator false pair.) The unoptimized code generator has been told "aggressively turn all temporary values into locals", and so that's what you get.
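
For completeness, here is a sketch of the "operator true/operator false pair" case; Tristate is a made-up type, just to show that an "if" condition need not be a bool:

using System;

struct Tristate
{
    private readonly int value;   // assumption: negative = false, positive = true

    public Tristate(int value) { this.value = value; }

    public static bool operator true(Tristate t)  { return t.value > 0; }
    public static bool operator false(Tristate t) { return t.value < 0; }
}

class OperatorTrueDemo
{
    static void Main()
    {
        Tristate t = new Tristate(1);

        if (t)   // legal: the compiler invokes operator true to get the bool
        {
            Console.WriteLine("operator true returned true");
        }
    }
}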

I suppose in this case we could suppress that on temporaries that are conditions in "if" statements, but that sounds like making work for me that has no customer benefit. Since I have a stack of work as long as your arm that does have tangible customer benefit, I'm not going to change the unoptimized code generator, which generates unoptimized code, exactly as it is supposed to.

Eric Lippert