tags:

views:

171

answers:

4

I've got the following code:

#include <iostream>

using namespace std;

int main(int argc, char *argv[])
{
    string a = "a";
    for(unsigned int i=a.length()-1; i+1 >= 1; --i)
    {
     if(i >= a.length())
     {
      cerr << (signed int)i << "?" << endl;
      return 0;
     }
    }
}

If I compile in MSVC with full optimizations, the output I get is "-1?". If I compile in Debug mode (no optimizations), I get no output (expected.)

I thought the standard guaranteed that unsigned integers overflowed in a predictable way, so that when i = (unsigned int)(-1), i+1 = 0, and the loop condition i + 1 >= 1 fails. Instead, the test is somehow passing. Is this a compiler bug, or am I doing something undefined somewhere?

+8  A: 

I remember having this problem in 2001. I'm amazed it's still there. Yes, this is a compiler bug.

The optimiser is seeing

i + 1 >= 1;

Theoretically, we can optimise this by putting all of the constants on the same side:

i >= (1-1);

Because i is unsigned, it will always be greater than or equal to zero.

See this newsgroup discussion here.

Andrew Shepherd
A: 

I'm not certain, but I think you are probably running foul of a bug.

I suspect the trouble is in how the compiler is treating the for control. I could imagine the optimizer doing:

for(unsigned int i=a.length()-1; i+1 >= 1; --i)   // As written

for (unsigned int i = a.length()-1; i >= 0; --i) // Noting 1 appears twice

for (unsigned int i = a.length()-1; ; --i)   // Because i >= 0 at all times

Whether that is what is happening is another matter, but it might be enough to confuse the optimizer.

You would probably be better off using a more standard loop formulation:

for (unsigned i = a.length()-1; i-- > 0; )
Jonathan Leffler
+3  A: 

ISO14882:2003, section 5, paragraph 5:

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined, unless such an expression is a constant expression (5.19), in which case the program is ill-formed.

(Emphasis mine.) So, yes, the behavior is undefined. The standard makes no guarantees of behavior in the case of integer over/underflow.

Edit: The standard seems slightly conflicted on the matter elsewhere.

Section 3.9.1.4 says:

Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2 n where n is the number of bits in the value representation of that particular size of integer.

But section 4.7.2 and .3 says:

2) If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2 n where n is the number of bits used to represent the unsigned type). [Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]

3) If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

(Emphasis mine.)

greyfade
Hmm. Others (on different sites) are quoting section 4.7 of the standard:http://dev.feuvan.net/docs/isocpp/conv.htmlThey are using this to argue that it is defined.
Andrew Shepherd
From 3.9.1 4, and its footnote, I got the impression that unsigned integers are an exception: since 1 is added in the arithmetic of mod 2^n, the result can't be outside the range of values, right?
In conversion to a signed type (as in the case here), the result is implementation-defined. Because the value falls outside the range of *both* types in a signed operation, I take this to fall under section 5. The compiler is wrong here, either way.
greyfade
Looks like 5.5 to me too. (operator-- on an unsigned int with value 0). No conversion is possible there as far as I can see, so 4.7 doesn't apply.@greyfade: If we go with 5.5, how is the compiler wrong? It's undefined then, so by definition nothing it does can be wrong. Or did I miss something?
jalf
Assuming it's 5.5 that's relevant, I guess it comes down to "if the result is mathematically defined". As Etienne pointed out, overflow *is* defined for uints, but I'm not sure about underflow. Modulo arithmetics for negative numbers is implementation-defined in C++. So how does underflow work here?
jalf
The compiler is wrong because in permitting an undefined operation, there's no way it can be *right*. :) Semantics, I know. In any case, as I read the standard, the compiler should at least give a diagnostic, because the whole code snippet is ill-behaved. That the optimizer is buggy is a bonus.
greyfade
The compiler isn't supposed to prevent an undefined operation. That's the programmer's responsibility. The entire point in undefined behavior is that the compiler is free to do as it pleases. Why should it give a diagnostic?
jalf
Because this is potentially not what the programmer intended.
greyfade
But the compiler doesn't have any responsibility to point that out. It's just undefined behavior. Of course, it'd be nice and friendly to provide diagnostic, but it doesn't have to. (and the optimizer isn't buggy either, then. Once we're in UB, nothing the compiler does can be wrong)
jalf
I could be completely off-base. Of course, the compiler - not being wrong - could be too optimistic. Or, not being right, could be too pessimistic - like me.
greyfade
Hmm i thought 3.9.1 4 and the footnote there are pretty clear. It doesn't seem to define what modulo arithmetics mean, but i think it has a defined meaning in math: http://en.wikipedia.org/wiki/Modular_arithmetic , and it also has applications in 4.7/2 and 5.3.1/7. i don't think here is UB
Johannes Schaub - litb
Modulo arithmetics are unambiguous enough.
MSalters
unsigned has been used for ages for hash methods, because they have defined meaning for overflow/underflow (getting into range again) by 3.9.1 4
Johannes Schaub - litb
A: 

Yup, I just tested this on Visual Studio 2005, it definitely behaves differently in Debug and Release. I wonder if 2008 fixes it.

Interestingly it complained about your implicit cast from size_t (.length's result) to unsigned int, but has no problem generating bad code.

JZig