views: 195
answers: 4

Hi All,

My questions are divided into three parts

Question 1
Consider the code below:

#include <iostream>
using namespace std;

int main( int argc, char *argv[])
{

    const int v = 50;
    int i = 0X7FFFFFFF;

    cout<<(i + v)<<endl;

    if ( i + v < i )
    {
        cout<<"Number is negative"<<endl;
    }
    else
    {
        cout<<"Number is positive"<<endl;
    }

    return 0;
}

No specific compiler optimisation options (no -O flags) are used; the executable is built with the basic compilation command g++ -o test main.cpp.

This seemingly very simple code behaves oddly on a SUSE 64-bit OS with gcc version 4.1.2. The expected output is "Number is negative", but on the SUSE 64-bit OS alone the output is "Number is positive".

After some analysis and a 'disass' of the code, I find that the compiler optimises it as follows:

  • Since i appears on both sides of the comparison and cannot change within the same expression, 'i' is removed from the comparison.
  • The comparison then reduces to if ( v < 0 ), where v is a positive constant, so the branch is resolved during compilation itself: only the address of the else-branch cout call is loaded, and no cmp/jmp instructions can be found (see the sketch after this list).
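
In effect the generated code behaves as if the source had been the following (my sketch of the transformation, based on the reading of the disassembly above):

cout<<(i + v)<<endl;

// "i + v < i" is simplified to "v < 0" (i cancelled on both sides); since v is
// the constant 50, the condition is folded to false at compile time and only
// the else branch survives:
cout<<"Number is positive"<<endl;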

I see this behaviour only with gcc 4.1.2 on SUSE 10. When tried on AIX 5.1/5.3 and HP IA64, the result is as expected.

Is the above optimisation valid?
Or is relying on int overflow in this way not a valid use case?

Question 2
Now when I change the conditional statement from if (i + v < i) to if ( (i + v) < i ), the behaviour is still the same. With this, at least, I would personally disagree: since additional parentheses are provided, I expected the compiler to create a temporary of the built-in type and then compare, thus nullifying the optimisation.
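
In other words, I expected the extra parentheses to force something equivalent to this (my sketch; tmp is a hypothetical name):

int tmp = (i + v);    // temporary holding the sum
if ( tmp < i )        // compare the temporary, not the folded expression
{
    cout<<"Number is negative"<<endl;
}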

Question 3
Suppose I have a huge code base and I migrate to a new compiler version; such a bug/optimisation can cause havoc in my system's behaviour. Of course, from a business perspective it is very inefficient to re-test every line of code just because of a compiler upgrade.

I think that, for all practical purposes, these kinds of errors are very difficult to catch (during an upgrade) and will invariably leak into production.

Can anyone suggest a way to ensure that this kind of bug/optimization does not have any impact on my existing system/code base?


PS :

  • When the const on v is removed from the code, the compiler does not perform the optimization.
  • I believe it is perfectly fine to use the overflow mechanism to detect whether the variable is within 50 of the MAX value (in my case).

Update(1)
What do I want to achieve? The variable i is a counter (a kind of syncID). If I perform offline operations (50 of them), then during startup I would like to reset my counter. For this I am checking the boundary value (to reset it) rather than adding blindly.

I am not sure that I am relying on the hardware implementation. I know that 0X7FFFFFFF is the maximum positive value. All I am doing is adding a value to it and expecting the result to be negative. I don't think this logic has anything to do with the hardware implementation.
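
Roughly, the pattern I am relying on looks like this (a simplified sketch, with made-up names):

int syncID = 0X7FFFFFFF;        // persisted counter, shown here at the boundary

// after 50 offline operations, check the boundary instead of adding blindly
if ( syncID + 50 < syncID )     // expected to go negative past 0X7FFFFFFF
{
    syncID = 0;                 // reset during startup
}
else
{
    syncID = syncID + 50;
}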

Anyway, thanks everyone for your input.


Update(2)
Most of the input states that I am relying on lower-level behaviour for overflow checking. I have one question regarding the same:

  • If that is the case, then for an unsigned int how do I validate and reset the value on underflow or overflow? For example, if v=10 and i=0X7FFFFFFE, I want to reset i = 9. Similarly for underflow?

I would not be able to do that unless I check whether the number is negative. So my claim is that an int must become negative when a value is added to +MAX_INT.

Please let me know your inputs.

+9  A: 

It's a known problem, and I don't think it's considered a bug in the compiler. When I compile with gcc 4.5 using -Wall -O2 it warns:

warning: assuming signed overflow does not occur when assuming that (X + c) < X is always false

Although your code does overflow.

You can pass the -fno-strict-overflow flag to turn that particular optimization off.
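
For example, with the build command from the question (illustration only; the flag requires a gcc version that supports it):

g++ -fno-strict-overflow -o test main.cpp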

KennyTM
+1. To emphasize, GCC's behavior is conforming because the Standard says that overflow in such an operation causes undefined behavior.
Johannes Schaub - litb
"assuming signed overflow" - indeed. In what standard is it prescribed, that (0x7fffffff + 50) must overflow?
Ingo
The standard does not require 0x7fffffff+50 to overflow. However, it allows it to overflow, and if it does, then the standard allows undefined behavior.
Keith Randall
-fno-strict-overflow is not present in all gcc versions. Anyway, this is by far the best answer and option I have got. Also, I think my assumption that INT_MAX + x will be negative is valid (I am not relying on the machine implementation). Please let me know your thoughts on this.
kumar_m_kiran
@kumar-m-kiran: your assumption is not valid. The C++ standard says that signed integer overflow is undefined behavior (5/5, and 3.9.1/4, where it is stated that *unsigned* types wrap around; by omission, signed types are not required to do so). What you say is true in Java, but not in C++.
Steve Jessop
@SteveJessop, if that is the case, then for an unsigned int how do I validate and reset the value on underflow or overflow? For example, if v=10 and i=0X7FFFFFFE, I want to reset i = 9. Similarly for underflow.
kumar_m_kiran
You mean something like `if(INT_MAX-v < i) i=9;`?
jpalecek
@kumar_m_kiran: if you just want modular arithmetic, do the addition using `unsigned int` instead of `int` (and for portable code be careful converting back to signed, although this is no problem in the implementation you're currently using), or use the compiler flag KennyTM mentions, to make signed overflow wrap. If you want to check, use jpalecek's example, and of course make sure `v` isn't negative.
Steve Jessop
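
Putting the two suggestions from this comment thread into code (a sketch only; the helper names are hypothetical, and the reset value 9 comes from the example in the question):

#include <climits>

// Option 1: check against INT_MAX before adding, so signed overflow never happens.
int add_or_reset(int i, int v)          // assumes v >= 0
{
    if ( INT_MAX - v < i )              // i + v would overflow
        return 9;                       // reset value from the example above
    return i + v;
}

// Option 2: do the arithmetic in unsigned int, where wraparound is well defined.
unsigned int add_wrapping(unsigned int i, unsigned int v)
{
    return i + v;                       // wraps modulo 2^N, no undefined behavior
}
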
+1  A: 

What does the line:

cout<<(i + v)<<endl;

output in the SUSE example? Are you sure you don't have 64-bit ints?

harald
It is telling that the "bug" occurs on a 64-bit implementation.
Ingo
+2  A: 

Q1: Perhaps the number is indeed positive in a 64-bit implementation? Who knows? Before debugging the code I'd just printf("%d", i+v);

Q2: The parentheses are only there to tell the compiler how to parse an expression. This is usually done in the form of a tree, so the optimizer does not see any parentheses at all. And it is free to transform the expression.

Q3: That's why, as a C/C++ programmer, you must not write code that assumes particular properties of the underlying hardware, such as, for example, that an int is a 32-bit quantity in two's complement form.

Ingo
+3  A: 

Your code produces undefined behavior. The C and C++ languages have no "overflow mechanism" for signed integer arithmetic. Your calculation overflows a signed integer, so the behavior is immediately undefined. Considering it from the "is it a bug in the compiler or not" position is no different from attempting to analyze the i = i++ + ++i examples.

The GCC compiler has an optimization based on that part of the C/C++ language specification. It is called "strict overflow semantics" or something like that. It is based on the fact that adding a positive value to a signed integer in C++ always produces a larger value or results in undefined behavior. This immediately means that the compiler is perfectly free to assume that the sum is always larger. The general nature of that optimization is very similar to the "strict aliasing" optimizations also present in GCC. Both have prompted complaints from the more "hackerish" parts of the GCC user community, many of whom didn't even suspect that the tricks they were relying on in their C/C++ programs were simply illegal hacks.
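
For illustration (my sketch, not part of the original answer), that assumption lets the compiler fold such comparisons away entirely:

bool wraps(int x, int c)       // hypothetical name; c assumed positive
{
    return x + c < x;          // under strict overflow semantics this may be
                               // folded to "return false;", because the only
                               // case where it could be true is undefined
}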

AndreyT