See this code snippet

#include <stdio.h>

int main(void)
{
    unsigned int a = 1000;
    int b = -1;
    if (a > b) printf("A is BIG! %d\n", a - b);
    else printf("a is SMALL! %d\n", a - b);
    return 0;
}

This gives the output: a is SMALL! 1001

I don't understand what's happening here. How does the > operator work in this case? Why is "a" smaller than "b"? And if it is indeed smaller, why do I get a positive number (1001) as the difference?

Kindly explain.

A: 

You are doing an unsigned comparison, i.e. comparing 1000 to 2^32 - 1 (assuming a 32-bit unsigned int).

The output is signed because of %d in printf.

N.B. sometimes the behavior when you mix signed and unsigned operands is compiler-specific. I think it's best to avoid mixing them and to cast explicitly when in doubt.
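
One way to follow the "cast when in doubt" advice is to deal with the negative case before any conversion can happen. This is only a sketch: the helper name unsigned_gt_signed is made up for illustration and is not part of any library.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true when a is mathematically greater than b.
   A negative b is handled before any conversion to unsigned int can occur. */
bool unsigned_gt_signed(unsigned int a, int b)
{
    if (b < 0) {
        return true;            /* every unsigned value exceeds a negative one */
    }
    return a > (unsigned int)b; /* b >= 0 here, so the cast preserves its value */
}
```

With this, unsigned_gt_signed(1000u, -1) is true, which matches the mathematical ordering the question expected.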

antti.huima
The subtraction doesn't make anything signed. Subtraction is the same operation for both signed and unsigned values.
wj32
Incorrect, subtraction between `int` and `unsigned int` operands is evaluated as unsigned subtraction and the result is, of course, unsigned.
AndreyT
AndreyT is right, the real problem is the %d in printf
antti.huima
+3  A: 

-1 when converted to an unsigned int is 4,294,967,295 which is indeed ≥ 1000.

Even if you do the subtraction in the unsigned world, 1000 - 4,294,967,295 = -4,294,966,295, which wraps around modulo 2^32 (add 4,294,967,296) to 1,001, which is what you get.

That's why gcc will spit a warning when you compare unsigned with signed. (If you don't see a warning, pass the -Wsign-compare flag.)
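
The arithmetic above can be checked with a small sketch in which the conversions are written out as explicit casts. The function names are hypothetical, and the concrete value 4,294,967,295 assumes a 32-bit unsigned int; the code itself only relies on UINT_MAX.

```c
#include <limits.h>

/* Hypothetical helper: checks that -1 converted to unsigned int is UINT_MAX. */
int minus_one_converts_to_uint_max(void)
{
    return (unsigned int)-1 == UINT_MAX;
}

/* Hypothetical helper: the comparison a > b as it is evaluated after
   b has been converted to unsigned int (cast written out explicitly). */
int greater_after_conversion(unsigned int a, int b)
{
    return a > (unsigned int)b;
}

/* Hypothetical helper: a - b in unsigned arithmetic, which wraps
   modulo UINT_MAX + 1 instead of going negative. */
unsigned int difference_after_conversion(unsigned int a, int b)
{
    return a - (unsigned int)b;
}
```

For a = 1000 and b = -1 the comparison yields 0 (false) and the difference is 1001, whatever the width of unsigned int, because 1000 - UINT_MAX wraps to 1000 + 1.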

KennyTM
Thanks for the answer.. that made it clear
Gitmo
I downvoted because of "4,294,967,295 (2's complement)". It has nothing to do with 2's complement. It will yield the same value on a 1's complement machine. And will yield a different value on a different bitwidth integer.
Johannes Schaub - litb
@Schaub: Maybe I wasn't clear, but what I mean is that 4,294,967,295 (which is the 2's complement of 1) is indeed ≥1. Also, on a 1's complement machine the representation of -1 is 4,294,967,294.
KennyTM
@KennyTM: as litb says, it has nothing to do with the representation. On a 1s' complement machine, converting -1 to unsigned results in UINT_MAX, it doesn't result in the 1s' complement bit pattern being reinterpreted. One of the several ways in which 2's complement is convenient, is that C (un)signed conversions don't change the bit pattern. That's particular to 2's complement: the C conversions to unsigned types are defined in terms of modulo arithmetic, not in terms of bit pattern. On 1s' complement, the implementation has to do some actual work to come up with UINT_MAX.
Steve Jessop
@Kenny, the edit is better, but still there is no guarantee that `UINT_MAX` is 4,294,967,295. Also see http://stackoverflow.com/questions/1863153
Alok
+4  A: 

Binary operations between different integral types are performed within a "common" type defined by the so-called usual arithmetic conversions (see the language specification, 6.3.1.8). In your case the "common" type is unsigned int. This means that the int operand (your b) gets converted to unsigned int before the comparison, as well as for the purpose of performing the subtraction.

When -1 is converted to unsigned int the result is the maximal possible unsigned int value (same as UINT_MAX). Needless to say, it is going to be greater than your unsigned 1000 value, meaning that a > b is indeed false and a is indeed small compared to (unsigned) b. The if in your code should take the else branch, which is what you observed in your experiment.

The same conversion rules apply to subtraction. Your a-b is really interpreted as a - (unsigned) b and the result has type unsigned int. Such a value cannot be printed with the %d format specifier, since %d only works with signed values. Your attempt to print it with %d results in undefined behavior, so the value that you see printed (even though it has a logical deterministic explanation in practice) is completely meaningless from the point of view of the C language.
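
A way to sidestep the question entirely is to print the unsigned result with %u, the specifier that matches unsigned int. The sketch below formats into a caller-supplied buffer so the result can be inspected; format_difference is a hypothetical name used only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: writes a - b (computed as unsigned int) into buf
   using %u, the conversion specifier that matches unsigned int.
   Returns whatever snprintf returns (the number of characters needed). */
int format_difference(char *buf, size_t size, unsigned int a, int b)
{
    unsigned int diff = a - (unsigned int)b; /* same conversion the compiler applies */
    return snprintf(buf, size, "%u", diff);
}
```

Called with a = 1000 and b = -1, this writes the text 1001 into the buffer. In the original program the equivalent change would be to replace %d with %u in both printf calls.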

Edit: Actually, I could be wrong about the undefined behavior part. According to the C language specification, the common part of the range of the corresponding signed and unsigned integer types shall have identical representation (implying, according to footnote 31, "interchangeability as arguments to functions"). So the result of the a - b expression is unsigned 1001 as described above, and unless I'm missing something, it is legal to print this specific unsigned value with the %d specifier, since it falls within the positive range of int. Printing (unsigned) INT_MAX + 1 with %d would be undefined, but 1001u is fine.
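
The range condition mentioned in the edit can be written down as a small check. This is just a sketch of the range test, with a made-up function name; it does not settle whether the standard actually permits passing such a value to %d.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when value lies in the range shared by
   int and unsigned int (0 .. INT_MAX), and 0 otherwise. */
int fits_in_int(unsigned int value)
{
    return value <= (unsigned int)INT_MAX;
}
```

Here fits_in_int(1001u) returns 1, while fits_in_int(UINT_MAX) returns 0.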

AndreyT
Although we know or can guess enough about the calling convention of his implementation to conclude that what has happened is the unsigned result of a-b, which is in fact 1001, has been passed through the varargs unscathed, and reinterpreted as signed without changing the value.
Steve Jessop
Yeah passing unsigned int and doing `va_arg(ap, int)` alone is not UB yet. But it's indeed UB to violate printf's requirements on expecting an `int`. It sounds silly to me, though. Why haven't they specified for printf: "The type of the next argument shall be a signed or unsigned int, and shall be within range of int".
Johannes Schaub - litb
@Johannes: Actually, it might be already specified. See my edit.
AndreyT
But whether it has a compatible representation is not important, I think. If the Standard says something is UB, then it is UB: it's not superseded by some statement in a footnote. I think this just means that there is some range of freedom in what you can do, without hard requirements on the part of the standard. Like with reading from different union members (this is not UB up front in C, I think, but can be well defined if both members have compatible representations). But calling some function through a prototype-less function pointer expression with incompatible arguments stays UB, for instance.
Johannes Schaub - litb
In this case, the `fprintf` description states (for %d): "The int argument is converted to ..." and "If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined." So I don't believe it's well defined. Maybe someone from Usenet knows?
Johannes Schaub - litb
I can't figure it out from 6.2.6.2, but I think the standard only compares the value bits of corresponding signed and unsigned integral types, not any padding bits. Is it legal for an implementation to ignore the padding bits in unsigned int, but for any padding bit set in int to be a trap representation? I don't know, but if so then the 1001 generated might just happen to have a padding bit set, and therefore reinterpreting it as int would be U.B. Not to say necessarily that there even are any gcc targets with padding bits in int, let alone ones with this odd property...
Steve Jessop