Suppose I have the following C code:

unsigned int u = 1234;
int i = -5678;

unsigned int result = u + i;

What implicit conversions are going on here, and is this code safe for all values of u and i? (safe, in the sense that even though result in this example will overflow to some huge positive number, I could cast it back to an int and get the real result)

+13  A: 

Short Answer

Your i will be converted to an unsigned integer by adding UINT_MAX + 1, then the addition will be carried out with the unsigned values, resulting in a large result (depending on the values of u and i).

Long Answer

According to the C99 Standard:

6.3.1.8 Usual arithmetic conversions

  1. If both operands have the same type, then no further conversion is needed.
  2. Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
  3. Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
  4. Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
  5. Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

In your case, we have one unsigned int (u) and one signed int (i). Referring to (3) above, since both operands have the same rank, your i will be converted to an unsigned int.

6.3.1.3 Signed and unsigned integers

  1. When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
  2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
  3. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Now we need to refer to (2) above. Your i will be converted to an unsigned value by adding UINT_MAX + 1. So the result will depend on how UINT_MAX is defined on your implementation. It will be large, but it will not overflow, because:

6.2.5 (9)

A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

Bonus: Arithmetic Conversion Semi-WTF

#include <stdio.h>

int main(void)
{
    unsigned int u = 1;
    int i = -1;

    if(u < i)
        printf("1 < -1");
    else
        printf("boring");

    return 0;
}

You can use this link to try this online: http://codepad.org/LfT4koM1

Bonus: Arithmetic Conversion Side Effect

Arithmetic conversion rules can be used to get the value of UINT_MAX by initializing an unsigned value to -1, i.e.:

unsigned int umax = -1; // All bits of umax set to 1

This is guaranteed to be portable regardless of the signed number representation of the system because of the conversion rules described above. See this SO question for more information: Is it safe to use -1 to set all bits to true?

Ozgur Ozcitak
Whoa there. It's well-defined to go from signed to unsigned, but going from unsigned to signed is implementation-defined.
rlbond
This isn't correct. From a language standpoint, integer conversion from `int` to `unsigned int` has everything to do with the value of the source object and nothing (conceptually) to do with its internal representation. The value is converted using modulo 2^N arithmetic, where N is the number of value bits in an `unsigned int`, whatever representation the implementation uses for `int`.
Charles Bailey
This answer is simply wrong. It's explaining how common implementations work, not how the language works.
R..
+3  A: 

When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.

So it is safe in the sense that the result might be huge and wrong, but it will never crash.

Mats Fredriksson
A: 

When converting from signed to unsigned there are two possibilities. Numbers that were originally positive remain (or are interpreted as) the same value. Numbers that were originally negative will now be interpreted as larger positive numbers.

Tim Ring
A: 

As was previously answered, you can cast back and forth between signed and unsigned without a problem. The border case for signed integers is -1 (0xFFFFFFFF). Try adding and subtracting from that and you'll find that you can cast back and have it be correct.

However, if you are going to be casting back and forth, I would strongly advise naming your variables such that it is clear what type they are, e.g.:

int iValue, iResult;
unsigned int uValue, uResult;

It is far too easy to get distracted by more important issues and forget which variable is what type if they are named without a hint. You don't want to cast to an unsigned and then use that as an array index.

Taylor Price
+2  A: 

Referring to the bible:

  • Your addition operation causes the int to be converted to an unsigned int.
  • Assuming two's complement representation and equally sized types, the bit pattern does not change.
  • Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)
  • The rules are a little more complicated in the case of combining signed and unsigned of differing sizes.
smh
A: 

Conversion from signed to unsigned does not necessarily just copy or reinterpret the representation of the signed value. Quoting the C standard (C99 6.3.1.3):

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

For the two's complement representation that's nearly universal these days, the rules do correspond to reinterpreting the bits. But for other representations (sign-and-magnitude or ones' complement), the C implementation must still arrange for the same result, which means that the conversion can't just copy the bits. For example, (unsigned)-1 == UINT_MAX, regardless of the representation.

In general, conversions in C are defined to operate on values, not on representations.

To answer the original question:

unsigned int u = 1234;
int i = -5678;

unsigned int result = u + i;

The value of i is converted to unsigned int, yielding UINT_MAX + 1 - 5678. This value is then added to the unsigned value 1234, yielding UINT_MAX + 1 - 4444.

(Unlike unsigned overflow, signed overflow invokes undefined behavior. Wraparound is common, but is not guaranteed by the C standard -- and compiler optimizations can wreak havoc on code that makes unwarranted assumptions.)

A: 

Horrible Answers Galore

Ozgur Ozcitak

When you cast from signed to unsigned (and vice versa) the internal representation of the number does not change. What changes is how the compiler interprets the sign bit.

This is completely wrong.

Mats Fredriksson

When one unsigned and one signed variable are added (or any binary operation) both are implicitly converted to unsigned, which would in this case result in a huge result.

This is also wrong. Unsigned ints may be promoted to ints should they have equal precision due to padding bits in the unsigned type.

smh

Your addition operation causes the int to be converted to an unsigned int.

Wrong. Maybe it does and maybe it doesn't.

Conversion from unsigned int to signed int is implementation dependent. (But it probably works the way you expect on most platforms these days.)

Wrong. It is either undefined behavior if it causes overflow or the value is preserved.

Anonymous

The value of i is converted to unsigned int ...

Wrong. Depends on the precision of an int relative to an unsigned int.

Taylor Price

As was previously answered, you can cast back and forth between signed and unsigned without a problem.

Wrong. Trying to store a value outside the range of a signed integer results in undefined behavior.

Now I can finally answer the question.

Should the precision of int be equal to that of unsigned int, u will be promoted to a signed int and you will get the value -4444 from the expression (u+i). Now, should u and i have other values, you may get overflow and undefined behavior, but with those exact numbers you will get -4444 [1]. This value will have type int. But you are storing that value in an unsigned int, so it will then be converted to unsigned int, and result will end up with the value (UINT_MAX+1) - 4444.

Should the precision of unsigned int be greater than that of an int, the signed int will be promoted to an unsigned int, yielding the value (UINT_MAX+1) - 5678, which will be added to the other unsigned int, 1234. Should u and i have other values that make the expression fall outside the range {0..UINT_MAX}, the value (UINT_MAX+1) will either be added or subtracted until the result DOES fall inside the range {0..UINT_MAX}, and no undefined behavior will occur.

What is precision?

Integers have padding bits, sign bits, and value bits. Unsigned integers obviously do not have a sign bit. Unsigned char is further guaranteed not to have padding bits. The number of value bits an integer has is how much precision it has.

[Gotchas]

The sizeof operator alone cannot be used to determine the precision of an integer if padding bits are present. And the size of a byte does not have to be an octet (eight bits) as defined by C99.

[1] The overflow may occur at one of two points: before the addition (during promotion), when the unsigned int is too large to fit inside an int, or after the addition, when the result overflows even though the unsigned int was within the range of an int.


On an unrelated note, I am a recent graduate student trying to find work ;)

Elite Mx
"Unsigned ints may be promoted to ints". Not true. No integer _promotion_ occurs as the types are already rank >= int. 6.3.1.1: "The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any." and 6.3.1.8: "Otherwise, if the operand that has unsigned integer type has rank greater **or equal** to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type." both guarantee that `int` is converted to `unsigned int` when the usual arithmetic conversions apply.
Charles Bailey
6.3.1.8 occurs only after integer promotion. The opening paragraph says "Otherwise, the integer promotions are performed on both operands. THEN the following rules are applied to the promoted operands". So go read the promotion rules in 6.3.1.1 ... "An object or expression with an integer type whose integer conversion rank is less than or EQUAL to the rank of int and unsigned int" and "If an int can represent all values of the original type, the value is converted to an int".
Elite Mx
6.3.1.1 Integer promotion is used to convert some integer types that aren't `int` or `unsigned int` to one of those types where something of type `unsigned int` or `int` is expected. The "or equal" was added in TC2 to allow enumerated types of conversion rank equal to `int` or `unsigned int` to be converted to one of those types. It was never intended that the promotion described would convert between `unsigned int` and `int`. The common type determination between `unsigned int` and `int` is still governed by 6.3.1.8, even post TC2.
Charles Bailey
6.3.1.8 is after promotion. So if you are reading 6.3.1.8, promotion either happened or did not. 6.3.1.8 does not help you determine whether promotion happens. It is after the fact. 6.3.1.1 talks about promotion and I have already quoted the proper text. Not only that, but promotion has one of two types (unsigned or value preserving). C99 is about value preservation. So if you have two unsigned ints, with values x = 5 and y = 6 respectively, it makes sense that if a signed int could represent all the values of an unsigned int, x - y would result in -1. Please quote C99 to make your argument.
Elite Mx
Posting wrong answers while criticizing others' wrong answers doesn't sound like a good strategy for getting work... ;-)
R..
A horrible answer in itself. Too many errors. When the conversion from `unsigned int` to `int` overflows, there's no undefined behavior. The result is either implementation-defined or an implementation-defined signal is raised.
AndreyT