ansaurus

Question

Answer 1

+7 A:

int64_t s_val = SOME_SIGNED_VALUE;
uint64_t u_val = static_cast<uint64_t>(s_val);

C++ Standard 4.7/2 states that:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2ⁿ where n is the number of bits used to represent the unsigned type). [Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]

On the other hand, the Standard says that "The mapping performed by reinterpret_cast is implementation-defined. [Note: it might, or might not, produce a representation different from the original value. ]" (5.2.10/3). So I'd recommend using static_cast.
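
To make the quoted rule concrete, here is a minimal sketch of what the modulo 2ⁿ conversion produces for a few 64-bit inputs. The helper names `to_unsigned` and `check_to_unsigned` are hypothetical, introduced only for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: the value conversion described in 4.7/2.
constexpr std::uint64_t to_unsigned(std::int64_t s) {
    return static_cast<std::uint64_t>(s);
}

inline void check_to_unsigned() {
    // Non-negative values are unchanged.
    assert(to_unsigned(42) == 42u);
    // -1 is congruent to 2^64 - 1 modulo 2^64.
    assert(to_unsigned(-1) == std::numeric_limits<std::uint64_t>::max());
    // The most negative value maps to 2^63.
    assert(to_unsigned(std::numeric_limits<std::int64_t>::min())
           == (std::uint64_t{1} << 63));
}
```

The result is defined purely in terms of values, so it does not depend on how the platform stores negative numbers.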

Kirill V. Lyadvinsky 2009-11-17 19:59:49

Answer 2

+2 A:

You can also reinterpret_cast it, or use a union:

union {
   int64_t i64;
   uint64_t ui64;
} variable;

variable.i64 = SOME_SIGNED_VALUE;
uint64_t a_copy = variable.ui64;
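
If the goal is to copy the stored bytes rather than convert the value, a commonly used alternative to the union is `std::memcpy`. This is only a sketch, assuming both types have the same size; the name `bits_as_unsigned` is hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the bytes of a signed 64-bit object into an
// unsigned one, without performing a value conversion.
inline std::uint64_t bits_as_unsigned(std::int64_t s) {
    static_assert(sizeof(std::int64_t) == sizeof(std::uint64_t),
                  "sizes must match");
    std::uint64_t u;
    std::memcpy(&u, &s, sizeof u);
    return u;
}

inline void check_bits_as_unsigned() {
    // Non-negative values have the same bytes in both types.
    assert(bits_as_unsigned(42) == 42u);
}
```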

xtofl 2009-11-17 20:01:59

`static_cast` will not lose the bit pattern.

Kirill V. Lyadvinsky 2009-11-17 20:04:04

it doesn't. I wonder why not...

xtofl 2009-11-17 20:06:35

I was a bit worried about static_cast<>. I was also going to suggest using reinterpret_cast<> because of the bit pattern statement. Are you sure static_cast<> will work? (I think it will, but I don't have a copy of the standard handy.) But reinterpret_cast<> is also an indication that it is an unsafe cast.

Martin York 2009-11-17 20:07:39

@Martin: the unsafety is probably understood since he's concerned about the bits, not the value.

xtofl 2009-11-17 20:12:29

4.7/2 "<...>In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern". `static_cast` uses conversion from 4.7/2.

Kirill V. Lyadvinsky 2009-11-17 20:15:04

Confusion on this exact issue is why I like to cast references instead.

T.E.D. 2009-11-17 20:22:45

Answer 3

+3 A:

Generally speaking, it doesn't matter whether you use static_cast<int64_t> or reinterpret_cast<int64_t>. So long as you are running on a processor that uses two's complement to represent negative numbers, the result is the same. (Practically all modern processors use that.) Under two's complement, a positive number in a signed int is represented the same way in an unsigned int; if it's a negative number it'll be reinterpreted as a large positive number in the unsigned form.

Basically, what your cast does is tell the compiler to produce different assembly instructions when dealing with that value. E.g., there are different instructions for multiplication and division of signed integers, although addition and subtraction remain the same (read the Wikipedia link and you'll understand).
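
A small sketch of that difference, assuming a two's complement platform. The same starting bits give different results under division depending on the type, while addition agrees modulo 2^64. The names `halve_signed`, `halve_unsigned` and `check_division_differs` are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers: the same operation on the two interpretations.
constexpr std::int64_t  halve_signed(std::int64_t v)    { return v / 2; }
constexpr std::uint64_t halve_unsigned(std::uint64_t v) { return v / 2; }

inline void check_division_differs() {
    const std::int64_t  s = -8;
    const std::uint64_t u = static_cast<std::uint64_t>(s);  // 2^64 - 8

    // Signed division treats the value as negative.
    assert(halve_signed(s) == -4);
    // Unsigned division treats the same bits as a large positive number.
    assert(halve_unsigned(u) == (std::uint64_t{1} << 63) - 4);
    // Addition gives matching results once both are viewed as unsigned.
    assert(static_cast<std::uint64_t>(s + 3) == u + 3);
}
```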

int3 2009-11-17 20:04:48

But what about a negative value?

Martin York 2009-11-17 20:08:49

The rule is "if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type", which works for 2's complement, and I think 1's complement, but I wouldn't swear that it keeps the same bit pattern for more bizarre representations of negative numbers.

Michael Burr 2009-11-17 20:13:08

The above is from the C99 standard; the C++ standard says this: "the resulting value is the least unsigned integer congruent to the source integer (modulo 2ⁿ where n is the number of bits used to represent the unsigned type). [Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]"
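
The two wordings describe the same arithmetic. As a rough sketch, the "repeatedly add or subtract" rule can be written out by hand for an 8-bit destination (where the numbers stay readable) and compared with what the cast does. `convert_by_rule` and `check_convert_by_rule` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: applies the C99 wording literally for uint8_t.
// "One more than the maximum value" of uint8_t is 255 + 1 = 256.
inline std::uint8_t convert_by_rule(int value) {
    const int modulus = 256;
    while (value < 0)   value += modulus;  // e.g. -3 + 256 = 253
    while (value > 255) value -= modulus;  // e.g. 300 - 256 = 44
    return static_cast<std::uint8_t>(value);
}

inline void check_convert_by_rule() {
    // The hand-written rule and the built-in conversion agree.
    assert(convert_by_rule(-3) == static_cast<std::uint8_t>(-3));
    assert(convert_by_rule(300) == static_cast<std::uint8_t>(300));
}
```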

Michael Burr 2009-11-17 20:15:42

I believe almost all processors use the two's complement. But I'll add it in.

int3 2009-11-17 20:19:38

There's no doubt that two's complement is by far the most likely representation used for negative numbers. I personally haven't worked on anything other than two's complement machines except for some Univac (sounds like something from a sci-fi novel) thing back in University. It had 36-bit words to boot.

Michael Burr 2009-11-17 20:22:55

Unfortunately for me, until the standard says that integers are represented in 2's complement, the above reassurance is not good enough. Historically, too many programs have been broken by assumptions just like this one that are never validated again when the code is ported to another platform. Then tracing the bug is next to impossible.

Martin York 2009-11-17 21:17:45

Answer 4

+2 A:

The logical bit pattern (the bits of the value representation), i.e. the values of the binary digits, can only be preserved if the original signed value was non-negative, because negative values cannot be represented by an unsigned integer variable. All you need to do is assign your signed value to your unsigned integral object and you are done:

uint64_t u_val = s_val;

An explicit cast is not necessary, but might be used to suppress compiler warnings.

As for the physical bit pattern (i.e. what you see in raw memory, the bits of the object representation), you simply can't "convert" it that way. The C++ language does not provide any method of conversion that is guaranteed to preserve the physical bit pattern. All you can do is reinterpret the memory occupied by the signed object as an unsigned object of the same size:

STATIC_ASSERT(sizeof(int64_t) == sizeof(uint64_t));
uint64_t u_val = reinterpret_cast<uint64_t&>(s_val);

Again, this is not a conversion, but rather a memory reinterpretation. This is not guaranteed to work and this is generally illegal.
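
One way to see the difference between the value and the object representation on a given platform is to compare the raw bytes before and after the value conversion. This is a minimal sketch; `conversion_keeps_bytes` and `check_conversion_keeps_bytes` are hypothetical names, and the result it reports only describes the platform it runs on:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if converting the value left the stored bytes
// unchanged on this platform.
inline bool conversion_keeps_bytes(std::int64_t s) {
    const std::uint64_t u = static_cast<std::uint64_t>(s);
    static_assert(sizeof s == sizeof u, "sizes must match");
    return std::memcmp(&s, &u, sizeof s) == 0;
}

inline void check_conversion_keeps_bytes() {
    // Non-negative values are stored identically in both types.
    assert(conversion_keeps_bytes(0));
    assert(conversion_keeps_bytes(42));
}
```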

AndreyT 2009-11-17 20:15:39

Good point to distinguish "conversion" from "interpretation"!

xtofl 2009-11-17 20:18:09

How is it illegal? What do you mean by 'work'? You _can_ `reinterpret` back, can't you?

xtofl 2009-11-17 20:19:17

It is illegal because the C++ language explicitly prohibits accessing memory occupied by an object of type `T` as an object of a different type `U` (with few exceptions). In other words, reading reinterpreted memory is almost always illegal.

AndreyT 2009-11-17 20:21:36

yes, good point about "conversion" and "interpretation." Title of question now uses correct terminology.

bbg 2009-11-17 20:36:14

Answer 5

+1 A:

Note that you don't need the cast at all. For all the wrangling about whether the cast will munge bits or not for negative representations, one thing has gotten lost: the cast is completely unnecessary.

Because of the conversions that C/C++ will do (and how casting is defined), this:

int64_t s_val = SOME_SIGNED_VALUE;
uint64_t u_val = s_val;

is exactly equivalent to:

int64_t s_val = SOME_SIGNED_VALUE;
uint64_t u_val = static_cast<uint64_t>(s_val);

That said, you might still want the cast because it signals intent. However, I've heard it argued that you shouldn't use unnecessary casts because it can silence the compiler in situations where you might want a warning.

Pick your poison.
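
As a quick check of that equivalence, the sketch below runs the same inputs through both forms. The function names `implicit_convert`, `explicit_convert` and `check_same_result` are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: relies on the implicit conversion.
inline std::uint64_t implicit_convert(std::int64_t s) {
    std::uint64_t u = s;  // may trigger a sign-conversion warning
    return u;
}

// Hypothetical helper: spells the same conversion out with a cast.
inline std::uint64_t explicit_convert(std::int64_t s) {
    return static_cast<std::uint64_t>(s);
}

inline void check_same_result() {
    assert(implicit_convert(-1) == explicit_convert(-1));
    assert(implicit_convert(123) == explicit_convert(123));
}
```

Some compilers will warn about the implicit form when sign-conversion warnings are enabled, which is the trade-off described above.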

Michael Burr 2009-11-17 20:53:34


interpret signed as unsigned
