I have a value like this:

int64_t s_val = SOME_SIGNED_VALUE;

How can I get a

uint64_t u_val

that has exactly the same bit pattern as s_val, but is treated as unsigned?

This may be really simple, but after looking on Stack Overflow and elsewhere I haven't turned up the answer.

+7  A: 
int64_t s_val = SOME_SIGNED_VALUE;
uint64_t u_val = static_cast<uint64_t>(s_val);

C++ Standard 4.7/2 states that:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]

On the other hand, the Standard says that "The mapping performed by reinterpret_cast is implementation-defined. [Note: it might, or might not, produce a representation different from the original value. ]" (5.2.10/3). So I'd recommend using static_cast.
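
To make the modulo rule concrete, here is a minimal sketch. The names to_unsigned and check_to_unsigned are made up for illustration, and it assumes a C++11 compiler with <cstdint>.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Value conversion as described in 4.7/2: the result is s_val modulo 2^64.
constexpr std::uint64_t to_unsigned(std::int64_t s_val) {
    return static_cast<std::uint64_t>(s_val);
}

// Hypothetical helper for illustration: checks a few boundary values.
inline void check_to_unsigned() {
    // Zero and positive values are unchanged.
    assert(to_unsigned(0) == 0u);
    // -1 is congruent to 2^64 - 1, the largest uint64_t.
    assert(to_unsigned(-1) == std::numeric_limits<std::uint64_t>::max());
    // The most negative value, -2^63, is congruent to 2^63.
    assert(to_unsigned(std::numeric_limits<std::int64_t>::min()) ==
           (std::uint64_t{1} << 63));
}
```

Calling check_to_unsigned() should pass silently. As far as I know, C99 requires the exact-width type int64_t to be two's complement with no padding bits, so for this particular type the value conversion and the bit pattern should agree.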

Kirill V. Lyadvinsky
+2  A: 

You can also reinterpret_cast it, or use a union:

union {
   int64_t i64;
   uint64_t ui64;
} variable;

variable.i64 = SOME_SIGNED_VALUE;
uint64_t a_copy = variable.ui64;
xtofl
`static_cast` will not lead to losing bit pattern.
Kirill V. Lyadvinsky
it doesn't. I wonder why not...
xtofl
I was a bit worried about static_cast<>. I was also going to suggest using reinterpret_cast<> because of the bit pattern statement. Are you sure static_cast<> will work? (I think it will, but I don't have a copy of the standard handy.) But reinterpret_cast<> is also an indication that it is an unsafe cast.
Martin York
@Martin: the unsafety is probably understood since he's concerned about the bits, not the value.
xtofl
4.7/2 "<...>In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern". `static_cast` uses conversion from 4.7/2.
Kirill V. Lyadvinsky
Confusion on this exact issue is why I like to cast references instead.
T.E.D.
+3  A: 

Generally speaking, it doesn't matter whether you use static_cast<uint64_t> or reinterpret_cast<uint64_t&>. So long as you are running on a processor that uses two's complement to represent negative numbers, the result is the same. (Practically all modern processors use that.) Under two's complement, a positive number in a signed int is represented the same way in an unsigned int; a negative number will be reinterpreted as a large positive number in the unsigned form.

Basically, what your cast does is tell the compiler to produce different assembly instructions when dealing with that value. For example, there are different instructions for multiplication and division of signed integers, although addition and subtraction remain the same (read the Wikipedia link and you'll understand).
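
A rough sketch of that difference, with helper names (as_unsigned, same_sum_bits, same_quotient_bits) invented for this example. It assumes C++11 and a two's complement int64_t.

```cpp
#include <cassert>
#include <cstdint>

// Same bits, viewed as unsigned.
constexpr std::uint64_t as_unsigned(std::int64_t v) {
    return static_cast<std::uint64_t>(v);
}

// Addition: adding as signed and adding as unsigned give the same bits
// (assuming a + b does not overflow, which is undefined for signed types).
constexpr bool same_sum_bits(std::int64_t a, std::int64_t b) {
    return as_unsigned(a + b) == as_unsigned(a) + as_unsigned(b);
}

// Division: signed and unsigned division can disagree once a negative
// operand is involved (b must be non-zero).
constexpr bool same_quotient_bits(std::int64_t a, std::int64_t b) {
    return as_unsigned(a / b) == as_unsigned(a) / as_unsigned(b);
}
```

For instance, same_sum_bits(-8, 3) holds, while same_quotient_bits(-8, 2) does not: -8 / 2 is -4 as signed, but the same bits divided as unsigned give 0x7FFFFFFFFFFFFFFC.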

int3
But what about a negative value?
Martin York
The rule is "if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type", which works for 2's complement, and I think 1's complement, but I wouldn't swear that it kept the same bit pattern for more bizarro representations of negative numbers.
Michael Burr
The above is from the C99 standard, the C++ standard says this: "the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]"
Michael Burr
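To make the two wordings quoted above concrete, here is a small sketch that applies the C99 "repeatedly add or subtract" rule by hand and compares it with the built-in conversion. It uses an 8-bit destination only to keep the numbers readable; wrap_to_u8 and cast_to_u8 are hypothetical names, and C++11 is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Applies the C99 wording literally for an 8-bit destination:
// add or subtract 256 (one more than the maximum, 255) until in range.
constexpr int wrap_to_u8(int value) {
    return value < 0   ? wrap_to_u8(value + 256)
         : value > 255 ? wrap_to_u8(value - 256)
                       : value;
}

// The built-in conversion, for comparison (result widened back to int).
constexpr int cast_to_u8(int value) {
    return static_cast<std::uint8_t>(value);
}
```

So -5 becomes -5 + 256 = 251 and 300 becomes 300 - 256 = 44 either way, which is the same as reducing modulo 2^8.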
I believe almost all processors use the two's complement. But I'll add it in.
int3
There's no doubt that two's complement is by far the most likely representation used for negative numbers. I personally haven't worked on anything other than two's complement machines except for some Univac (sounds like something from a sci-fi novel) thing back in University. It had 36-bit words to boot.
Michael Burr
Unfortunately for me, until the standard says that integers are represented in 2's complement, the above reassurance is not good enough. Historically, too many programs have been broken by assumptions just like this one that are never validated again when the code is ported to another platform. Tracing the resulting bug is then next to impossible.
Martin York
+2  A: 

The logical bit pattern (the bits of the value representation, i.e. the values of the binary digits) can only be preserved if the original signed value is non-negative, because negative values cannot be represented by an unsigned integer variable. All you need to do is assign your signed value to your unsigned integral object, and you are done:

uint64_t u_val = s_val;

An explicit cast is not necessary, but might be used to suppress compiler warnings.

As for the physical bit pattern (i.e. what you see in raw memory, the bits of the object representation), you simply can't "convert" it that way. The C++ language does not provide any method of conversion that is guaranteed to preserve the physical bit pattern. All you can do is reinterpret the memory occupied by the signed object as an unsigned object of the same size:

static_assert(sizeof(int64_t) == sizeof(uint64_t), "sizes must match");
uint64_t u_val = reinterpret_cast<uint64_t&>(s_val);

Again, this is not a conversion, but rather a memory reinterpretation. This is not guaranteed to work and this is generally illegal.
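
If the goal really is the raw memory contents, one commonly used alternative that does not depend on reinterpreting an object through another type is to copy the bytes with std::memcpy. A minimal sketch, with function names (bits_of, from_bits) chosen just for this example and C++11 assumed:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the object representation (the raw bytes) of a signed 64-bit
// value into an unsigned object of the same size.
inline std::uint64_t bits_of(std::int64_t s_val) {
    static_assert(sizeof(std::uint64_t) == sizeof(std::int64_t),
                  "sizes must match for a byte-wise copy");
    std::uint64_t u_val;
    std::memcpy(&u_val, &s_val, sizeof u_val);
    return u_val;
}

// The inverse: copies the bytes back into a signed object.
inline std::int64_t from_bits(std::uint64_t u_val) {
    std::int64_t s_val;
    std::memcpy(&s_val, &u_val, sizeof s_val);
    return s_val;
}
```

A round trip such as from_bits(bits_of(-42)) gives back -42. Compilers typically optimize the memcpy away, so this should cost no more than the cast.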

AndreyT
Good point to distinguish "conversion" from "interpretation"!
xtofl
How is it illegal? What do you mean by 'work'? You _can_ `reinterpret` back, can't you?
xtofl
It is illegal because the C++ language explicitly prohibits accessing memory occupied by an object of type `T` as an object of a different type `U` (with few exceptions). In other words, reading reinterpreted memory is almost always illegal.
AndreyT
yes, good point about "conversion" and "interpretation." Title of question now uses correct terminology.
bbg
+1  A: 

Note that you don't need the cast at all. For all the wrangling about whether the cast will munge bits or not for negative representations, one thing has gotten lost: the cast is completely unnecessary.

Because of the conversions that C/C++ will do (and how casting is defined), this:

int64_t s_val = SOME_SIGNED_VALUE;
uint64_t u_val = s_val;

is exactly equivalent to:

int64_t s_val = SOME_SIGNED_VALUE;
uint64_t u_val = static_cast<uint64_t>(s_val);

That said, you might still want the cast because it signals intent. However, I've heard it argued that you shouldn't use unnecessary casts because they can silence the compiler in situations where you might want a warning.
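
A quick sketch of the two spellings side by side. The function names implicit_conv and explicit_conv are made up for this example, and C++11 is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Implicit conversion: compiles cleanly by default, but GCC and Clang
// can warn about it when -Wsign-conversion is enabled.
constexpr std::uint64_t implicit_conv(std::int64_t s_val) {
    return s_val;
}

// Explicit conversion: same result, and the cast suppresses that warning.
constexpr std::uint64_t explicit_conv(std::int64_t s_val) {
    return static_cast<std::uint64_t>(s_val);
}
```

Both functions return the same value for every input. The only difference is whether the compiler gets a chance to complain.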

Pick your poison.

Michael Burr