I've seen this pattern used a lot in C & C++.

unsigned int flags = -1;  // all bits are true

Is this a good portable way to accomplish this? Or is using 0xffffffff or ~0 better?

+1  A: 

It is certainly safe, as -1 will always have all available bits set, but I like ~0 better. -1 just doesn't make much sense for an unsigned int. 0xFF... is not good because it depends on the width of the type.

Zifre
"0xFF... is not good because it depends on the width of the type." Which is the only sane way to go, I think. You're supposed to define clearly what each flag/bit means in your program. So, if you define that you're using the lowest 32 bits to store flags, you should restrict yourself to use those 32 bits, whether the actual size of the int is 32 or 64.
Juan Pablo Califano
+7  A: 

I'm not sure using an unsigned int for flags is a good idea in the first place in C++. What about bitset and the like?

std::numeric_limits<unsigned int>::max() is better because 0xffffffff assumes that unsigned int is a 32-bit integer.

Edouard A.
I like this because it's standard, but it's too wordy and it makes you state the type twice. Using ~0 is probably safer since 0 can be any integer type. (Although I'm aware it smells too much of C.)
Marcus Lindblom
The fact it is wordy can be seen as an advantage. But I like ~0 as well.
Edouard A.
You can mitigate the wordiness with the standard UINT_MAX macro, since you're hardcoding the type unsigned int anyway.
Roger Pate
+5  A: 

Practically: Yes

Theoretically: No.

-1 = 0xFFFFFFFF (or whatever size an int is on your platform) is only true with two's complement arithmetic. In practice, it will work, but there are legacy machines out there (IBM mainframes, etc.) where you've got an actual sign bit rather than a two's complement representation. Your proposed ~0 solution should work everywhere.

Drew Hall
I said that too. But then I realised I was wrong, since signed -1 always converts to the unsigned type's maximum value under the conversion rules, regardless of the value representations. At least, it does in C++; I don't have the C standard to hand.
Steve Jessop
There is an ambiguity. -1 is not 0xFFFFFFFF. But -1 is 0xFFFFFFFF if converted to an unsigned int (having 32 bits). That's what is making this discussion so difficult, I think. Many people have very different things in mind when they talk about those bit strings.
Johannes Schaub - litb
+16  A: 

Frankly, I think all fff's is more readable. As to the comment that it's an antipattern: if you really care that all the bits are set/cleared, I would argue that you are probably in a situation where you care about the size of the variable anyway, which would call for something like boost::uint16_t, etc.

Doug T.
+1, good point.
j_random_hacker
There are a number of cases in which you don't care that much, but they're rare. E.g. algorithms that work on datasets of N bits by breaking it down into chunks of sizeof(unsigned)*CHAR_BIT bits each.
MSalters
+1. Even if the size of the datatype is bigger than the # of F's (i.e., you didn't quite set all the bits to true), since you're explicitly setting the value, you are at least aware of which bits are "safe to use"..
Mark
+53  A: 

I recommend you do it exactly as you have shown, since it is the most straightforward way. Initialize to -1, which will always work, independent of the actual sign representation, while ~ will sometimes have surprising behavior because you have to have the right operand type. Only then will you get the highest value of an unsigned type.

For an example of a possibly surprise, consider this one:

unsigned long a = ~0u;

It won't necessarily store a pattern with all bits 1 into a. But it will first create a pattern with all bits 1 in an unsigned int, and then assign it to a. What happens when unsigned long has more bits is that not all of those are 1.

And consider this one, which will fail on a non-two's complement representation:

unsigned int a = ~0; // Should have done ~0u !

The reason is that ~0 has to invert all the bits of zero. Inverting zero yields -1 on a two's complement machine (which is the value we need!), but will not yield -1 on another representation. On a one's complement machine, it yields (negative) zero. Thus, on a one's complement machine, the above will initialize a to zero.

The thing you should understand is that it's all about values, not bits. The variable is initialized with a value. If in the initializer you modify the bits of the variable used for initialization, the value will be generated according to those bits. The value you need, to initialize a to the highest possible value, is -1 or UINT_MAX. The second will depend on the type of a - you will need to use ULONG_MAX for an unsigned long. However, the first will not depend on its type, and it's a nice way of getting the highest value.

We are not talking about whether -1 has all bits one (it doesn't always have). And we're not talking about whether ~0 has all bits one (it has, of course).

But what we are talking about is what the result of the initialized flags variable is. And for it, only -1 will work with every type and machine.

Johannes Schaub - litb
why is -1 guaranteed to be converted to all ones? Is that guaranteed by the standard?
jalf
The conversion that happens is that it repeatedly adds one more than ULONG_MAX until it is in range (6.3.1.3 in the C TC2 draft). In C++ it's the same, just formalized another way (modulo 2^n). It all comes down to mathematical relations.
Johannes Schaub - litb
+1, very clear explanation. I was surprised to see that yes, the C++ standard makes a guarantee about the result of converting a negative signed number to an unsigned type, given the number of things that are left as "implementation-defined."
j_random_hacker
instead of `-1`, you could also use `UINTMAX_MAX` from stdint.h as a type-agnostic initialisation value; but one would assume that the programmer knows the number of significant bits of the flags variable, so there's nothing wrong with `0xff...` either
Christoph
Sometimes it's convenient to use the highest value for something special, like std::string::npos, which is defined as static_cast<size_t>(-1). The number of bits is not necessarily needed in such cases.
Johannes Schaub - litb
@litb: casting -1 is certainly a nice way to get maximal unsigned values, but it's not really descriptive; that's the reason why the _MAX constants exist (SIZE_MAX was added in C99); granted, the C++ version `numeric_limits<size_t>::max()` is a bit long-winded, but so is the cast...
Christoph
I think this answer's right in practice, but I can't see an absolute guarantee in the Standard that 2^N - 1 is all bits set for an unsigned type. I'm adding an answer to that effect.
James Hopkin
Beware of ones' complement representation (seen in a few obscure DSP chips.)
finnw
"We are not talking about whether -1 has all bits one (it doesn't always have). And we're not talking about whether ~0 has all bits one (it has, of course)." -- whaat??? I thought the whole point *was* to set all bits to 1. That's how flags work..no?? You look at the *bits*. Who cares about the value?
Mark
@Mark the questioner cares. He asks "Is it safe to use -1 to set all bits to true". This does not ask about what bits `-1` is represented by, nor does it ask what bits `~0` has. We may not care about values, but the compiler does. We can't ignore the fact that operations work with and by values. The *value* of `~0` may not be `-1`, but this is the value you need. See my answer and @Dingo's summary.
Johannes Schaub - litb
@Johannes: Well.. why is the "highest value" guaranteed to have "all bit set" then?
Mark
@Mark because the standard integers are using a pure binary system for counting. If you always count up.. you end at all bits 1 some day :) You can argue an implementation could define the highest unsigned number unequal to `2**n - 1`, but then such an impl would have a problem with implementing the conversion of `-1` to `unsigned`, which precisely is required to result in a value of `2**n - 1` with `n` being the amount of bits in the unsigned type.
Johannes Schaub - litb
Someone posted this answer on twitter yesterday. It was binary day yesterday, date 101010 . I guess it will be more fun next year at 111111 HAHAHA.
Johannes Schaub - litb
+3  A: 

I would not do the -1 thing. It's rather non-intuitive (to me at least). Assigning signed data to an unsigned variable just seems to be a violation of the natural order of things.

In your situation, I always use 0xFFFF. (Use the right number of Fs for the variable size, of course.)

[BTW, I very rarely see the -1 trick done in real-world code.]

Additionally, if you really care about the individual bits in a variable, it would be a good idea to start using the fixed-width uint8_t, uint16_t, uint32_t types.

msemack
+2  A: 

As long as you have #include <stdint.h> as one of your includes, you should just use

unsigned int flags = UINT_MAX;

If you want a long's worth of bits, you could use

unsigned long flags = ULONG_MAX;

These values are guaranteed to have all the value bits of the result set to 1, regardless of how signed integers are implemented.

Michael Norrish
the constants you suggested are actually defined in limits.h - stdint.h contains the limits for the additional integers types (fixed-sized integers, intptr_t,...)
Christoph
+23  A: 

unsigned int flags = -1; is portable.

unsigned int flags = ~0; isn't portable because it relies on a two's-complement representation.

unsigned int flags = 0xffffffff; isn't portable because it assumes 32-bit ints.

If you want to set all bits in a way guaranteed by the C standard, use the first one.

Dingo
How does ~0 (i.e. the one's complement operator) rely on two's complement representation?
Drew Hall
You have this backwards. It's setting flags to -1 that relies on a two's complement representation. In a sign+magnitude representation, minus one has only two bits set: the sign bit and the least significant bit of the magnitude.
Stephen C. Steel
The C standard requires that an int value of zero has its sign bit and all value bits zero. After the one's complement, all those bits are one. The values of an int with all bits set are: sign-and-magnitude: INT_MIN; one's complement: -0; two's complement: -1. So the statement "unsigned int flags = ~0;" will assign whichever value above matches the platform's integer representation. But the -1 of two's complement is the only one that will set all the flags bits to one.
Dingo
@Stephen: Agreed on the representation. But when an int value is assigned to an unsigned int, the unsigned doesn't get its value by adopting the int value's internal representation (except in two's-complement systems where that generally works). All values assigned to unsigned int's are modulo (UINT_MAX + 1), so assigning -1 works no matter the internal representation.
Dingo
@Stephen: dingoatemydonut's right -- surprisingly enough, C++ guarantees this, it's in section 4.7, paragraph 2 of the ISO standard.
j_random_hacker
@Mark There is no intended distinction between "all bits" and "all the flag bits". They are the same. The expression "flags = ~0" will not always work if flags is unsigned. The simplest example is on a one's-complement platform. On such a platform "~x == -x" is true for signed integers. Therefore, "~0 == -0" is true. On that platform, "flags = ~0" is the same as "flags = -0", which in turn is the same as "flags = 0". So on a one's-complement platform "flags = ~0" results in no bits being set.
Dingo
@Dingo: Oh...that clarifies it.. but also makes no sense at all. Not quite sure how "~0 == 0 == no bits set".. that just sounds like an error. The ~ is supposed to invert the bits.. it seems like it's failing to do so??
Mark
@Mark: you're confusing two operations. `~0` yields an `int` value with all bits set, of course. But assigning an `int` to an `unsigned int` does not *necessarily* result in the unsigned int having the same bit pattern as the signed bit pattern. Only with a 2's complement representation is this always the case. On a 1s' complement or sign-magnitude representation, assigning a negative `int` value to an `unsigned int` results in a different bit pattern. This is because the C++ standard defines signed -> unsigned conversion to be the modulo-equal value, not the value with the same bits.
Steve Jessop
@Steve: very nice explanation.
R..
@Steve: Oh.. right. The assignment changes things I guess. Thanks!
Mark
+7  A: 

A way which avoids the problems mentioned is to simply do:

unsigned int flags = 0;
flags = ~flags;

Portable and to the point.

hammar
+2  A: 

On Intel's IA-32 processors it is OK to write 0xFFFFFFFF to a 64-bit register and get the expected results. This is because IA32e (the 64-bit extension to IA32) only supports 32-bit immediates. In 64-bit instructions 32-bit immediates are sign-extended to 64-bits.

The following is illegal:

mov rax, 0ffffffffffffffffh

The following puts 64 1s in RAX:

mov rax, 0ffffffffh

Just for completeness, the following puts 32 1s in the lower part of RAX (aka EAX):

mov eax, 0ffffffffh

And in fact I've had programs fail when I wanted to write 0xffffffff to a 64-bit variable and I got a 0xffffffffffffffff instead. In C this would be:

uint64_t x;
x = UINT64_C(0xffffffff);
printf("x is %"PRIx64"\n", x);

the result is:

x is 0xffffffffffffffff

I thought to post this as a comment to all the answers that said that 0xFFFFFFFF assumes 32 bits, but so many people answered it I figured I'd add it as a separate answer.

Nathan Fellman
+2  A: 

See litb's answer for a very clear explanation of the issues.

My disagreement is that, very strictly speaking, there are no guarantees for either case. I don't know of any architecture that does not represent an unsigned value of 'one less than two to the power of the number of bits' as all bits set, but here is what the Standard actually says (3.9.1/7 plus note 44):

The representations of integral types shall define values by use of a pure binary numeration system. [Note 44:] A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps for the bit with the highest position.

That leaves the possibility for one of the bits to be anything at all.

James Hopkin
In general, we can't be sure about the value of the padding bits. And if we want to set them, then we could be in danger, since we could generate a trap representation (and it could raise signals). However, the standard requires unsigned char to have no padding bits, and 4.7/2 in the C++ standard says that when converting an integer to an unsigned type, the value of the resulting unsigned variable is the smallest value congruent to the source integer value (modulo 2^n, where n is the number of bits in the unsigned type). Then (-1) == ((2^n)-1) (mod 2^n), and 2^n - 1 has all bits set in a pure binary numbering system.
Johannes Schaub - litb
If we *really* want to have all bits 1 in the object representation of an unsigned type, we would need memset. But we could generate a trap representation thereby :( Anyway, an implementation probably has no reason to throw away a bit of its unsigned integers, so it will use it to store values. But you have got a very good point: there is nothing stopping an implementation from having a few silly nonsense bits, I think (apart from in char/signed char/unsigned char, which must not have those). +1 of course :)
Johannes Schaub - litb
In the end, I think the standard could be clearer about which representation it refers to in 4.7/2. If it refers to the object representation, then there is no place for padding bits anymore (I have seen people argue that way, and I don't see anything wrong with it). But I think it talks about the value representation (because everything in 4.7/2 is about values anyway), and then padding bits may sit next to the value bits.
Johannes Schaub - litb
The Standard seems to pretty clearly have '2’s complement, 1’s complement and signed magnitude' representations in mind, but doesn't want to rule anything out. Interesting point about trapping representations too. As far as I can tell, the bit I quoted is the definition of 'pure binary numeration system' as far as the Standard is concerned - the 'except' bit at the end is really my only doubt over whether casting -1 is guaranteed to work.
James Hopkin
+1  A: 

Converting -1 into any unsigned type is guaranteed by the standard to result in all-ones. Use of ~0U is generally bad since 0 has type unsigned int and will not fill all the bits of a larger unsigned type, unless you explicitly write something like ~0ULL. On sane systems, ~0 should be identical to -1, but since the standard allows ones-complement and sign/magnitude representations, strictly speaking it's not portable.

Of course it's always okay to write out 0xffffffff if you know you need exactly 32 bits, but -1 has the advantage that it will work in any context even when you do not know the size of the type, such as macros that work on multiple types, or if the size of the type varies by implementation. If you do know the type, another safe way to get all-ones is the limit macros UINT_MAX, ULONG_MAX, ULLONG_MAX, etc.

Personally I always use -1. It always works and you don't have to think about it.

R..
FWIW, if I mean “all 1 bits” I use `~(type)0` (well, fill in the right `type` of course). Casting zero still results in a zero, so that's clear, and negating all the bits in the target type is pretty clearly defined. It's not that often that I actually want that operation though; YMMV.
Donal Fellows
@Donal, that surely works, but requires knowing the type and writing it out directly. Perhaps this would be better: `var=~(0*var);` since it won't break if the type changes. I still prefer -1.
R..
-1 has the assumption that you're working with twos-complement arithmetic. Almost everything does nowadays, but not all.
Donal Fellows
@Donal: you are simply wrong. C specifies that, when converting a value that does not fit into an unsigned type, the values is reduced modulo 2^n where n is the number of bits in destination type. This applies both to signed values and larger unsigned types. It has nothing to do with twos complement.
R..
@R..: You've described what happens with twos-complement architectures (the overwhelmingly most common) but there's also ones-complement and sign-value, and to expect that those implement sign/bit conversions the same is just unrealistic, no matter what the standard says. After all, it's “just a standard” and compliance is not usually valued as much as supporting old code.
Donal Fellows
And as noted, I use the form from earlier in this exchange. That's not because it is shorter, but because it says “this is working with bits” to me whereas `-1` says “this is working with the value”.
Donal Fellows
They **do** implement it the same, which is trivial; you just use an unsigned subtraction opcode instead of a signed one or a `neg` instruction. Machines which have bogus signed arithmetic behavior have separate signed/unsigned arithmetic opcodes. Of course a really good compiler would just always ignore the signed opcodes even for signed values and thereby get twos-complement for free.
R..
A: 

yes the representation shown is very much correct as if we do it the other way round u will require an operator to reverse all the bits but in this case the logic is quite straightforward if we consider the size of the integers in the machine

for instance, in most machines an integer is 2 bytes = 16 bits; the maximum value it can hold is 2^16-1 = 65535 (2^16 = 65536)

0%65536 = 0, -1%65536 = 65535, which corresponds to 1111.............1, and all the bits are set to 1 (if we consider residue classes mod 65536); hence it is quite straightforward.

I guess

no if u consider this notion it is perfectly fine for unsigned ints and it actually works out

just check the following program fragment

#include <iostream>
#include <cmath>
#include <cstdio>
using namespace std;

int main() {
    unsigned int a = 2;
    cout << (unsigned int)pow(double(a), double(sizeof(a) * 8));
    unsigned int b = -1;
    cout << "\n" << b;
    getchar();
    return 0;
}

answer for b = 4294967295, which is -1%2^32 on 4 byte integers

hence it is perfectly valid for unsigned integers

in case of any discrepancies plzz report

ankit sablok
Two comments: first, you are dead wrong about the size of integers on “most” machines. Secondly, ur txt iz very hard 2 read due 2 ur us of sum kind of seckrit language. Plz us *plain English*.
Konrad Rudolph
A: 

I say:

int x;
memset(&x, 0xFF, sizeof(int));

This will always give you the desired result.

Alex