What is the fastest way to read the Left-Most bit from unsigned int ?


+5  A: 

On a 32-bit system using any compiler that a normal human would use, without odd, esoteric edge cases that C++ geeks can't seem to avoid getting excited about, planet Earth, circa 2010, stars unaligned, a normal day:

if (value & 0x8000000) { ... }
Ben M
That rather assumes the size of int :-).
Douglas Leeder
What if he's on a 16-bit CPU?
Sorry, sorry, I made a 99.9% assumption. Answer is now qualified.
Ben M
Stephen Canon
I don't consider the millions of embedded devices as "edge cases." I don't consider your answer worth a downvote, though.
San Jacinto
I doubt there will ever be a time (as long as C++ and C are widely used) that all common systems agree on the size of an `int`.
David Thornley
@San Jacinto, maybe you missed _On a 32-bit system_, i.e., not a 16- or 8-bit embedded device.
Ben M
+1 for disclaimer. :D
Furthermore - while I agree that Tyler McHenry's answer is the most comprehensive in terms of its applicability to any platform, I think it's far less readable than this one (or the 16- or 8-bit equivalents). How much code is really ported between targets having different sizes of `int`, and how relevant is that going to be to a test like this, which in all likelihood is performed against data of some canonical length, i.e. 16- or 32-bits regardless of the platform it is ported to? Would any of _you_ use Tyler's construct in your code?
Ben M
@Ben in fact, I would use Tyler's. The fact is that at work, we periodically (twice annually?) port code between linux-based ARM systems and 8051, and sometimes to Rabbit micros. It's not that you do it often, it's that you'll do it at all. If you don't plan for it right away, you're in trouble when the time comes.
San Jacinto
@Ben Actually, I just read your "disclaimer" as saying anything other than a 32-bit system is an "edge-case." Doesn't matter anyway, I didn't downvote :)
San Jacinto
"value" is of unknown type so 0x80000000 is a bit to "generic". Better to create a signed type of the same type as value and set it to -1 which is 0x8... and use it as a mask (templates can come in handy).
+18  A: 
i >> (sizeof(unsigned int) * CHAR_BIT - 1)

The sizeof, multiplication, and subtraction will be computed at compile-time by any reasonable compiler, so this should become a single right-shift instruction, which is about as fast as you will get.

Tyler McHenry

Well that depends on the size of int, endianness, and whether you want the left most bit based on orientation within memory or the most significant bit.

Guessing that you want the most significant bit, which on a little endian, 32 bit machine (the most common kind), would be in the fourth byte from the ints location in memory.:

i & 0x80000000

Now the register is a boolean for the presence of the MSB bit.

It might be possible to bit shift to the right, this may generate less object code:

i >> (sizeof(i) * CHAR_BIT - 1)


Maybe some aren't aware of what I mean above. On the example architecture I gave you could also do this:

((unsigned char *)&i)[3] >> 7
((unsigned char *)&i)[3] & 0x80
Matt Joiner
"25th bit from the left in physical memory" ??? I'm assuming that was a tongue-in-cheek joke. Also, neither byte-order nor bit-order matter here. The OP's (imprecise) use of the word "left-most bit" when applied to an int has nothing to do with the memory storage of that int.
Not at all. The MSB bit is stored in the 4th byte on a little endian machine. So the MSB is either the 25th bit, or the 32nd bit, depending on how bits are arranged with bytes (but this doesn't matter as single bytes are atomic).
Matt Joiner
"as single bytes are atomic"... I was also hoping the 25th bit bit was tongue-in-cheek as well. What if the bits are arranged in a circle, physically? Is it my left, or the computers left? (After all, my computer faces me!)
Yes, very funny... Perhaps I went overboard with the explanation, but the fact remains the OP used the term "left most bit", rather than "most significant bit".
Matt Joiner
+7  A: 

Might be faster than shifting at run-time, if AND is faster than shifting:

i & (1 << (sizeof(unsigned int) * CHAR_BIT - 1))
Douglas Leeder
Note that this requires a large constant, which needs to be embedded in either your code or constant data. On many ISA's, shifts over fixed lengths can be encoded with a single instruction. So even thought the AND _instruction_ itself may be faster, the memory/cache overhead will likely render it slower.
True; the only way (if you really cared!) would be to benchmark the two approaches on your target platform. But I doubt it matters in actual programs. Chose whichever approach you find clearest.
Douglas Leeder
+3  A: 
#define MSB ~(~0U >> 1 )

breaking it down assuming 8 bits int for example purposes.

0U = 00000000b
~0U = 11111111b
~0U >> 1 = 01111111b
~(~0U >> 1) = 10000000b

then AND this value with what you want to test ( cast as unsigned int )

(unsigned int ) a & MSB;
The behavior of `>>` on a negative value is implementation-defined, and `~0` is negative. Change that `0` to `0U` and you're fine, since there are no negative unsigned numbers.
David Thornley
Ta for that. Fixed!
+2  A: 

You could of course also try this:

((int) myUint) < 0 // true if set, otherwise false

And use the fact, that for signed integers, the left most bit is set for negative numbers.

This should also be pretty fast, since the cast is a compile-time thing, and does not really have to be executed - it just tells the compiler to use the signed opcodes as opposed to the unsigned ones. So I believe a single instruction (CMP?) needs to be executed...

Daren Thomas
+1 but... This is faster, or as fast, as all the other solutions but it does assume you're on a CPU that uses the left most bit as the sign bit. Probably a safe assumption.

How about this?

i >> numeric_limits<unsigned>::digits - 1;
Assumes `numeric_limits<unsigned>::radix == 2`. Which is likely true everywhere.
Can it be other than 2? If so, non of the other answers are correct. Anyway, radix != 2 is so unnatural, that it is safe to assume that all C++ implementations use radix=2. Right?