views:

401

answers:

6

An interesting problem I ran into today: what is the fastest way to count the number of 1s in an n-bit integer? Is it possible to beat O(n)?

For example:

42 = 0b101010 => 3 ones
512 = 0b1000000000 => 1 one

Clearly, the naive algorithm is to simply count. But, are there any tricks to speed it up?

(This is merely an academic question; there is no expected performance gain by implementing such a strategy.)

+2  A: 

If you have a finite number of bits (eg 32 bits) you can precalcualte it and then just look up the value in an array.

A slightly more practical way is to do this for each byte or word (only takes 256/64k bytes) and then add the results for each byte/word in the value

Martin Beckett
Except it would be hard with 32 bits… Assuming you used 1 bit per value… Would require 512 megs of storage :( Byte/word precalculation would be better.
David Wolever
David, you can't have 1 bit per value if you need to store a bitcount from 0 through 32 inclusive so it's worse than that.
paxdiablo
Works pretty well 8 or 16 bits at a time though.
hobbs
+16  A: 

See the fabulous Bit Twiddling Hacks article.

Ates Goral
+1 excellent reference for this sort of thing.
Stephen Canon
Great article - thanks very much.
carl
You're welcome. We should thank Delicious and its bookmark tagging capability for allowing me to pull out the link in a jiffy!
Ates Goral
+4  A: 

Probably the fastest way on x86 processors would be to use the POPCNT class of instructions.

bdonlan
If they have POPCNT, that is; only the most recent Intel processors do.
Stephen Canon
+3  A: 

The fastest way (without using special processor features or storing pre-calculated answers) is to AND your value with value - 1 in a loop until it's 0. The number of iterations is the number of 1's.

On Freund
You mean 2^n, don't you?
Basilevs
I'm not sure I follow. Where do you think I mean 2^n?
On Freund
This is only "the fastest way" when you know that the number of 1 is low. If the number of 1's is high, this method becomes extremely slow. The best way of solving this problem by actually *counting* the 1's is the well-known shift-and-add method that does the job in `log n` operations guaranteed. (Of course, in practice a byte-wise table-based method might prove the best, which is probably what you call "pre-calculating")
AndreyT
Actually, this algorithm's worst case running time is log n, just like the constant running time of the naive approach. As I mentioned in my answer, the number of iterations equals the number of 1's, and since there cannot be more than log n 1's, and each iteration takes constant time, the running time cannot be more than log n.
On Freund
@On Freund: You missed the fact that in this thread `n` stands for the number of bits in the number (see the OP), not for the number itself. Your algorithm takes O(n) time, while shift-and-add approach takes O(log n).
AndreyT
If you refer to the number of bits as n, then your approach (shift and add) will take O(n) just like mine. How exactly do you achieve lower running time than O(n) with shift and add? You need n shifts...
On Freund
@On Freund: By shift-and-add I meant not the "naive" approach, but rather the "Counting bits set, in parallel" approach from here http://graphics.stanford.edu/~seander/bithacks.html
AndreyT
"Counting bits set, in parallel" does use pre-calculation, namely the B array.
On Freund
@On Freund: No. The `B` array is just a bunch of bit masks. There's absolutely no reason to store these flags in an array, just like there's no reason to store shifts in `S` array. I don't know why they did it this way. To call this "pre-calculation" is the same as to call constants `-1` and `0` in your solutuioin as "pre-calculation".
AndreyT
The size of the storage for the bit masks (be it an array or anything else) depends on n (O(n*logn)), quite a difference from the two constants in "my" solution.Calling B "just a bunch of bit masks" is equivalent to calling the lookup table "just a bunch of numbers".
On Freund
A: 

Just a link for a method posted on gowrikumar.com

bbg
A: 

O(log n), if you don't go beyond machine words and disregard the fact that each machine operation operates on n bits.

In practice you should use library functions instead of twiddling the bits yourself, for example Integer.bitCount() in Java.

starblue