ansaurus

Question

Fastest way to enumerate through turned on bits of an integer

Answer 1

+1 A:

I guess bit shifting (<<) is the fastest.

SeasonedCoder 2009-05-08 03:29:58

Answer 2

+8 A:

I imagine bit-shifting would be the fastest. Untested, but I think the following ought to be fast (as fast as IEnumerables are at least). It uses a 1-based iterator so you can quickly return the exponent without adding one every time.

IEnumerable<int> GetExponents(Int32 value)
{
    for(int i=0; i<32; i++)
    {
        if(value & 1)
            yield return i;
        value >>= 1;
    }
}

If you want it to be faster, you might consider returning a List<int> instead.

lc 2009-05-08 03:34:10

Can this be done in linq??? int32Value.ForEach(....) ???

Pure.Krome 2009-05-08 04:35:58

I'm not exactly sure what you mean by int32Value but in order to use the ForEach extension method, you have to have an IList. You can call ToList() on an IEnumerable to get one if you'd like. And if you want, you can make the above code an extension method and call myInt.GetExponents().ToList().ForEach(...)

lc 2009-05-08 04:49:28

You can try aggregate in linq.

leppie 2009-05-08 08:54:14

Shouldn't your loop start at zero? ie, The first bit represents exponent 0, the second bit represents exponent 1 and so on. (2^0 = 1, 2^1 = 2, 2^2 = 4 etc).

LukeH 2009-05-08 11:38:27

Right...and that would be me overthinking the problem on not enough sleep. Fixed.

lc 2009-05-10 09:52:49

You could also make this an extension method.

geofftnz 2009-05-10 22:12:41

Yeah that's buried in my comment a few lines above.

lc 2009-05-11 00:41:46

Answer 3

+5 A:

A lookup array for one byte's worth of bits ought to be close to the fastest you can do in safe C# code. Shift each of the 4 bytes out of the integer (casting to uint as necessary) and index into the array.

The elements of the lookup array could be an array of exponents, or, depending on what you're doing with the bits, perhaps delegates that do work.

Barry Kelly 2009-05-08 03:48:31

++ I like this, except that then it becomes an issue of efficiently collecting the results. Each element of the lookup table can be an array of exponents, but then you have to concatenate them, and add 8, 16, and 24 to each array. Not sure how many cycles that would take.

Mike Dunlavey 2009-05-10 21:59:56

Answer 4

A:

Fastest given what distribution for the input? If typically only one bit is set, then this could be faster than looping through looking for set bits.

Taking from the accepted answer for finding the position of the least significant bit, which took from Bit Twiddling Hacks, this solutions loops, finding, clearing and returning the position of each consecutive least-significant-bit.

   static int[] MulDeBruijnBitPos = new int[32] 
    {
      0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
      31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
    };

    static IEnumerable<int> GetExponents(UInt32 v)
    {
        UInt32 m;

        while( v != 0 ) {
          m = (v & (UInt32) (-v));
          yield return MulDeBruijnBitPos[((UInt32) (m * 0x077CB531U)) >> 27];
          v ^= m;
        }
    }

It only loops as many times as there are bits set.

Tony Lee 2009-05-08 04:14:51

A far simpler and clearer solution is to use a small (either 16 or 256 entries) lookup table.

Sinan Ünür 2009-05-08 12:18:47

It's really hard to beat lc's answer for simplicity and clarity.

Tony Lee 2009-05-08 15:44:33

Answer 5

+25 A:

The fastest way? Lookup tables are almost always the fastest way. Build an int[][] array with four billion entries, one for each int, containing an array of the numbers you want. Initializing the table will take some time, of course but the lookups will be incredibly fast.

I note that you have not said what "fastest" means with sufficient precision for anyone to actually answer the question. Does it mean fastest amortized time including startup time, or marginal lookup time assuming that startup costs can be neglected? My solution sketch assumes the latter.

Obviously a 32 bit machine with 2 billion bytes of address space will not have enough address space to store thirty billion bytes of arrays. Get yourself a 64 bit machine. You'll need at least that much physical memory installed as well if you want it to be fast -- the paging is going to kill you otherwise.

I sure hope the couple of nanoseconds you save on every lookup are worth buying all that extra hardware. Or maybe you don't actually want the fastest way?

:-)

Eric Lippert 2009-05-08 06:09:08

lol, great response!

Brannon 2009-05-08 06:17:23

What an arsehole response... but great! :)

Charlie Somerville 2009-05-08 08:55:34

What are you, comicbook guy? Obviously, he wanted the fastest *reasonable* method.

Blorgbeard 2009-05-08 12:16:21

Then someone needs to define "reasonable". Once you define "reasonable", you have a performance _specification_. "Fastest possible" is not a specification. By thinking about what the costs are of the fastest possible implementation, you start to realize that performance is not _ever_ about "fastest possible". It's about obtaining a reasonable performance return on an investment. Performance tuning is expensive; without a realistic spec, you can spend a lot of money trying to reach a level of performance you don't actually need.

Eric Lippert 2009-05-08 17:05:42

Talking of investement.. It comes from the guys who made us upgrade 195683 times in the last 7 years and still pushes on the idea that 'stack is an implementation' detail. eah it is, of a thing called CLR, and 9584 threads running on NT at boot time. How about you explain to the author why your lookup table will be hammering RAM and destroying locality much like all those Object roots do in all your bloated frameworks and produce such a penalty that nothing out there is safe from it. As if bringing in requirements of supercomputers for a listbox, 1 query and button wasn't enough. CrusadersTM

rama-jka toti 2009-05-09 10:38:47

The only table and software I keep upgrading is Google, BigTable and their browser :) Bunch of kids pretty much ran down the lets copy Java mentality of 1998 MSFT to date. That tells a lot and it is not getting better not doing look up tables and avoiding native infrastructure, stacks and more. It scales and Saves The World Some Blotware Energy TM. They scaled from 2 to 20,000 employees and still batter your attempts which is pretty remarkable. OTOH, my app scaled-down from 20000 to 2, same piece on WPF vs NT3.51. Incredible... Good investment?

rama-jka toti 2009-05-09 10:50:27

Huge lookup tables are quite frequently not the fastest way to implement stuff. Since we almost certainly get a cache miss a lookup will need in the order of hundred(s) of cycles.Any of the other proposed methods beats that easily.

Accipitridae 2009-05-09 10:56:17

But he was trying to be sarcastic :) Of course that the fastest is fastest in runtime cost and when it scales to problem domain and functional requirement foremost; trade-off for best of both worlds. But that always includes stack, precious cache, SIMD and MIMD and more that bloggers keep trying to 'abstract' and still can't even JIT or do in PFX properly. The way modern and managed abstract thinking scales turns negative, rapidly and *exactly* in what sarcasm was attempted at.

rama-jka toti 2009-05-09 11:06:20

You know, lookup for reference is a 'key', the one msdn blogs pushes so happily since 2003. Everyone with a bit of brain can see that clearly and it is getting worse and worse.. Making us upgrade 300MB frameworks, 6GB OS-es, RAM and machines doesn't work any longer does it..

rama-jka toti 2009-05-09 11:07:10

@Majkara: Sure it does. Memory is still pretty much following Moore's Law, whereas clock speeds aren't. It makes sense to trade more and more memory for speed in everyday scenarios.

mquander 2009-05-13 15:16:38

Wow, nearly every comment/answer by Tito I've read has been some inane rant about Java/.NET/"modern is bad"/"managed code sucks". You really have a chip on your shoulder about this, don't you?Bah, humbug.

JulianR 2009-05-20 19:39:04

@Majkara Tito: HIs lookup table won't hammer RAM if he builds the lookup table in hardware.

Brian 2009-06-03 13:37:12

Answer 6

+2 A:

Just for fun, here's a one-liner using LINQ.

It's certainly not the fastest way to do it, although it's not far behind the other answers that use yield and iterator blocks.

int test = 42;

// returns 1, 3, 5
//   2^1 + 2^3 + 2^5
// =   2 +   8 +  32
// = 42
var exponents = Enumerable.Range(0, 32).Where(x => ((test >> x) & 1) == 1);

For a speedier solution I would probably return a plain collection rather than using an iterator block. Something like this:

int test = 2458217;

// returns 0, 3, 5, 6, 9, 15, 16, 18, 21
//   2^0 + 2^3 + 2^5 + 2^6 + 2^9 +  2^15 +  2^16 +   2^18 +    2^21
// =   1 +   8 +  32 +  64 + 512 + 32768 + 65536 + 262144 + 2097152
// = 2458217
var exponents = GetExponents(test);

// ...

public static List<int> GetExponents(int source)
{
    List<int> result = new List<int>(32);

    for (int i = 0; i < 32; i++)
    {
        if (((source >> i) & 1) == 1)
        {
            result.Add(i);
        }
    }

    return result;
}

LukeH 2009-05-08 10:21:52

+1 for the one-liner version

Tony Lee 2009-05-08 17:05:39

Answer 7

A:

Apologies for the C implementation. I am going to learn C# one of these days

I make no statement that this is the 'fastest'. For that, you are going to have to benchmark a variety of solutions with the type of data you expect and the usage scenario you plan.

#include <stdio.h>


int numbits(unsigned int n) {
    static int lookup_table[] = {
        /* 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f */
           0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    };

    return 
        lookup_table[ (n & 0xF0000000) >> 28 ] +
        lookup_table[ (n & 0x0F000000) >> 24 ] +
        lookup_table[ (n & 0x00F00000) >> 20 ] +
        lookup_table[ (n & 0x000F0000) >> 16 ] +
        lookup_table[ (n & 0x0000F000) >> 12 ] +
        lookup_table[ (n & 0x00000F00) >>  8 ] +
        lookup_table[ (n & 0x000000F0) >>  4 ] +
        lookup_table[ (n & 0x0000000F) ];
}

int main(void) {
    int i;

    unsigned int testvals[] = {
        0,
        0x01010101,
        0xf0f0f0f0,
        0xffffffff,
        0x11111111,
        0x22222222,
        0x44444444,
        0x99999999,
        0x5aa5a55a,
        0xdeadbeef,
    };

    for ( i = 0; i < sizeof testvals/sizeof testvals[0]; ++i ) {
        printf( "%8.8x\t%2d\n", testvals[i], numbits(testvals[i]));    
    }

    return 0;
}

/*
    C:\Temp> cl /nologo /O1 bits.c
    bits.c

    C:\Temp> bits
    00000000         0
    01010101         4
    f0f0f0f0        16
    ffffffff        32
    11111111         8
    22222222         8
    44444444         8
    99999999        16
    5aa5a55a        16
    deadbeef        24
*/

Sinan Ünür 2009-05-08 12:42:55

This just counts the number of bits that are set. It doesn't return the exponents.

LukeH 2009-05-08 13:14:32

I completely missed that in the problem description. Apologies and thank you for pointing it out.

Sinan Ünür 2009-05-08 17:22:35

Answer 8

+4 A:

The IEnumerable is not going to perform. Optimization of some examples in this topic:

First one (fastest - 2.35 seconds for 10M runs, range 1..10M):

static uint[] MulDeBruijnBitPos = new uint[32] 
{
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

static uint[] GetExponents(uint value)
{
    uint[] data = new uint[32];
    int enabledBitCounter = 0;

    while (value != 0)
    {
        uint m = (value & (0 - value));
        value ^= m;
        data[enabledBitCounter++] = MulDeBruijnBitPos[(m * (uint)0x077CB531U) >> 27];
    }

    Array.Resize<uint>(ref data, enabledBitCounter);
    return data;
}

Another version (second fastest - 3 seconds for 10M runs, range 1..10M):

static uint[] GetExponents(uint value)
{
    uint[] data = new uint[32];
    int enabledBitCounter = 0;

    for (uint i = 0; value > 0; ++i)
    {
        if ((value & 1) == 1)
            data[enabledBitCounter++] = i;
        value >>= 1;
    }

    Array.Resize<uint>(ref data, enabledBitCounter);
    return data;
}

taoufik 2009-05-09 07:52:56

+1 for actually benchmarking

Tony Lee 2009-05-09 16:27:28

Answer 9

A:

If you won't choke on a little C++:

 void func(int i, int& n, int a[]){
  n = 0;
  if (i < 0) a[n++] = 31;
  i <<= 1;
  if (i < 0) a[n++] = 30;
  i <<= 1;
  if (i < 0) a[n++] = 29;
  i <<= 1;

  ...

  if (i < 0) a[n++] = 0;
}

Mike Dunlavey 2009-05-10 21:48:27

ansaurus

tags:

views:

answers:

Fastest way to enumerate through turned on bits of an integer

related questions