ansaurus

Question

Help me improve this C++ bit-buffer processing code

Answer 1

A:

 uint32_t temp = oldBuffer ^ newBuffer;
 uint32_t number = 0;
 uint32_t bitmask=1;
 while (temp != 0)
 {
     if (temp & 0x1)
     {
         processNumber(number, ((newBuffer & bitmask) != 0));
     }
     number++;
     temp = temp >> 1;
     bitmask <<=1;
 }
 oldBuffer = newBuffer;

2 super small changes...

your code was already quite efficient

Toad 2009-08-20 17:07:14

Answer 2

+8 A:

uint32_t temp=oldBuffer^newBuffer, ntemp=newBuffer;
for(int b=0;temp;++b,temp>>=1,ntemp>>=1)
    if(temp&1) processNumber(b,ntemp&1);

Michael Krelin - hacker 2009-08-20 17:13:25

+1, Some spaces would go a long way to improving the readability ;D

sixlettervariables 2009-08-20 17:30:29

Nice - one (extremely) small thing: the test for `b<32` is unnecessary. Stopping the loop when `test` goes to 0 is enough (dunno if a compiler would be smart enough to figure that out at optimization time).

Michael Burr 2009-08-20 17:31:22

Lou Franco 2009-08-20 17:40:14

Michael Krelin - hacker 2009-08-20 17:43:19

Lou, the number of shifts is the same, but the amount of bits differs.

Michael Krelin - hacker 2009-08-20 17:46:15

Shifting 1 and shifting N should be equivalent -- both are just hardwired into the silicon, right? Is there a fast shiftRightOne instruction? I think if there is improvement, it's in not hoping the optimizer will deal with the bitValue variable. Anyway, the code is simpler -- so, in that way -- definitely an improvement.

Lou Franco 2009-08-20 17:52:18

Lou may well have somethign there, but we sure are delving into some micro-optimizations (but sometimes that's just plain good recreation, even if it's not great engineering).

Michael Burr 2009-08-20 18:07:01

Lou, it depends on the architecture, but in general you're right. I do not think that the original code needs any optimization for performance reasons. But I do understand why Rich's aesthetics are hurt when looking at the `1<<number` code ;-) So, basically, you original comment was correct — it's not much of a performance boost.

Michael Krelin - hacker 2009-08-20 18:48:23

Answer 3

+4 A:

You could possibly gain some performance by using one or more of the 'bit twiddling' hacks on

http://graphics.stanford.edu/~seander/bithacks.html

Specifically, the algorithms to 'find the integer log base 2' (posisiton of the highest bit set). This would let you determine which bits are set more directly than looping over each bit.

If you must process the bits in in low to high order, you can use a slight modification of Kernighan's methods to counts bits:

/* note: untested code */
while (temp) {

    uint32_t bit = temp & (~(temp & (temp - 1)); /* isolate lowest bit */

    temp &= ~bit;

    uint32_t bit_number = /* use find log base 2 hack */;

    /* etc... */

}

This should make the while loop iterate exactly the number of times equal to the number of set bits. Your original loop will iterate a number of times equal to the bit position of the highest set bit.

However, I'd be astonished if this made any measurable difference unless this was a super-critical bit of code.

Michael Burr 2009-08-20 17:15:11

Answer 4

+1 A:

It kind of depends on what you expect the distribution of (oldBuffer ^ newBuffer) will be. If it's totally random and the full range of 32 bits, then you have an average of 16 loops.

One possible solution is to make a table like this

int lookup[255][8] = {
  { -1, -1, -1, -1, -1, -1, -1, -1 }, // 0 has no bits set
  {  0, -1, -1, -1, -1, -1, -1, -1 }, // 1 has only the 0th bit set
  {  1, -1, -1, -1, -1, -1, -1, -1 }, // 2 has only the 1st bit set
  {  0,  1, -1, -1, -1, -1, -1, -1 }, // 3 has the 0th, 1st bit set
  {  2, -1, -1, -1, -1, -1, -1, -1 }, // 4 has only the 2nd bit set
  ...
  {  0,  1,  2,  3,  4,  5,  6,  7 }, // 255 has all bits set
}

With this you have to loop 4 times (1 for each byte) and then 1 time for each bit that is set (average of 4) -- Hey, that's 16.

But if the number of set bits is low (averages a lot less than half the 32 bits), then the table lookup will go down.

Unfortunately, table lookup adds a multiply and add each time you use it, so it's not necessarily good. You have to test it. In other words, it finds the set bits in constant time, but the constant might be larger than the loop. It depends on how many set bits you expect to have.

Lou Franco 2009-08-20 17:18:40

Definitely an interesting approach. Now maybe I'ld throw some code at it to generate the lookup table, because I'm not going to be the one filling in the dots in the middle :)

xtofl 2009-08-20 18:10:52

Definitely -- generate the table. It's not only tedious, but error-prone.

Lou Franco 2009-08-20 19:09:27

Was about to suggest this...nice one

Justicle 2009-08-24 06:01:20

Answer 5

+2 A:

How about using the standard library? No need to shift, or, and etc... to test for bits to be true. Testing for bits in a bitset is guaranteed to be constant time. And it writes a lot cleaner and more understandeable.

const std::bitset<32> oldbits( oldBuffer );
const std::bitset<32> newbits ( newBuffer );

for( size_t index = 0; index != oldbits.size(); ++index ) {
   if( oldbits[ index ] != newbits[ index ] ) {
       processNumber( index, newbits[ index ] )
   }
}

Note: you don't need the XOR here, either, since you have direct access to the bits. You may use it, though, to save performance.

xtofl 2009-08-20 17:49:16

Answer 6

A:

You can do it recursive B-Tree-like :)

go(oldBuffer ^ newBuffer, 16, 0, newBuffer);
...

void go(temp, half, pos, bitValue)
{
    if (half > 1) {
      uint32_t mask = (1 << half) - 1;
      if (temp & mask)    go(temp & mask, half/2, pos, bitValue & mask);
      temp >>= half;
      if (temp & mask)    go(temp & mask, half/2, pos + half, (bitValue >> half) & mask);
    } else {
      if (temp & 1) processNumber(pos, bitValue&1);
      if (temp & 2) processNumber(pos+1, bitValue/2&1);
    }
}

zxcat 2009-08-24 05:42:18

ansaurus

tags:

views:

answers:

Help me improve this C++ bit-buffer processing code

related questions