ansaurus

Question

Finding N contiguous zero bits in an integer to the left of the MSB position of another integer

Answer 1

+1 A:

This http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious has several ways to calcuate the unsigned integer log base 2 of an unsigned integer (which is also the position of the highest bit set).

I think that this is part of what you want. I suspect that if I really knew what you want I could suggest a better way of calculating it or something that served the same purpose.

nategoose 2010-05-07 23:04:14

I've edited the question several thousand times now and hopefully the code comments, the rewording of the question, and the inclusion of a (slightly) better test case will contribute to your better understanding of the problem - which, I'm starting to believe I'll get zero answers on and have wasted several hours trying to clarify something which already works!

James Morris 2010-05-08 01:23:23

Answer 2

+1 A:

count_leading_zero_bits is often a single instruction that the compiler will provide an inline function for. otherwise put it in a loop.

count_trailing_zero_bits can use count_leading_zero_bits(x&-x) or a debruijn lookup if the former is a loop.

for simplicity I assume 32 bit values.

int offset_of_zero_bits_over_msb_of_other_value( unsigned width , unsigned value , unsigned other ) {
  int count = 0;
  int offset = -1;
  int last = 1;
  int lz = count_leading_zero_bits( other );
  other |= ((1<<(32-lz2))-1); // set all bits below msb
  if ( value & ~other ) {
    value |= other; // set all bits below msb of other
    value = ~value; // invert so zeros are ones
    while ( value && count < width ) {
      count += 1; // the widest run of zeros
      last = value; // for counting trailing zeros
      value &= value >> 1; // clear leading ones from groups
    }
    offset = count_trailing_zero_bits( last );
  } else {
    count = lz2;
    offset = 32 - lz2;
  }
  return ( count < width ) ? -1 : offset;
}

The idea behind the code is this:

  val1:  00000000000000000000000010000000 ( 128 )
  val2:  01000001000100000000100100001001 ( 1091569929 )
  lz1:   24
  lz2:   1
  val2:  01000001000100000000100011111111 // |= ((1<<(32-lz1))-1);
  val2:  10111110111011111111011100000000 // = ~val2
  val2:  00011110011001111111001100000000 // &= val2>>1 , count = 1
  val2:  00001110001000111111000100000000 // &= val2>>1 , count = 2
  val2:  00000110000000011111000000000000 // &= val2>>1 , count = 3
  val2:  00000010000000001111000000000000 // &= val2>>1 , count = 4
  val2:  00000000000000000111000000000000 // &= val2>>1 , count = 5
  val2:  00000000000000000011000000000000 // &= val2>>1 , count = 6
  val2:  00000000000000000001000000000000 // &= val2>>1 , count = 7
  val2:  00000000000000000000000000000000 // &= val2>>1 , count = 8

So at each step, all ranges of zeros, now ones, are shrunken from the right. When the value is zero, the number of steps taken is the width of the widest run. At any point, counting the number of trailing zeros will give the offset to the nearest range of at least count zeros.

If at any point count exceeds width, you can stop iterating. The maximum number of iteration is thus width, not the word size. You could make this O(log n) of width, because you can double the shift amount at each iteration as long as you do not exceed width.

Here is a DeBruijn lookup for counting trailing zero bits for 32 bit values.

static const int MultiplyDeBruijnBitPosition[32] = {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};
r = MultiplyDeBruijnBitPosition[((uint32_t)((v & -v) * 0x077CB531U)) >> 27];

I noticed that in both your examples, val1 had only a single bit set. If that is the case, you can use the DeBruijn trick to find the MSB.

drawnonward 2010-05-08 04:23:23

this code gives very different results to my code - for simplicity i just replaced each call to `count_leading_zero_bits` with the 2 line MSB finding `while` loop from mine. am i right in assuming `value` relates to `var1` in and `other` relates to `var2` from my code? also, can you state if this finds regions of zeros contained within set bits?

James Morris 2010-05-08 18:20:40

Is your msb loop destructive? In your original posting it shifts the input value. I added what the expected value would be at each iteration of the loop.

drawnonward 2010-05-09 02:36:17

val1 having only a single bit set is an artificiality imposed to make finding regions of zero bits more likely.

James Morris 2010-05-09 09:23:21

+1 for effort, and, causing me to think hard enough to come to a solution. Thanks.

James Morris 2010-05-09 16:13:52

I have always liked bit manipulation. What did you come up with as a final solution?

drawnonward 2010-05-09 16:47:30

Answer 3

A:

Here is my new and improved algorithm:

int test_fit_within_left_of_msb(  unsigned width,
                                  unsigned val1,
                                  unsigned val2 )
{
    int offset = 32;
    int msb = 0;
    unsigned mask;
    unsigned b;

    msb = 32 - __builtin_clz(val1); /* GCC builtin to count Leading Zeros */

    while(offset - width > msb)
    {
        mask = (((unsigned)1 << width) - 1) << (offset - width);
        b = val2 & mask;

        if (!b)
            return 32 - offset;

        offset = __builtin_ctz(b); /* GCC builtin to Count Trailing Zeros */
    }

    return -1;
}

This code has a lot of improvements over my initial implementation. Primarily, the removal of the inner while loop by simply counting the trailing zero bits. Secondly, I have also made the algorithm work with an offset that uses the natural bit position values and thus removed some of the addition and subtraction operations my original used, until the successful return statement. You could nit pick over subtracting the offset from 32.

The important point here in the code, is the algorithm - I realize there are portability concerns and assumptions made about types and there sizes. Looking back up the page to the output where width 8 can be found at position 12 performed in 10 iterations, the new alogirthm does the same in 2 iterations of the loop.

I've used the GCC builtins for convenience here, the MultiplyDeBruijnBitPosition code that drawonward provided ( from: http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup ) can be used to replace __builtin_ctz, while __bultin_clz could be replaced with one of the integer log base 2 codes from the same page.

One concern here though, is the data (with sparsely set bits) I've used to test this with makes this algorithm perform better, this will maybe not be so great looking at integers with more densely set bits. (Not correct - by counting the trailing zeros it avoids this bad case).

James Morris 2010-05-09 14:09:50

Answer 4

A:

After implementing my previous answer but to work for the right of the MSB, I saw that aside from a very minor difference, the left and right versions were exactly the same. This lead to the realization there is no real requirement for the algorithm to be working with an MSB from some prior value at all.

So although this answer does not meet the specifications of the question, it is the correct answer because the specification was incorrect.

#include<stdint.h>

/* returns bit position within a 32bit integer, where
   a region of contiguous zero bits can be found whose
   count is equal to or greater than width. it returns
   -1 on failure.
*/

int binary_width_fit( unsigned width, uint32_t val )
{
    int offset = 32;
    uint32_t mask;
    uint32_t b;

    while(offset >= width)
    {
        mask = (((uint32_t)1 << width) - 1) << (offset - width);
        b = val & mask;
        if (!b)
            return offset;
        offset = __builtin_ctz(b); /* GCC builtin to Count Trailing Zeros */
    }
    return -1;
}

James Morris 2010-05-09 16:02:57

This is not so great with small widths. Improved version uses GCC builtin to count the leading ones before applying the mask to further reduce the loop's iteration count.

James Morris 2010-05-10 00:34:59

I wish you had put that comment in the original question.

nategoose 2010-05-10 15:57:32

@nategoose: do you mean the comment about there being no real requirement for working within the left of some MSB rather than the whole range of bits? If so, I can add it as an edit, and if you come up with a better solution I'll be happy to accept it if it is indeed better... When I asked the question I'd not figured everything out, it seems like it's taken quite a while to get to this stage, where I realize constraining the process to one side of the MSB (etc).

James Morris 2010-05-10 18:26:21

@James Morris: The in code comment was what explained what you wanted to me, even though you swapped which end of the word you were working from. BTW, you could do `int offset = 8*sizeof(unsigned long);` and make this take advantage of larger native word sizes and be portable, if having more bits is useful.

nategoose 2010-05-10 21:24:40

Answer 5

A:

1 (fast) method is to use precalulated LOOKUP TABLES (LUTs) for each 8bit byte:

PosOfFirst1, PosOfLast1, PosOfFirst0, PosOfLast0 -- all arrays of 256 bytes

Precalculate the tables using: (soz for poor, pascalish pseudocode)

PosOfLast1:

FOR EACH ByteVal (0..255):

if byteVal>127 return 8 elseif byteVal>63 return 7 ... elseif byteVal>0 return 1 else return 0

PosOfFirst1:

c:=0; while c<8 do begin bv = byteVal and 1; if bv=1 then return c else byteval shr 1;
inc (c); end;

I use simple assembler procs for these algs. PosOfFirst0 and PosOfLast0 LUTs can be pre-calced using these 2 tables also - as can TRAILING & LEADING 0 (or 1) count. It's useful to pre-calc 'minus 1' versions of these tables too....

You can then use (for 8bit bytes) var InputByte: Byte; FirstBit:=PosOfFirst1[InputByte] // v.fast

For larger sizes (0, 16, 24, 32 +++++) use procs and loops that check each constituent 8bit byte. Memory access to the LUT may be needed but this method is still faster:

a) Can be used easily without needing a procedure call. b) scanning a 32 bit number takes 1 shift & comparison to 0 per byte with 1 lookup needed (if a non-zero byte is found) instead of n (0..32) shifts, ands and comparisons... c) if programmed well will halt after finding 1st/last 1

The LUT principle applies to 'population count' + other bit manip. routines...

Cheers, PrivateSi

FASTER IS BETTER?!

PrivateSi 2010-08-08 13:55:59

ansaurus

tags:

views:

answers:

Finding N contiguous zero bits in an integer to the left of the MSB position of another integer

related questions