views:

555

answers:

3

Joel mentioned counting the number of set bits in a byte as a programming question in his Guerrilla Guide to Interviewing, and talked of a way to take advantage of patterns that occur in the lookup table. I wrote an article about it awhile back after I found the pattern.

To summarize:

Number of bits set in a byte in 16x16
0   1   1   2   1   2   2   3   1   2   2   3   2   3   3   4  
1   2   2   3   2   3   3   4   2   3   3   4   3   4   4   5  
1   2   2   3   2   3   3   4   2   3   3   4   3   4   4   5  
2   3   3   4   3   4   4   5   3   4   4   5   4   5   5   6  
1   2   2   3   2   3   3   4   2   3   3   4   3   4   4   5  
2   3   3   4   3   4   4   5   3   4   4   5   4   5   5   6  
2   3   3   4   3   4   4   5   3   4   4   5   4   5   5   6  
3   4   4   5   4   5   5   6   4   5   5   6   5   6   6   7  
1   2   2   3   2   3   3   4   2   3   3   4   3   4   4   5  
2   3   3   4   3   4   4   5   3   4   4   5   4   5   5   6  
2   3   3   4   3   4   4   5   3   4   4   5   4   5   5   6  
3   4   4   5   4   5   5   6   4   5   5   6   5   6   6   7  
2   3   3   4   3   4   4   5   3   4   4   5   4   5   5   6  
3   4   4   5   4   5   5   6   4   5   5   6   5   6   6   7  
3   4   4   5   4   5   5   6   4   5   5   6   5   6   6   7  
4   5   5   6   5   6   6   7   5   6   6   7   6   7   7   8

The first row and column are exactly the same, and each position in the grid can be calculated by adding the first values in that position's row and column. Because of this, you only need a lookup table with 16 entries for an 8-bit number, and can just use the first 16 numbers. Then, if you wanted to count the set bits in the number 243, for example, you'd just do:

a = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
x = 243 / 16 => 15 # (int)
y = 243 % 16 => 3

a[x] + a[y] => 6

# Are there six bits set in the number 243?
243 = 11110011 # yep

The next pattern I noticed after that was that each time you double the size of the NxN grid, each quadrant could be calculated by adding 0, 1, 1, and 2 to each quadrant, respectively, like so:

# Make a 4x4 grid on the paper, and fill in the upper left quadrant with the values of the 2x2 grid.  
# For each quadrant, add the value from that same quadrant in the 2x2 grid to the array.  

# Upper left quad add 0 to each number from 2x2  
0   1   *   *  
1   2   *   *  
*   *   *   *  
*   *   *   *  

# Upper right quad add 1 to each number from 2×2  
0   1   1   2  
1   2   2   3  
*   *   *   *  
*   *   *   *  

# Lower left quad add 1 to each number from 2×2  
0   1   1   2  
1   2   2   3  
1   2   *   *  
2   3   *   *  

# Lower right quad add 2 to each number from 2×2  
0   1   1   2  
1   2   2   3  
1   2   2   3  
2   3   3   4

Repeat this process two more times, and you'll get the 16x16 grid from above, so I figured there must be some sort of quadtree algorithm that would allow you to start from the grid:

0 1
1 2

and given a number N, generate the lookup table on the fly and figure out the number of bits. So my question/challenge is, can you figure out an algorithm to do just that?

A: 

This is a silly question! In the first example where you've computed the number of bits set using a 16-entry table instead of 256 isn't anything magical! All you've done is count the number of bits set in the first four bits of the byte (first nibble) and then in the second nibble, adding the two together. x/16 is the first nibble, x%16 is the second nibble.

If you repeat the process, now you have a lookup table for two bits and you just do it four times, once for each pair. In the extreme, you can just add all the bits together one-by-one and you get the obvious answer.

The whole point of a lookup table is to avoid the addition.

Eyal
The reason I did it that way is that you only use 2^(N/2) lookup table entries for an N-bit number. For 32-bit numbers, it's the difference between 65,536 and 4,294,967,296 entries. If space is a consideration, it's a valid tradeoff.
Chris Doggett
Fair enough, but it doesn't require analysis of the lookup matrix. It seems easier to realize that you can lookup 8 bits at a time.
Eyal
A: 

Based on Robert's code here, it can even be done without the division or modulus, replacing them with one shift and one AND, like so:

a = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
x = 243 >> 4 # 15 (same as dividing by 16)
y = 243 & 0x0f # 3 ( same as modding by 16)

result = a[x] + a[y] # 6 bits set

Or in C:

const unsigned char oneBits[] = {0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4};

unsigned char CountOnes(unsigned char x)
{
    unsigned char results;
    results = oneBits[x&0x0f];
    results += oneBits[x>>4];
    return results
}

For any size integer, you could just loop through the bytes and do a quick lookup, like so:

def bits(n)
    a = [0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4]
    a[n >> 4] + a[n & 0x0f]
end

def setBits(n)
   total = 0
   while(n > 0)
       total += bits(n&0xff)
       n >>= 8
   end
   total
end

setBits(6432132132165432132132165436265465465653213213265465) # 78 bits set

I'm satisfied with this answer. I knew something more complex and quadtree-esque wouldn't be efficient, I just thought it was a decent thought experiment.

Chris Doggett
A: 

Excuse the late post, but I just found the challenge. My $.02 (brute force)

Private Sub Button1_Click(ByVal sender As System.Object, _
                          ByVal e As System.EventArgs) Handles Button1.Click

    For x As Integer = 0 To 255
        Debug.WriteLine(bitsOn2(CByte(x)) & " " & Convert.ToString(x, 2).PadLeft(8, "0"c))
    Next

End Sub

Private Function bitsOn(ByVal aByte As Byte) As Integer
    Dim aBit As Byte = 1
    For z As Integer = 0 To 7
        If (aByte >> z And aBit) = aBit Then bitsOn += 1
    Next
End Function

Dim aDict As New Dictionary(Of Integer, Integer)
Private Function bitsOn2(ByVal aByte As Byte) As Integer
    If aDict.Count = 0 Then 'init dictionary
        For x As Integer = 0 To 255
            aDict.Add(x, bitsOn(CByte(x)))
        Next
    End If
    Return aDict(aByte)
End Function
dbasnett

related questions