views:

146

answers:

5

I need to compare two numbers and look for similarities in more significant bits. I'm trying to determine the number of least significant bits that differ.

10111000
10111011

184 and 187 require an offset of two, because only two least significant bits differ.

10111011
11111011

187 and 251 require an offset of seven, because the seventh least significant bit differs.

My first idea was to XOR the numbers together, then bit-shift right until the number equaled zero. I feel like there is a better bit-wise solution to this that doesn't involve loops, but I haven't done enough bit-twiddling of my own to come up with it.

The solution needs to work for any 64 bits, as my numbers are being stored as UInt64. This is being written in C#, but the solution is most likely a language agnostic one.


11101101
11010101

Would need an offset of 6 bits. I'm trying to find how many similar bits I can take off the top.

A: 

Something like

floor( log(184 ^ 187) / log(2) ) + 1

No loop, but might not been faster, because log in a costly operation. You should test it, and compare to a simple loop with bit-shifting.

Sometimes a (well-coded) loop is faster than no-loop, especially if you have at most 64 iterations and often less.


More efficient version of my code :

Pre-compute

double Ilog2 = 1 / log(2);

and then each time you need it

floor( log(184 ^ 187) * ILog2 ) + 1
Loïc Février
+1  A: 

Sounds like you've already spotted the main trick; r = x XOR y, then find the highest bit in r. There is a bunch of different way to solve that problem here. The fastest does it in O(n) operations by splitting r in half and checking if the upper part is zero. If you are doing this on a fixed number of bits (you said 64) then unroll the loops to get a series of tests :

pos = 0
r = x XOR y
if r>>32 == 0 :
   r = r & 2^32-1
else
   pos += 32
   r = r>>32
if r>>16 == 0 :
   r = r & 2^16-1
else
   pos += 16
   r = r>16
... etc
Amoss
A: 

You can write a O(log(n)) loop to find the highest set bit pretty easily:

int findHighestSetBit(unsigned long long x) {
    int rv = 0;
    if (x == 0)
        return -1;  // no set bits
    for (int shift = 32; shift > 0; shift >>= 1) {
        if (x >> shift) {
            rv += shift;
            x >>= shift;
        }
    }
    return rv+1; // number least significant bit as '1' rather than '0'
}

if this is too slow, you can manually unroll the loop 5 times.

Chris Dodd
+1  A: 
#include <stdio.h>
#include <stdlib.h>

#define TO_L(s) (strtol((s), NULL, 16))

int tsb(unsigned long xa, unsigned long xb) {
  unsigned long v = xa ^ xb;
  static const unsigned long b[] = {
    0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000L, 0xFFFFffff00000000L
  };
  static const unsigned int S[]  = { 1, 2, 4, 8, 16, 32 };
  unsigned int r = 0;

#define STEP(i)   \
  if(v & b[i]) {  \
    int t = S[i]; \
    v >>= t;      \
    r  |= t;      \
  }
  STEP(5)
  STEP(4)
  STEP(3)
  STEP(2)
  STEP(1)
  STEP(0)
  return r;
}

int main(int ac, char **av) {
  return printf("%d\n", tsb(TO_L(av[1]), TO_L(av[2]))), 0;
}

I think this implements your algorithm and it's very fast, needing only 6 steps. See this great source of bit twiddling hacks.

so ross$ ./a.out 1f f
4
so ross$ ./a.out 471234abcdabcd 981234abcdabcd
55
so ross$ ./a.out 1deadbeef 7feedface
34
DigitalRoss
A: 

Suppose first you have to do it for 8 bit numbers. the fastest way is a 256 bytes lookup table with precompiled values:

static unsigned char highest_bit_num_LUT[256] = {0, 1, 2, 2, 3, etc }; // precomputed

unsigned diff = (unsigned)a ^ (unsigned)b; // sure you need XOR and not MINUS?
unsigned highest_bit_num = highest_bit_num_LUT[diff & 0xff];

now extending it for higher bit counts:

static unsigned char highest_bit_num_LUT[256] = {0, 1, 2, 2, 3, etc }; // precomputed
unsigned diff = (unsigned)a ^ (unsigned)b; // sure you need XOR and not MINUS?
unsigned highest_bit_num = 0;
for (int i = 7; i >= 0; i--)    
    if (diff >> ( i*8) ){ // found most significant non-zero byte
        highest_bit_num = i*8 + highest_bit_num_LUT[diff >> (i*8)];
        break;
    }

so now we have at most 8 iterations.

EDIT: it would be faster to use DigitalRoss idea for first 3 iterations, and then to use the LUT.

ruslik