views:

423

answers:

3

Suppose a bit sequence of size M, and another bit sequence of size N, with M >> N. Both M and N can be saved inside integer arrays: If N has a length of 30 then an array with only one integer will be needed, but if N has a length of 300 then an array with 10 integers will be needed to store it.

What I am trying to do is to shift N inside M, and for each possible position k inside M to find the number of differences (by XORing) between N and M(k). If M has 10000 bits and N has 100 bits then there are 10000-100=9900 positions in which an XOR comparison will be performed.

Are you aware of a library that could do that or maybe propose an algorithm ? I know that it can be done with many other ways however I believe that the fastest possible method is the one proposed here. If you can think of a faster way then I'm open to suggestions !

I'd prefer something in C or C++ but other languages, even pseudocode are also acceptable.

Thanks in advance.

+1  A: 

Simple approach:

while N < M * (original N) do
  compute and tally up M xor N
  multiply each word (unsigned) in N by 2,   // i.e. shift left 1 bit
    and add in the carry (= overflow) from the previous word.

Modern CPUs are powerful enough that even for 10,000 and 100 bits this should only take a few milliseconds.

To "compute and tally up M xor N",

sum = 0
for (i=0; i<M/8; i++)
   if M[i] != 0
      w = M[i]
      while w != 0
      if ((w & 1) != 0) sum++   // test LSB
      w /= 2                    // shift right 1 bit

There are lots of ways to optimize this. There are lots of digits that will be 0 most of the time, and you can recognize and ignore these... but the above algorithm should get you started.

Carl Smotricz
I am having difficulties in understanding your algorithm! In "compute and tally up M xor N", where do you use N ? When shiftin N left wouldn't there be problems with the integer limits ?Also, what is the condition N < M * (original N) ?
Serafeim
I was at work and hoping to fire off a quick but useful response before I got interrupted. Sorry if my hints were a bit too vague. I'm now going back to the drawing board for a much more detailed response. Back in a few!
Carl Smotricz
+1  A: 

Here's a complete and working solution. Minor sloppiness left as an exercise for the reader :)

#include "stdio.h"
#include "stdlib.h"
#include "string.h"

#define M_SIZE 100
#define N_SIZE 25
#define bitsToBytes(n) ((n + 7)/8)

typedef unsigned char byte;

void dumpBytes(byte *arr, size_t size) {
   int b;
   for (b=0; b<size; b++) {
      printf("%02x", *arr++);
   }
   printf("\n");
}

int main(int argc, char *argv[]) {

   byte M[bitsToBytes(M_SIZE)], N[bitsToBytes(N_SIZE)];

   /* Fill M and N with random bits */

   int b;

   for (b=0; b<sizeof(M); b++) {
      M[b] = 0xff & rand();
   }
   for (b=0; b<sizeof(N); b++) {
      N[b] = 0xff & rand();
   }

   /* Create a couple of arrays big enough for M_SIZE + N_SIZE, 
      to allow shifting N all the way before the left of M. */
   #define MN_SIZE (M_SIZE + N_SIZE)
   #define MN_BYTES (bitsToBytes(MN_SIZE))
   byte MM[MN_BYTES], NN[MN_BYTES];

   /* Zero out MM, NN, then copy M, N there (right justified). */
   int offset = sizeof(MM) - sizeof(M);
   memset (MM, 0, sizeof(MM)); memcpy(MM+offset, M, sizeof(M));
   offset = sizeof(NN) - sizeof(N);
   memset (NN, 0, sizeof(NN)); memcpy(NN+offset, N, sizeof(N));

   dumpBytes(MM, MN_BYTES);
   /* Count up "difference bits" until NN has been left-shifted into oblivion. */
   int s;
   for (s=0; s<MN_SIZE; s++) {
      int sum = 0;
      for (b=0; b<MN_BYTES; b++) {
  int xor = MM[b] ^ NN[b];
  while (xor != 0) {
     sum += (xor & 1);
     xor >>= 1;
  }
      }
      dumpBytes(NN, MN_BYTES);
      printf("Shift: %4d; bits: %3d.\n", s, sum);
      /* shift NN one bit to the left */
      for (b=0; b<MN_BYTES; b++) {
  NN[b] <<= 1;
  if (b < (MN_BYTES - 1) && ((NN[b+1] & 0x80) != 0)) NN[b] |= 1;
      }
   }
}
Carl Smotricz
The algorithm is very simplistic; for 10000 and 100 bits, it ran in 0.48 seconds for me.
Carl Smotricz
+1  A: 

You can either shift N up over M or shift M down into N. When shifting N you'll also need to shift the mask if the input doesn't align with word size. The shifts could be cached into an array with length of word size in bits, but given that shifting across multiple words is 1 instruction per word (if you use the RCR instruction) it might not be worth the risk of thrashing the L1 cache.

Unless you can count on a Core i7 processor with the POPCNT instruction the most significant part will be the bit counting anyway. See this page for fast implementations of bit counting.

For smaller lengths of N (in machine words) you'll get large increases of speed by special casing the inner loop. For N <= 192 bits on a processor with SSE4.2 it should be able to run the two innermost loops with keeping everything in registers. My rusty ASM shows me 14 live registers with innermost loop (of shifting over 64 bit locations) length of 20 instructions with another 5 for reading in the next word from the input).

Ants Aasma