Multiplication of two n-bit numbers A and B can be understood as a sum of shifts:

 (A << i1) + (A << i2) + ...

where i1, i2, ... are the positions of the bits that are set to 1 in B.

Now let's replace PLUS with OR to get the new operation I actually need:

 (A << i1) | (A << i2) | ...
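
For example, if A = 1011b and B = 101b (bits 0 and 2 of B are set), the result is (1011b << 0) | (1011b << 2) = 1011b | 101100b = 101111b.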

This operation is quite similar to regular multiplication, for which many fast algorithms exist (Schönhage-Strassen, for example). Is there a similar algorithm for the operation I presented here?

The size of the numbers is 6000 bits.

edit: For some reason I have no link/button to post comments (any idea why?), so I will edit my question instead. I am indeed looking for a faster-than-O(n^2) algorithm for the operation defined above. And yes, I am aware that it is not ordinary multiplication.

+1  A: 
Jason S
A: 

I presume you are asking for the name of the additive technique you have given
when you write "Is there a similar algorithm for the operation I presented here?"...

Have you looked at the Peasant multiplication technique?
Please read the Wikipedia description if you do not follow the third column in this example.

 B X  A
27 X  15  : 1
13    30  : 1
 6    60  : 0
 3   120  : 1
 1   240  : 1

B is 27 == binary form 11011b

27x15 = 15 + 30 + 120 + 240
      = 15<<0 + 15<<1 + 15<<3 + 15<<4
      = 405
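
With the asker's OR in place of the additions, the same table gives:

15<<0 | 15<<1 | 15<<3 | 15<<4 = 15 | 30 | 120 | 240 = 255 (11111111b)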

Sound familiar?


Here is your algorithm.

  1. Choose the smaller number as your A
  2. Initialize C as your result area
  3. while B is not zero,
    1. if lsb of B is 1, add A to C
    2. left shift A once
    3. right shift B once
  4. C has your multiplication result (unless you rolled over sizeof C)
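
A minimal C# sketch of that loop, assuming the operands fit in System.Numerics.BigInteger (flip the '+' to '|' to get the asker's shift-OR operation instead):

using System.Numerics;

static BigInteger PeasantMultiply(BigInteger a, BigInteger b)
{
    BigInteger c = 0;             // 2. the result area
    while (b != 0)                // 3. while B is not zero
    {
        if ((b & 1) == 1)         // 3.1 if lsb of B is 1,
            c += a;               //     add A to C (use 'c |= a' for shift-OR)
        a <<= 1;                  // 3.2 left shift A once
        b >>= 1;                  // 3.3 right shift B once
    }
    return c;                     // 4. C holds the result
}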


Update If you are trying to get a fast algorithm for the shift-and-OR operation across 6000 bits,
there might actually be one. I'll think a little more on that.

It looks like 'blurring' one number over the other. Interesting.
A rather crude example here,

110000011 X 1010101 would look like
      110000011
    110000011
  110000011
110000011
---------------
111111111111111

The number of 1s in the two numbers will decide the amount of blurring towards a number with all its bits set.
Wonder what you want to do with it...


Update2 This is the nature of the shift+OR operation with two 6000-bit numbers.

  1. The result will be at most 11999 bits (2n-1; OR-ing shifted copies produces no carries)
  2. the operation can be done with two bit streams, but it need not be carried out to completion
  3. the 'middle' part of the 11999-bit stream will almost certainly be all 1s (provided both numbers have plenty of 1 bits)
  4. the problem is in identifying the depth to which we need to process this operation to get both ends of the stream
  5. the pattern at the two ends of the stream will depend on the longest runs of consecutive 1s present in the two numbers

I have not got to a clean algorithm for this yet. I have updated it for anyone else wanting to recheck or go further from here. Also, describing the need for such an operation might motivate further interest :-)
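
As a quick, unoptimized way to check point 3 empirically (a sketch only, not measured code), one can shift-OR two random 6000-bit BigIntegers and inspect a band in the middle of the result:

using System;
using System.Numerics;

class MiddleOnesCheck
{
    static BigInteger ShiftOr(BigInteger a, BigInteger b)
    {
        BigInteger c = 0;
        for (; b != 0; a <<= 1, b >>= 1)    // the naive O(n^2) loop
            if ((b & 1) == 1) c |= a;
        return c;
    }

    static void Main()
    {
        var r = new Random();
        var bytes = new byte[751];          // 750 bytes = 6000 bits, +1 sign byte
        r.NextBytes(bytes); bytes[750] = 0; // keep the value non-negative
        BigInteger a = new BigInteger(bytes);
        r.NextBytes(bytes); bytes[750] = 0;
        BigInteger b = new BigInteger(bytes);

        BigInteger x = ShiftOr(a, b);
        bool allOnes = true;                // scan a 6000-bit middle band
        for (int i = 3000; i < 9000 && allOnes; i++)
            allOnes = ((x >> i) & 1) == 1;
        Console.WriteLine("middle band all ones: " + allOnes);
    }
}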

nik
Peasant multiplication is a "regular" algorithm for multiplication; it's still O(n^2) for bignum arithmetic, where n is the number of bits.
Jason S
Hmmm, I read your comment on the question now. I did not interpret his question that way. But let me understand this better: if you are doing an OR operation with shifts, can you get any more optimal than the basic shift-OR loop?
nik
@Lukasz, can you elaborate on your question a bit more?
nik
"If you are doing an OR operation, with shifts can you get any more optimal than the basic shift-OR loop?" -- that's exactly his question. The issue is that for large numbers (e.g. 6000 bits), you can't hold them all in the processor's ALU at one time, you have to deal with them one word at a time. Even x << 1 is expensive for a 6000-bit number. So the "dumb" (but simple and reliable) shift-OR loop is O(n^2). The fast multiplication algorithms get around this using lots of theory to achieve O(n log n) or thereabouts.
Jason S
er... I stand corrected: "with shifts can you get any more optimal than the basic shift-OR loop?" -- almost certainly not, at least not with shifts. Again, shifts are cheap for small numbers, they are expensive for large numbers. Something else with more sophisticated math is needed to beat O(n^2).
Jason S
A: 

You can do this in O(#1-bits in A * #1-bits in B) time.

a_bitnums = [x for x in range(A.bit_length()) if (1 << x) & A]
b_bitnums = [x for x in range(B.bit_length()) if (1 << x) & B]

c = 0
for a_bit in a_bitnums:
    for b_bit in b_bitnums:
        c |= 1 << (a_bit + b_bit)

This might be worthwhile if A and B are sparse in the number of 1 bits present (for dense 6000-bit inputs the number of pairs approaches 6000^2 = 36,000,000).

themis
A: 

The best I could come up with is to use a fast out in the looping logic. Combined with the possibility of using the non-zero approach described by themis, you can answer your question by inspecting less than 2% of the N^2 problem.

Below is some code that gives the timing for numbers that are between 80% and 99% zero. Once the numbers get to around 88% zero, themis's approach becomes the better one (it was not coded in the sample below, though).

This is not a highly theoretical solution, but it is practical.

OK, here is some "theory" of the problem space:

Basically, each bit of X (the output) is the OR of the bits on one diagonal of a grid constructed by laying the bits of A along the top (MSB to LSB, left to right) and the bits of B along the side (MSB to LSB, top to bottom). Since a bit of X is 1 if any bit on its diagonal is 1, you can perform an early out on the cell traversal.

The code below does this and shows that even for numbers that are ~87% zero, you only have to check ~2% of the cells. For denser numbers (more 1s), that percentage drops even further.

In other words, I would not worry about tricky algorithms; just do some efficient logic checking. I think the trick is to look at the bits of your output as the diagonals of the grid, as opposed to the bits of A shift-ORed with the bits of B. The trickiest thing in this case is keeping track of which bits of A and B you may look at, and how to index them properly.

Hopefully this makes sense. Let me know if I need to explain this a bit further (or if you find any problems with this approach).

NOTE: If we knew your problem space a bit better, we could optimize the algorithm accordingly. If your numbers are mostly non-zero, then this approach is better than themis's, since his would require more computation and storage (sizeof(int) * NNZ).

NOTE 2: This assumes the data is basically bits, and I am using .NET's BitArray to store and access them. I don't think this would cause any major headaches when translated to other languages; the basic idea still applies.

using System;
using System.Collections;

namespace BigIntegerOr
{
    class Program
    {
        private static Random r = new Random();

        private static BitArray WeightedToZeroes(int size, double pctZero, out int nnz)
        {
            nnz = 0;
            BitArray ba = new BitArray(size);
            for (int i = 0; i < size; i++)
            {
                ba[i] = r.NextDouble() >= pctZero;
                if (ba[i]) nnz++;
            }
            return ba;
        }

        static void Main(string[] args)
        {
            // make sure there are enough bytes to hold the 6000 bits
            int size = (6000 + 7) / 8;
            int bits = size * 8;

            Console.WriteLine("PCT ZERO\tSECONDS\t\tPCT CELLS\tTOTAL CELLS\tNNZ APPROACH");
            for (double pctZero = 0.8; pctZero < 1.0; pctZero += 0.01)
            {
                // fill the "BigInts"
                int nnzA, nnzB;
                BitArray a = WeightedToZeroes(bits, pctZero, out nnzA);
                BitArray b = WeightedToZeroes(bits, pctZero, out nnzB);

                // this is the answer "BigInt" that is at most twice the size minus 1
                int xSize = bits * 2 - 1;
                BitArray x = new BitArray(xSize);

                int LSB, MSB;
                LSB = MSB = bits - 1;

                // stats
                long cells = 0;
                DateTime start = DateTime.Now;

                for (int i = 0; i < xSize; i++)
                {
                    // compare along the diagonal that feeds output bit i;
                    // the loop must include MSB, and the B index runs from
                    // MSB down to LSB as the A index rises
                    for (int bit = LSB; bit <= MSB; bit++)
                    {
                        cells++;
                        x[i] |= (b[LSB + MSB - bit] && a[bit]);
                        if (x[i]) break; // early out: this bit is already 1
                    }

                    // update the window over the bits
                    if (LSB == 0)
                    {
                        MSB--;
                    }
                    else
                    {
                        LSB--;
                    }

                    //Console.Write(".");
                }

                // stats
                TimeSpan elapsed = DateTime.Now.Subtract(start);
                double pctCells = (cells * 100.0) / (bits * bits);
                Console.WriteLine(pctZero.ToString("p") + "\t\t" +elapsed.TotalSeconds.ToString("00.000") + "\t\t" +
                    pctCells.ToString("00.00") + "\t\t" + cells.ToString("00000000") + "\t" + (nnzA * nnzB).ToString("00000000"));
            }

            Console.ReadLine();
        }
    }
}
Erich Mirabal
A: 

Just use any FFT polynomial multiplication algorithm and transform every resulting coefficient that is greater than or equal to 1 into 1.

Example:

10011 * 10001
[1 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0] * [1 x^4 + 0 x^3 + 0 x^2 + 0 x^1 + 1 x^0]
== [1 x^8 + 0 x^7 + 0 x^6 + 1 x^5 + 2 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0]
-> [1 x^8 + 0 x^7 + 0 x^6 + 1 x^5 + 1 x^4 + 0 x^3 + 0 x^2 + 1 x^1 + 1 x^0]
-> 100110011

For an example of the algorithm, check:

http://www.cs.pitt.edu/~kirk/cs1501/animations/FFT.html

BTW, it is of linearithmic complexity, i.e., O(n log n).
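
Here is a rough C# sketch of the same idea that avoids hand-writing an FFT: pack each bit into its own field wide enough that the convolution coefficients cannot carry, let one ordinary BigInteger product do the convolution, then threshold. How sub-quadratic this is depends entirely on the runtime's big-integer multiplication, and the packing loops are kept naive for clarity:

using System.Numerics;

// Spread each input bit into its own w-bit field so the coefficients
// (each at most n) cannot carry into one another, multiply once, then
// turn every non-zero field into a single 1 bit of the result.
static BigInteger OrMultiply(BigInteger a, BigInteger b, int n)
{
    int w = 1;
    while ((1 << w) <= n) w++;            // 2^w > n, so field sums never carry

    BigInteger sa = 0, sb = 0;
    for (int i = 0; i < n; i++)
    {
        sa |= ((a >> i) & 1) << (i * w);  // bit i of a -> field i
        sb |= ((b >> i) & 1) << (i * w);
    }

    BigInteger p = sa * sb;               // coefficient i now sits in field i
    BigInteger mask = (BigInteger.One << w) - 1;
    BigInteger result = 0;
    for (int i = 0; i < 2 * n - 1; i++)
        if (((p >> (i * w)) & mask) != 0) // coefficient >= 1 -> output bit 1
            result |= BigInteger.One << i;
    return result;
}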

Also see:

http://everything2.com/title/Multiplication%2520using%2520the%2520Fast%2520Fourier%2520Transform

e.tadeu
That's what I said in my answer. (Your answer is a little more straightforward; mine is more formal.)
Jason S