views:

887

answers:

6

In 32-bit integer math, the basic operations of add and multiply are implicitly computed mod 2^32: the result you get is the lowest-order 32 bits of the true sum or product.

If you want to compute the result with a different modulus, you could certainly use any number of BigInt classes in different languages. And for values a, b, c < 2^32 you could compute the intermediate values in 64-bit long ints and use the built-in % operator to reduce to the right answer.
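For reference, that baseline approach might look like this in C (`mulmod64` is just an illustrative name, not from any particular library):

```c
#include <stdint.h>

/* Baseline: widen to 64 bits, multiply, then reduce with the built-in %.
   Correct for any modulus c in (0, 2^32), but % costs a full division. */
uint32_t mulmod64(uint32_t a, uint32_t b, uint32_t c)
{
    return (uint32_t)(((uint64_t)a * b) % c);
}
```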

But I've been told that there are special tricks for efficiently computing a*b mod C when C is of the form 2^N - 1 or 2^N + 1. These tricks don't need 64-bit math or a BigInt library, are faster than a reduction by an arbitrary modulus, and correctly handle cases where the intermediate multiplication would overflow a 32-bit int.

Unfortunately, despite hearing that such special cases have a fast evaluation method, I haven't actually found a description of the method. "Isn't that in Knuth?" "Isn't that somewhere on Wikipedia?" are the mumblings I've heard.

It is apparently a common technique in random number generators that compute a*b mod 2147483647, since 2147483647 is a prime equal to 2^31 - 1.

So I'll ask the experts. What's this clever special case multiply-with-mod method that I can't find any discussion of?

+2  A: 

A quick search turned up this: http://home.pipeline.com/~hbaker1/AB-mod-N.pdf. Unfortunately, it's too late for me to make enough sense of that to just write in the simplified formula, but it's probably in that paper somewhere.

Jonathan
The paper uses floating point arithmetic rather than the properties of N to make the calculation efficient. I tend to get a bit nervous around floating point calculations myself, but haven't checked it any deeper than that... it might work well enough.
Pontus Gagge
Fun paper, worth reading! It's a more general method for arbitrary modulus values. It unfortunately converts the values into 64-bit doubles as part of the computation. This may be a very efficient computation in general, but there's some even faster way for the special C = 2^N ± 1 cases. +1 upvote anyway just because it's a great link!
SPWorley
See, that's what I get for trying to find things at 4am....
Jonathan
+2  A: 

Suppose you can compute a*b as p*2^N + q. This can require 64-bit computations, or you can split a and b into 16-bit parts and do the computation in 32 bits.

Then a*b mod 2^N-1 = p+q mod 2^N-1 since 2^N mod 2^N-1 = 1.

And a*b mod 2^N+1 = -p+q mod 2^N+1 since 2^N mod 2^N+1 = -1.

In both cases, there is no division by 2^N-1 or 2^N+1.
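A minimal sketch of both identities (function names are mine, and the split itself uses a 64-bit intermediate for brevity in the 2^31 - 1 case; per the answer, that split could instead be built from 16-bit pieces):

```c
#include <stdint.h>

/* a*b mod 2^31 - 1, for a, b < 2^31 - 1.
   Split a*b = p*2^31 + q, then use 2^31 ≡ 1, so a*b ≡ p + q. */
uint32_t mulmod_mersenne31(uint32_t a, uint32_t b)
{
    const uint32_t M = 0x7FFFFFFFu;            /* 2^31 - 1 */
    uint64_t t = (uint64_t)a * b;
    uint32_t s = (uint32_t)(t >> 31) + (uint32_t)(t & M);  /* p + q */
    return (s >= M) ? s - M : s;               /* one subtraction suffices */
}

/* a*b mod 2^16 + 1, for a, b <= 2^16 - 1.
   Split a*b = p*2^16 + q, then use 2^16 ≡ -1, so a*b ≡ q - p. */
uint32_t mulmod_fermat16(uint32_t a, uint32_t b)
{
    const uint32_t F = 65537u;                 /* 2^16 + 1 */
    uint32_t t = a * b;                        /* fits in 32 bits */
    uint32_t p = t >> 16, q = t & 0xFFFFu;
    return (q >= p) ? q - p : q - p + F;       /* keep the result nonnegative */
}
```

Note that the only operations after the multiply are shifts, masks, adds, and at most one conditional subtraction.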

Eric Bainville
+1  A: 

Rather than doing modular reduction at each step, you can use Montgomery reduction (there are other descriptions) to reduce the cost of modular multiplication calculation. This still doesn't use the properties of N being plus/minus a power of two, though.
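For completeness, here is a sketch of Montgomery multiplication with R = 2^32 (my own illustrative code, restricted to odd moduli below 2^31; the 64-bit intermediates could themselves be assembled from 32-bit pieces):

```c
#include <stdint.h>

/* Montgomery reduction with R = 2^32: given t < n*2^32, returns
   t / 2^32 mod n. ninv is -n^{-1} mod 2^32, precomputed per modulus. */
static uint32_t redc(uint64_t t, uint32_t n, uint32_t ninv)
{
    uint32_t m = (uint32_t)t * ninv;            /* m*n ≡ -t (mod 2^32) */
    uint64_t u = (t + (uint64_t)m * n) >> 32;   /* exact division by 2^32 */
    return (u >= n) ? (uint32_t)(u - n) : (uint32_t)u;
}

/* a*b mod n for odd n < 2^31, via Montgomery form. */
uint32_t mulmod_montgomery(uint32_t a, uint32_t b, uint32_t n)
{
    /* Newton's iteration: inv = n^{-1} mod 2^32 (valid for odd n) */
    uint32_t inv = n;
    for (int i = 0; i < 5; i++)
        inv *= 2u - n * inv;
    uint32_t ninv = 0u - inv;

    /* abar = a*R mod n; then redc(abar*b) = a*b mod n */
    uint32_t abar = (uint32_t)(((uint64_t)a << 32) % n);
    return redc((uint64_t)abar * b, n, ninv);
}
```

In a real use (exponentiation, a PRNG loop) the inverse and the conversion into Montgomery form are computed once and amortized over many multiplications; as the comment says, nothing here exploits n being 2^N ± 1.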

Pontus Gagge
+8  A: 

I think the trick is the following (I'm going to do it in base 10, because it's easier, but the principle should hold)

Suppose you are multiplying a*b mod 10000-1, and

a = 1234 = 12 * 100 + 34
b = 5432 = 54 * 100 + 32

now a*b = 12 * 54 * 10000 + 34 * 54 * 100 + 12 * 32 * 100 + 34 * 32

12 * 54 * 10000 =  648 * 10000
34 * 54 * 100   = 1836 * 100
12 * 32 * 100   =  384 * 100
34 * 32         = 1088

Since x * 10000 = x (mod 10000-1) [1], the first and last terms become 648 + 1088. The second and third terms are where the 'trick' comes in. Note that:

1836 = 18 * 100 + 36
1836 * 100 ≡ 18 * 10000 + 3600 ≡ 3618 (mod 10000-1).

This is essentially a circular shift of the digits. The third term reduces the same way: 384 * 100 = 38400 ≡ 3 + 8400 = 8403 (mod 9999). This gives a total of 648 + 3618 + 8403 + 1088. Note also that every product involved is < 10000 (since each part of a and b is < 100), so the whole computation only requires multiplying two-digit numbers together and adding the results.

In binary, it's going to work out similarly.

Start with a and b, both 32 bits. Suppose you want to multiply them mod 2^31 - 1, but you only have a 16-bit multiplier (giving 32-bit results). The algorithm would be something like this:

 a = 0x12345678
 b = 0xfedbca98
 M = 0x7FFFFFFF                  // 2^31 - 1
 accumulator = 0
 for (x = 0; x < 32; x += 16)
     for (y = 0; y < 32; y += 16)
         // 16-bit * 16-bit = 32-bit partial product
         temp = ((a>>x) & 0xFFFF) * ((b>>y) & 0xFFFF)

         // fold the 32-bit product down to 31 bits, using 2^31 = 1 (mod M)
         temp = (temp & M) + (temp >> 31)
         temp = (temp & M) + (temp >> 31)

         // multiply by 2^(x+y) mod M: a circular shift by (x+y) mod 31
         s = (x + y) % 31
         temp = ((temp << s) & M) | (temp >> (31 - s))

         // accumulate, folding the carry back in
         accumulator = accumulator + temp
         accumulator = (accumulator & M) + (accumulator >> 31)

 // M itself is congruent to 0
 if (accumulator == M)
     accumulator = 0

No division ever happens: every reduction is a mask, a shift, and an add, precisely because 2^31 ≡ 1 (mod 2^31 - 1).

Unrolled, this is pretty fast as well, which is what the PRNGs use, I'm guessing.

[1]: x*10000 ≡ x*(9999+1) ≡ 9999*x + x ≡ x (mod 9999)
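A runnable version of the algorithm above, using only 16×16→32-bit multiplies (everything else is 32-bit adds, shifts, and masks):

```c
#include <stdint.h>

/* a*b mod 2^31 - 1, using only 16x16 -> 32-bit multiplies. */
uint32_t mulmod31(uint32_t a, uint32_t b)
{
    const uint32_t M = 0x7FFFFFFFu;            /* 2^31 - 1 */
    uint32_t acc = 0;
    for (int x = 0; x < 32; x += 16) {
        for (int y = 0; y < 32; y += 16) {
            /* 32-bit partial product of two 16-bit halves */
            uint32_t t = ((a >> x) & 0xFFFFu) * ((b >> y) & 0xFFFFu);

            /* fold to 31 bits twice: 2^31 = 1 (mod M) */
            t = (t & M) + (t >> 31);
            t = (t & M) + (t >> 31);

            /* times 2^(x+y) mod M: circular shift by s = (x+y) mod 31,
               here s is one of 0, 16, 1, 17 */
            uint32_t s = (uint32_t)(x + y) % 31u;
            t = ((t << s) & M) | (t >> (31u - s));

            /* accumulate and fold the carry back in */
            acc += t;
            acc = (acc & M) + (acc >> 31);
        }
    }
    return acc == M ? 0 : acc;                 /* M itself represents 0 */
}
```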
FryGuy
And still not understanding the math behind this is why I dropped the math minor in college...
Jonathan
Well, it's sort of like getting the remainder when dividing by 9 (10-1), where you just add up the digits. In this case, instead of base 10 or base 2, you're working in "base" 2^N.
FryGuy
+1  A: 

The identity you're looking for is x ≡ (x mod 2^q) - c*floor(x/2^q) (mod N), given that N = 2^q + c and c is any small integer (typically ±1).

You may want to read section 9.2.3: "Moduli of special form" in "Prime Numbers: A Computational Perspective" by Richard Crandall and Carl Pomerance. Besides theory, it contains pseudocode for an algorithm implementing the above relation.
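As a sketch of that identity for q = 31, c = -1 (so N = 2^31 - 1), iterated until the value fits in q bits (my own code, not the book's pseudocode):

```c
#include <stdint.h>

/* x mod N for N = 2^31 - 1, by repeatedly applying
   x ≡ (x mod 2^q) - c*floor(x/2^q)  with q = 31, c = -1. */
uint32_t reduce31(uint64_t x)
{
    const uint64_t N = 0x7FFFFFFFu;
    while (x >> 31)                   /* still wider than q bits */
        x = (x & N) + (x >> 31);      /* c = -1, so the term is added */
    return (x == N) ? 0 : (uint32_t)x;
}
```

For c = +1 the high part would be subtracted instead of added, with a conditional add of N to keep the value nonnegative.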

fredrikj
A: 

I have found a rather extensive page on this very topic, discussing not just the algorithm but even the specific history of the problem and solution and the ways people have used the solution.

SPWorley