views:

1798

answers:

6

George Marsaglia has written an excellent random number generator that is extremely fast, simple, and has a much higher period than the Mersenne Twister. Here is the code with a description:

good C random number generator

I wanted to port the CMWC4096 code to Java, but it uses several unsigned datatypes so I am not sure how to do this properly. Here is the full C code:

/* choose random initial c<809430660 and */
/* 4096 random 32-bit integers for Q[]   */
static unsigned long Q[4096],c=362436;

unsigned long CMWC4096(void) {
    unsigned long long t, a=18782LL;
    static unsigned long i=4095;
    unsigned long x,r=0xfffffffe;
    i = (i+1) & 4095;
    t = a*Q[i] + c;
    c = (t>>32);
    x = t + c;
    if (x < c) {
        x++;
        c++;
    }
    return (Q[i] = r - x);
}

Can anyone port this to Java? How does this work when you only have signed numbers available?

EDIT: Thanks everybody for the quick answers! For the first 100 million numbers this java code seems to produce the same result as the C code. It is 3 times faster than Java's java.util.Random.

public class ComplimentaryMultiplyWithCarryRandom {

    /**
     * Choose 4096 random 32-bit integers
     */
    private long[] Q;

    /**
     * choose random initial c<809430660
     */
    private long c = 362436;

    private int i;

    public ComplimentaryMultiplyWithCarryRandom() {
        Random r = new Random(1);
        Q = new long[4096];

        // TODO initialize with real random 32bit values
        for (int i = 0; i < 4096; ++i) {
            long v = r.nextInt();
            v -= Integer.MIN_VALUE;
            Q[i] = v;
        }
        i = 4095;
    }

    int next() {
        i = (i + 1) & 4095;
        long t = 18782 * Q[i] + c;
        c = t >>> 32;
        long x = (t + c) & 0xffffffffL;
        if (x < c) {
            ++x;
            ++c;
        }

        long v = 0xfffffffeL - x;
        Q[i] = v;
        return (int) v;
    }
}
+1  A: 

Just as a quick point of reference that may (or may not) help you, I found this link:

http://darksleep.com/player/JavaAndUnsignedTypes.html

Joel Marcey
+1  A: 

You can use signed numbers provided the values don't overflow...for example long in java is a 64 bit signed integer. However the intent in this algorithm seems to be to use a 64 bit unsigned value, and if so I think you would be out of luck with the basic types.

You could use the multiprecision integers provided in the java class libraries (BigInteger). Or you could implement your own 64 bit unsigned type as an Object containing two java longs to represent the least significant and most significant words (but you'd have to implement the basic arithmetic operations yourself in the class).

frankodwyer
Impressively old link. There is no java.lang.Bignum class, I assume that it became java.math.BigInteger before Java 1.1 went final.
Dan Dyer
Whoops - you're right! That shows you how long it is since I needed to use MP ints. I'll fix that now.
frankodwyer
+3  A: 

To get around Java's lack of unsigned types you usually store numbers in a bigger variable type (so shorts get upgraded to ints, ints to long). Since you're using long variables here, you're going to have to step up to BigInteger, which will probably wreck any speed gains that you're getting out of the algorithm.

Bill the Lizard
On the other hand, classes that use the RNG will, of course, also use Java's signed types otherwise, so if the full unsigned long range is required, the RNG will probably not be the only place with a speed problem.
Henning
+5  A: 

If you are implementing an RNG in Java, it is best to sub-class the java.util.Random class and over-ride the protected next(int) method (your RNG is then a drop-in replacement for java.util.Random). The next(int) method is concerned with randomly-generated bits, not what vales those bits might represent. The other (public) methods of java.util.Random use these bits to construct random values of different types.

Dan Dyer
+12  A: 

Can anyone port this to Java? How does this work when you only have signed numbers available?

No Stress! a=18782 so the largest t could ever be is not large enough to cause signed vs. unsigned problems. You would have to "upgrade" the result of using Q to a value equal to a 32-bit unsigned number before using it anywhere. e.g. if Q is an int (32-bit signed) then you'd have to do this before using it in the t=a*Q[i]+c statement, e.g.

t=a*(((long)Q[i])&0xffffffffL)+c

where this (((long)Q[i])&0xffffffffL) business promotes Q[i] to a 64-bit # and ensures its high 32 bits are 0's. (edit: NOTE: you need 0xffffffffL here. Java does the wrong thing if you use 0xffffffff, it seems like it "optimizes" itself to the wrong answer & you get a negative number if Q[i]'s high bit is 1.)

You should be able to verify this by running the algorithms in C++ and Java to compare the outputs.

edit: here's a shot at it. I tried running it in C++ and Java for N=100000; they both match. Apologies if I used bad Java idioms, I'm still fairly new to Java.

C++:

// marsaglia2003.cpp 

#include <stdio.h>
#include <stdlib.h> // for atoi

class m2003
{
    enum {c0=362436, sz=4096, mask=4095};
    unsigned long Q[sz];
    unsigned long c;
    short i;

public:
    m2003()
    {
     // a real program would seed this with a good random seed
     // i'm just putting in something that makes the output interesting
     for (int j = 0; j < sz; ++j)
      Q[j] = j + (j << 16);
     i = 4095;
     c = c0;
    }

    unsigned long next()
    {
     unsigned long long t, a=18782LL;
     unsigned long x;
     unsigned long r=0xfffffffe;
     i = (i+1)&mask;
     t=a*Q[i]+c;
     c=(unsigned long)(t>>32);
     x=(unsigned long)t + c;
     if (x<c)
     {
      x++;
      c++;
     }
     return (Q[i]=r-x);
    }
};

int main(int argc, char *argv[])
{
    m2003 generator;
    int n = 100;
    if (argc > 1)
     n = atoi(argv[1]);

    for (int i = 0; i < n; ++i)
    {
     printf("%08x\n", generator.next());
    }
    return 0;
}

java: (slower than compiled C++ but it matches for N=100000)

// Marsaglia2003.java

import java.util.*;

class Marsaglia2003
{
    final static private int sz=4096;
    final static private int mask=4095;
    final private int[] Q = new int[sz];
    private int c=362436;
    private int i=sz-1;

    public Marsaglia2003()
    {
     // a real program would seed this with a good random seed
     // i'm just putting in something that makes the output interesting
     for (int j = 0; j < sz; ++j)
      Q[j] = j + (j << 16);
    }

  public int next() 
    // note: returns a SIGNED 32-bit number.
    // if you want to use as unsigned, cast to a (long), 
    // then AND it with 0xffffffffL
    {
     long t, a=18782;
     int x;
     int r=0xfffffffe;
     i = (i+1)&mask;
     long Qi = ((long)Q[i]) & 0xffffffffL; // treat as unsigned 32-bit
     t=a*Qi+c;
     c=(int)(t>>32); 
        // because "a" is relatively small this result is also small

     x=((int)t) + c;
     if (x<c && x>=0) // tweak to treat x as unsigned
     {
      x++;
      c++;
     }
     return (Q[i]=r-x);
    }

    public static void main(String args[])
    {
     Marsaglia2003 m2003 = new Marsaglia2003();

     int n = 100;
     if (args.length > 0)
         n = Integer.parseInt(args[0]);
     for (int i = 0; i < n; ++i)
     {
      System.out.printf("%08x\n", m2003.next());
     }
    }
};
Jason S
+10  A: 

Most of the time there is no need to use larger numeric types for simulating unsigned types in Java.

For addition, subtraction, multiplication, shift left, the logical operations, equality and casting to a smaller numeric type it doesn't matter whether the operands are signed or unsigned, the result will be the same regardless, viewed as a bit pattern.

For shifting to the right use >> for signed, >>> for unsigned.

For signed casting to a larger type just do it.

For unsigned casting from a smaller type to a long use & with a mask of type long for the smaller type. E.g., short to long: s & 0xffffL.

For unsigned casting from a smaller type to an int use & with a mask of type int. E.g., byte to int: b & 0xff.

Otherwise do like in the int case and apply a cast on top. E.g., byte to short: (short) (b & 0xff).

For the comparison operators < etc. and division the easiest is to cast to a larger type and do the operation there. But there also exist other options, e.g. do comparisons after adding an appropriate offset.

starblue
Good summary. Although you forgot one critical operation: casting to larger types. (e.g. cast a 32-bit to 64-bit #.) You need to AND the result with a mask so that it interprets the original as "unsigned".
Jason S
starblue