views:

337

answers:

5

I am using a C# implementation of Mersenne Twister I downloaded from CenterSpace. I have two problems with it:

  1. No matter how I seed the algorithm it does not pass DieHard tests, and by that I mean I get quite a lot of 1s and 0s for p-value. Also my KStest on 269 p-values is 0. Well, I cannot quite interpret p-value, but I think a few 1s and 0s in the result is bad news.
  2. I have been asked to visually show the randomness of the numbers. So I plot the numbers as they are generated, and this does not seem random at all. Here is two screenshots of the result after a few seconds and a few seconds later. As you can see in the second screenshot the numbers fall on some parallel lines. I have tried different algorithms to map numbers to points. They all result in parallel lines, but with different angles! This is how I mapped numbers to points for these screenshots: new Point(number % _canvasWidth, number % _canvasHeight). As you may guess, the visual result depends on the form's width and height, and this is a disasterous result.

Here is a few ways I tried to seed the algorithm:

  1. User entry. I enter some numbers to seed the algorithm as an int array.
  2. Random numbers generated by the algorithm itself!!
  3. An array of new Guid().GetHashCode()

What am I missing here? How should I seed the algorithm? How can I get it pass the DieHard?

+2  A: 

While I cannot speak to your first point, the second problem has to do with how you are computing the points to draw on. Specifically,

x = number % _canvasWidth;
y = number % _canvasHeight;

will give you a "pattern" that corresponds somewhat to the aspect ratio of the window you are drawing to. For example, if _canvasWidth and _canvasHeight were equal, you would always draw on a single diagonal line as x and y would always be the same. This graphical representation wouldn't be appropriate in this case, then.

What about taking the N bits of the RNG output and using half for the x coordinate and the other half for the y coordinate? For those bits that fall out of the bounds of your window you might want to consider two options:

  1. Don't draw them (or draw them offscreen)
  2. Perform a linear interpolation to map the range of bits to the width/height of your window

Either option should give you a more representative picture of the bits you are getting our of your random number generator. Good luck!

fbrereto
Thanks for the answer fbrereto. Your observation about equal width and height is very true. I do not know how i missed that :)Here is how I started the mapping:var x = (int)(number var y = number x = x % _canvasWidth;y = y % _canvasHeight;This is somehow similar to your solution; however, I think the mod function still skews the result. FYI this mapping caused vertical paralled lines!
Mehdi Khalili
you need to shift your x values down; after the bit mask they will reside in the upper two bytes of your numeric output, which are all going to be big numbers that could skew your mod.
fbrereto
mod will likely always show some bias when you plot the results to the screen. With what you are trying to do, I would recommend not drawing big coordinates (those outside the window) or linear interpolation.
fbrereto
I have done the shift too; but still the coordinates fall far out of the visible area. If we say that average of left two bytes of generated int32 numbers is 0x7FFF it is still so far beyond what I can show in a display with 1280*1024 resolution; and thus the mod.
Mehdi Khalili
You should scale the values from the rng range to the screen range... That's what I meant by linear interpolation. That way every value has a reasonable screen pixel location.
fbrereto
Thid did the trick:var x = number >> 16;var y = number x = (x*_canvasWidth)/0x7fff;y = (y*_canvasHeight)/0xffff;thanks fbrereto
Mehdi Khalili
A: 

True random number generation cannot be done with a mathematical function. If it's important to have truly random numbers, get a hardware random number generator. I've developed real money online poker games—such hardware is the only way to be confident there are no patterns in the numbers.

If targeting a Linux environment, the /dev/random and /dev/urandom pseudo devices do a lot better than a mathematical generator, since they incorporate random numbers representing hardware activity.

wallyk
Thanks for the response. There are so many online gaming sites that are using Mersenne Twister for RNG:http://practice.galewindsoftware.ca/demo/casino/help/certified/RNG_Certification_Galewind_Software.pdfandhttp://www.pkr.com/en/support/licensing-and-integrity/monthly-certificates/
Mehdi Khalili
-1: This is FUD. The Mersenne twister algorithm produces extremely good random numbers. It's the default random number algorithm for Python. Although technically pseudo-random, it has a ridiculous repeat period - larger than the number of particles in the observable universe. Your argument is unfounded.
ire_and_curses
Would you bet money on, say, a blackjack website which used some implementation of the Mersenne Twister? Whether an implementation can be shown correct and honest is much more work than just plugging in a random number generator.
wallyk
I am not really sure about blackjack and complicated casino games; but based on what I have seen on the net MT does the job for most online gaming providers. Here is another example:http://www.backgammonmasters.com/itechlabs/LuckySkill_RNG_Recommendation_230109.pdfAlso iTechLab shows MT on top of their accepted algorithms: http://www.itechlabs.com.au/pages/rngtesting.html
Mehdi Khalili
A: 

Your stripy point-plotting problem should easily be fixed by generating a new random number for each of the x and y coordinates. Trying to reuse a single generated number for x and y is basically premature optimization, but if you do go down that route, make sure you extract different bits for each from the number; as is, x=n%width;y=n%height gives you enormous correlation between x and y, as can be seen in your images.

I've been using various C++ Mersenne Twister implementations for years (most recently boost's) to generate random points and had no difficulties with it (seed related or otherwise). It really is a superb generator.

timday
Using two numbers for plotting sounds like a solution for my second problem, but for some reason it makes me feel guilty! I mean to user, and perhaps auditor, each point represents one number!With your experience with MT, how do you usually seed it? Any comment on dieHard?
Mehdi Khalili
A: 

So here is how I finally got the coordinates (thanks to fbrereto's helps):

    Point GetPoint(int number)
    {
        int x = 0;
        int y = 0;

        for (int c = 0; c < 16; c++)
        {
            x = x << 1;
            x += number & 1;
            number = number >> 1;

            y = y << 1;
            y += number & 1;
            number = number >> 1;
        }

        x = (x*_canvasWidth)/0xffff;
        y = (y*_canvasHeight)/0xffff;
        return new Point(x, y);
    }

Breaking the value into two ushort (that is the two right bytes and two left bytes) would work too, but only for huge ranges. When the RNG returns for example a value between 1 and 10000 then either x or y would always be zero. The above approach however captures every other bit for x and y and thus the value is spread between the two.

Can anyone please help on the DieHard result?

Mehdi Khalili
A: 

Alright, the diehard issue was resolved too. It turns out the diehard tests expect unsigned integers!!

Mehdi Khalili