views:

102

answers:

5

Hi, I am trying to randomly choose from e.g. 4 numbers. I need to compare the probability of these 2 algorithms.

1#

                int a = random.Next(0, 4);

                if (a = 0)
                    statement1
                if (a = 1) 
                    statement2
                if (a = 2)
                    statement3
                if (a = 3) 
                    statement4

2#

                int a = random.Next(0, 1000)

                if (a < 250)
                    statement1
                if (a >= 250 && a < 500) 
                    statement2
                if (a >= 500 && a < 750)
                    statement3
                if (a >= 750) 
                    statement4

Am I right if I think that it is the same ? The probability of statement1 in the first code is 1/4 and in the second code it is 250/1000 so it’s 1/4 too. But someone has told me when I use bigger range of random numbers like in code 2# it’s statistically more accurate. I’ve made project which repeats many times those codes, but I’m not sure it shows me some results.

+3  A: 

They are exactly equivalent (except for the fact that the first one won't compile due to using = instead of == in the if-clauses).

To prove this, look at the implementation of Random.Next(int, int). With your values, Random.Next(0, 4) is

(int) (Random.Sample() * 4)

and

Random.Next(0, 1000) is

(int) (Random.Sample() * 1000)

, where Random.Sample() is a private method that returns a random double.

It should now be easy to see that Random.Next(0, 4) will return 0 exactly when Random.Next(0, 1000) will return a number between 0 and 250.

Rasmus Faber
It could compile, but it would certainly not do what you would want it to.
Live
@Live, that is not true in c#. It will not compile and will produce the compiler error: "cannot implicitly convert type 'int' to 'bool'"
Kirk Woll
Great going for the proof approach.
Thomas Ahle
+2  A: 

Pseudorandom numbers should be evenly distributed no matter what the range is. If, in your second example, if you just choose the last 4 bits (a & 3), you will get the same distribution as if you choose the next 4 with (a>>2) & 3. I.e. what you are algorithmically doing in the second example using ranges, is discarding a lot of the information the random generator has given you. You get no more "randomness" with a larger range.

Having said this, pseudorandom generators do have their idiosyncracies, but unless you are serious about this it's not worth worrying about!

Sanjay Manohar
A: 

The distribution is uniform and it's easy to verify:

public class Program
{
    static void Main(string[] args)
    {
        var random = new Random();
        const int iterations = 10000000;

        var hits1 = 1.0 * Enumerable.Range(1, iterations)
                                     .Select(i => random.Next(0, 4))
                                     .Where(i => i == 0).Count();
        Console.WriteLine(hits1 / iterations);

        var hits2 = 1.0 * Enumerable.Range(1, iterations)
                                     .Select(i => random.Next(0, 1000))
                                     .Where(i => i < 250)
                                     .Count();
        Console.WriteLine(hits2 / iterations);
    }
}
Darin Dimitrov
A: 

My tests are as follows

Out of a 10K loop 2 tests was run with a range 1-4 and a range 1-1000, heres the results

1-4

  1 > 2484 times
  2 > 2519 times
  3 > 2511 times
  4 > 2487 times

0 - 1000

  1 - 250    > 2421 times
  250 - 500  > 2531 times
  500 - 750  > 2529 times
  750 - 1000 > 2490 times

my conclusion is that they make no difference what so ever, you have to get into matrix's and so forth to have some control over random number generation and so forth.

Note: my tests was done with PHP and source code is below.


<?php

$first = array(1=>0,2=>0,3=>0,4=>0);
$second = array('0 - 250' => 0, '250 - 500' => 0, '500 - 750' => 0,'750 - 1000' => 0);

for($i=0;$i<=10000;$i++)  //10K
{
    //First
    $f_number = rand(1,4);
    switch($f_number)
    {
        case 1: $first[$f_number]++; break;
        case 2: $first[$f_number]++; break;
        case 3: $first[$f_number]++; break;
        case 4: $first[$f_number]++; break;
    }

    //Second
    $s_number = rand(1,1000);
    if($s_number < 250) $second['0 - 250']++;
    if($s_number > 250 && $s_number < 500) $second['250 - 500']++;
    if($s_number > 500 && $s_number < 750) $second['500 - 750']++;
    if($s_number > 750) $second['750 - 1000']++;
}

var_dump($first,$second);
?>
RobertPitt
-1 It is too large an assumption to say that the implementation of the php random number implementation is identical to that used by C#
Pete Kirkham
A: 

Cozzy is totally true...

Ondí
Do you mean `Boolean cozzy = Boolean.TRUE;` ??
Sanjay Manohar