views:

635

answers:

4

I need to generate a large list of random numbers from 600k to 2000k, but the list can not have duplicates.

My current 'implementation' looks like this:

<?php
    header('Content-type: text/plain');
    $startTime = microtime(true);
    $used = array();
    for ($i=0; $i < 600000; ) { 
        $random = mt_rand();
        //if (!in_array($random, $used)) {
        $used[] = $random;
        $i++;
        //}
    }
    $endTime = microtime(true);
    $runningTime = $endTime - $startTime;
    echo 'Running Time: ' . $runningTime;
    //print_r($used);
?>

If I keep the in_array test commented the processing time is around 1 second, so the mt_rand calls and the used array filling are relatively 'cheap' but when I uncomment the in_array test bad things happens! (I'm just waiting -it's been more then 10 minutes- for the script to terminate...)

So I'm looking for alternatives either on the duplicate detection side or in the generation part (How could i generate random numbers without the risk of getting duplicates)

I'm open to any suggestion.

+8  A: 

For a quick/dirty solution, does using/checking array keys improve your speed at all?

$used = array();
for ($i = 0; $i < 600000; ) { 
    $random = mt_rand();
    if (!isset($used[$random])) {
        $used[$random] = $random;
        $i++;
    }
}
$used = array_values($used);
Matt Huggins
Thanks! The difference in running time is very big! Even when running the loop 2000k times. It's lightning fast!
Cesar
+1  A: 

If you do the looping anyways and if you don't need more than 600000 why would you check them at all, why not just append $i to $random. done. not random enough?

for ($i = 0; $i < 600000; $i++)
{
    $yourArray[] = mt_rand() . $i; 
}

Furthermore there is the array function array_unique, which removes duplicate values from an array.

tharkun
+1  A: 

in_array requires to search the whole array in the worst case, that means linear costs (O(n)). But using the array key as – well – the key, the costs are constant (O(1)) since the costs for array access is always constant.

Gumbo
+1  A: 

You could for example do something like this instead

$random = mt_rand();

$array = range($random, $random + 600000);

$array = shuffle($array);

That would create a array that first is in order, but then it shuffles the array, so the values will be random. No collisions! :D

Artheus