Hi,

I'm starting with an array of 10K values. I want to randomly pick 1000 values from it and put them into another array.

Right now, I am using a for loop to get the values, but I want to pick 1000 values without having to loop 1000 times. The array_slice function works, but it doesn't give random values. What is the correct (most efficient) function for this task?

The code right now is:

$seedkeys = (...array.....);

for ($i = 0; $i < 1000; $i++) {
    $random = array_rand($seedkeys);
    $randseed[$i] = $seedkeys[$random];
}

TIA

A: 

You could use array_rand() to get multiple items at once:

$random_keys = array_rand($seedkeys, 1000);
shuffle($random_keys);

This will give you an array of random keys, so to get an array of values you need to do something like this:

$result = array();
foreach ($random_keys as $rand_key) {
    $result[] = $seedkeys[$rand_key];
}

You could instead use array_intersect_key(), which keeps the original keys and the original element order of $seedkeys:

$result = array_intersect_key($seedkeys, array_flip($random_keys));
Tom Haigh
Thanks Tom, the array intersect works as advertised, but only when I have a smaller array. The large arrays return funky repeating values/keys due to a memory problem (the same problem as with my original loop method).
jamex
A: 

Well, there are a few alternatives. I'm not sure which is the fastest since you're dealing with a sizable array, but you may want to try them out:

You can use shuffle, which will randomize the entire array. This will likely have the best performance since you're consuming a significant portion of the array (10%).

shuffle($seedkeys);
$result = array_slice($seedkeys, 0, 1000);
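Note that shuffle() reorders $seedkeys in place and replaces its keys with new numeric ones. If the original array must stay untouched, something along these lines might work. This is only a minimal sketch, assuming PHP 7 or later; the helper name sample_by_shuffle is made up for illustration. It shuffles a copy, so it costs one extra copy of the array in memory.

```php
<?php
// Hypothetical helper (not from the thread): returns $count random values
// without modifying the caller's array.
function sample_by_shuffle(array $source, int $count): array
{
    // $source is passed by value, so shuffle() only reorders this local copy.
    shuffle($source);

    // Take the first $count elements of the shuffled copy.
    return array_slice($source, 0, $count);
}
```

It would be called as $result = sample_by_shuffle($seedkeys, 1000);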

You could use array_rand (as you already said) in the manner that Tom Haigh specifies. This will require copying the keys, so if you're dealing with a significant portion of the source array, this may not be the fastest. (Note the use of array_flip, which is needed so that array_intersect_key can match on keys):

$keys = array_flip(array_rand($seedkeys, 1000));
$result = array_intersect_key($seedkeys, $keys);

If memory is tight, the best solution (besides the MySQL one) would be a loop since it doesn't require arrays to be copied at all. Note that this will be slower, but if the array contains a lot of information, it may offset the slowness by being more memory efficient (since it only ever copies exactly what it returns)...

$result = array();
for ($i = 0; $i < 1000; $i++) {
    $result[] = $seedkeys[array_rand($seedkeys)];
}
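Be aware that the loop above can return the same element more than once, because each call to array_rand() is independent. If distinct values are needed, one option is to remember which keys were already drawn. The following is a rough sketch, assuming PHP 7 or later; the function name pick_distinct is hypothetical. The source array is only read, so PHP does not copy it.

```php
<?php
// Hypothetical helper (not from the thread): picks $count distinct values
// from $source by redrawing whenever a key was already used.
function pick_distinct(array $source, int $count): array
{
    if ($count > count($source)) {
        throw new InvalidArgumentException('Cannot pick more values than the array holds.');
    }

    $picked = array(); // keys drawn so far, stored as array keys for fast isset() lookups
    $result = array();

    while (count($result) < $count) {
        $key = array_rand($source); // one random key per call

        if (isset($picked[$key])) {
            continue; // already taken, draw again
        }

        $picked[$key] = true;
        $result[] = $source[$key];
    }

    return $result;
}
```

With 1000 picks out of 10K, redraws should be rare, but they become frequent if $count gets close to the size of the array.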

You could do it in MySQL (assuming that the data for the array comes from MySQL). Be aware that this is simple, but not that efficient (see Jan Kneschke's post):

SELECT * FROM `foo` ORDER BY RAND() LIMIT 1000;
ircmaxell
Thanks maxell, I might have to go the SQL route. The array is too big and causes memory problems, and the loop was only picking a few repeating keys.
jamex