I was asked this question in a job interview. The interviewer and I disagreed on what the correct answer was. I'm wondering if anyone has any data on this.
Update: I should have mentioned that the use of shuffle() was strictly forbidden... sorry.
I was asked this question in a job interview. The interviewer and I disagreed on what the correct answer was. I'm wondering if anyone has any data on this.
Update: I should have mentioned that the use of shuffle() was strictly forbidden... sorry.
shuffle($arr); :)
edit: I should clarify... my definition of best involves not just algorithm efficiency but code readability and maintainability as well. Using standard library functions means maintaining less code and reading much less too. Beyond that, you can get into year-long debates with PhD professors about the best "true random" function, so somebody will always disagree with you on randomization questions.
PHP has a built in function --> shuffle() . I would say that should do what you like, but it mostly likely will be anything but totally 'random'.
Check http://computer.howstuffworks.com/question697.htm for a little description of why its very, very difficult to get complete randomness form a computer.
Short Answer: PHP's array_rand() function
Given that the use of the shuffle function is forbidden, I would use $keys = array_rand($myArray, count($myArray)) to return an array of the keys from $myArray in random order. From there it should be simple to reassemble them into a new array that has been randomized. Something like:
$keys = array_rand($myArray, count($myArray));
$newArray = array();
foreach ($keys as $key) {
$newArray[$key] = $myArray[$key];
}
He's probably testing you on a relatively common mistake most people make when implementing a shuffling algorithm (this was also actually at the center of a controversy involving an online poker site a few years back)
Incorrect way to shuffle:
for (i is 1 to n)
Swap i with random position between 1 and n
Correct way to shuffle:
for (i is 1 to n)
Swap i with random position between i and n
Graph out the probability distribution for these cases and it's easy to see why the first solution is incorrect.
Well here's the solution I came up with:
function randomize_array_1($array_to_randomize) {
$new_array = array();
while (count($array_to_randomize) > 0) {
$rand_num = rand(0, count($array_to_randomize)-1);
$extracted = array_splice($array_to_randomize, $rand_num, 1);
$new_array[] = $extracted[0];
}
return $new_array;
}
And here's his solution:
function randomize_array_2($array_to_randomize) {
usort($array_to_randomize, "rand_sort");
return $array_to_randomize;
}
function rand_sort($a, $b) {
return rand(-1, 1);
}
I ran a bunch of trials on both methods (trying each 1,000,000 times) and the speed difference was negligible. However, upon checking the actual randomness of the results I was surprised at how different the distributions were. Here's my results:
randomize_array_1:
[2, 3, 1] => 166855
[2, 1, 3] => 166692
[1, 2, 3] => 166690
[3, 1, 2] => 166396
[3, 2, 1] => 166629
[1, 3, 2] => 166738
randomize_array_2:
[1, 3, 2] => 147781
[3, 1, 2] => 73972
[3, 2, 1] => 445004
[1, 2, 3] => 259406
[2, 3, 1] => 49222
[2, 1, 3] => 24615
As you can see, the first method provides an almost perfect distribution indicating that it's being more-or-less truly random, whereas the second method is all over the place.