tags:

views:

505

answers:

7

I was asked this question in a job interview. The interviewer and I disagreed on what the correct answer was. I'm wondering if anyone has any data on this.

Update: I should have mentioned that the use of shuffle() was strictly forbidden... sorry.

+2  A: 

shuffle($arr); :)

edit: I should clarify... my definition of best involves not just algorithm efficiency but code readability and maintainability as well. Using standard library functions means maintaining less code and reading much less too. Beyond that, you can get into year-long debates with PhD professors about the best "true random" function, so somebody will always disagree with you on randomization questions.

Loren Segal
A: 

The "correct" way is pretty vague. The best (fastest / easiest / most elegant) to sort an array would be to just use the built-in shuffle() function.

A: 

PHP has a built in function --> shuffle() . I would say that should do what you like, but it mostly likely will be anything but totally 'random'.

Check http://computer.howstuffworks.com/question697.htm for a little description of why its very, very difficult to get complete randomness form a computer.

bryanpearson
+1  A: 

You could use the Fisher-Yates shuffle.

Paul Reiners
I implemented the Fisher-Yates Shuffle real quick and while it didn't provide more random results than some other methods I tried it was significantly faster (about 1/4 the total time). That said, it still took about 4x as long as shuffle().
fixedd
A: 

Short Answer: PHP's array_rand() function

Given that the use of the shuffle function is forbidden, I would use $keys = array_rand($myArray, count($myArray)) to return an array of the keys from $myArray in random order. From there it should be simple to reassemble them into a new array that has been randomized. Something like:

$keys = array_rand($myArray, count($myArray));
$newArray = array();

foreach ($keys as $key) {
$newArray[$key] = $myArray[$key];
}
Scott S.
A: 

He's probably testing you on a relatively common mistake most people make when implementing a shuffling algorithm (this was also actually at the center of a controversy involving an online poker site a few years back)

Incorrect way to shuffle:

for (i is 1 to n) Swap i with random position between 1 and n

Correct way to shuffle:

for (i is 1 to n) Swap i with random position between i and n

Graph out the probability distribution for these cases and it's easy to see why the first solution is incorrect.

Honestly, I've read your examples a few times and I can't see how they're different; the explanations are different, sure, but the the "for (i is 1 to n)" part is the same both times.
David Thomas
+1  A: 

Well here's the solution I came up with:

function randomize_array_1($array_to_randomize) {
    $new_array = array();
    while (count($array_to_randomize) > 0) {
     $rand_num = rand(0, count($array_to_randomize)-1);
     $extracted = array_splice($array_to_randomize, $rand_num, 1);
     $new_array[] = $extracted[0];
    }
    return $new_array;
}

And here's his solution:

function randomize_array_2($array_to_randomize) {
    usort($array_to_randomize, "rand_sort");
    return $array_to_randomize;
}
function rand_sort($a, $b) {
    return rand(-1, 1);
}

I ran a bunch of trials on both methods (trying each 1,000,000 times) and the speed difference was negligible. However, upon checking the actual randomness of the results I was surprised at how different the distributions were. Here's my results:

randomize_array_1:
    [2, 3, 1] => 166855
    [2, 1, 3] => 166692
    [1, 2, 3] => 166690
    [3, 1, 2] => 166396
    [3, 2, 1] => 166629
    [1, 3, 2] => 166738

randomize_array_2:
    [1, 3, 2] => 147781
    [3, 1, 2] => 73972
    [3, 2, 1] => 445004
    [1, 2, 3] => 259406
    [2, 3, 1] => 49222
    [2, 1, 3] => 24615

As you can see, the first method provides an almost perfect distribution indicating that it's being more-or-less truly random, whereas the second method is all over the place.

fixedd