tags:

views:

472

answers:

7

I'm looking for an algorithm (or PHP code, I suppose) to end up with the 10 lowest numbers from a group of numbers. I was thinking of making a ten item array, checking to see if the current number is lower than one of the numbers in the array, and if so, finding the highest number in the array and replacing it with the current number.

However, I'm planning on finding the lowest 10 numbers from thousands, and was thinking there might be a faster way to do it. I plan on implementing this in PHP, so any native PHP functions are usable.

+10  A: 

Sort the array and use the ten first/last entries.

Honestly: sorting an array with a thousand entries costs less time than it takes you to blink.

Bombe
Hi Bombe! Nice to see you here...
Nils Pipenbrinck
Nils, even nicer to meet you here. :)
Bombe
+1, if you can think of a way to do this faster than quicksort, then you wouldn't have asked this question :D
Kent Fredric
In terms of big-O, it's better to do a single pass looking for the 10 lowest entries (O(n)) rather than a common catch-all sort (O(n log n)). But if you're data set is small and your language is fast (I've never used PHP so I can't comment), this won't matter. So +1.
paxdiablo
+1  A: 

Where are you getting this group of numbers?

If your List of numbers is already in an array you could simply do a sort(), and then a array_slice() to get the first 10.

Zoredache
+3  A: 

Naive approach is to just sort the input. It's likely fast enough, so just try it and profile it before doing anything more complicated.

Potentially faster approach: Linearly search the input, but keep the output array sorted to make it easier to determine if the next input belongs in the array or not. Pseudocode:

output[0-9] = input[0-9];
sort(output);
for i=10..n-1
  if input[i] < output[9]
    insert(input[i])

where insert(x) will find the right spot (binary search) and do the appropriate shifting.

But seriously, just try the naive approach first.

Andrew Coleson
-1. Naive approaches may work for all your tests, and then randomly die one day in the real world. Leave the fancy tricks up to the experts. Use something that exists unless you know you can do better.
Kent Fredric
Sounds like you need more representative tests. And you misread, the "Naive approach" was the first half, which is the same as the answer you voted up. You wanted to disagree with the "potentially faster approach."
Andrew Coleson
@Kent, the naive approach **was** the PHP sort, you appear to have misunderstood. If PHP's sort dies in the real world one day, quite a few people will be in trouble :-).
paxdiablo
$bit->flip(); $self->fail(); do{ appology('I was the naive one'); }
Kent Fredric
No problem. Recommending something error-prone as an alternative to the naive and Wikipedia solutions was probably less helpful anyway. :)
Andrew Coleson
+6  A: 

What you're looking for is called a selection algorithm. The Wikipedia page on the subject has a few subsections in the selecting k smallest or largest elements section. When the list is large enough, you can beat the time required for the naive "sort the whole list and choose the first 10" algorithm.

Rob Kennedy
To be specific, the "Optimised Sorting Algorithms" sections offer O(n + k*lg(k)) performance if the Lowest 10 need to be sorted. If they don't need to be sorted amongst themselves, O(n) is possible!
Andrew Coleson
+1  A: 

I doesn't matter much for a small array, but as it gets larger a fast and easy way to increase processing speed is to take advantage of array key indexing, which for 1 mill. rows will use about 40% of the time. Example:

// sorting array values

$numbers = array();
for($i = 0; $i < 1000000; ++$i)
{
    $numbers[$i] = rand(1, 999999);
}

$start = microtime(true);
sort($numbers);
$res = array_slice($numbers, 0, 10, true);
echo microtime(true) - $start . "\n";
// 2.6612658500671
print_r($res);

unset($numbers, $res, $start);


// sorting array keys

$numbers = array();
for($i = 0; $i < 1000000; ++$i)
{
    $numbers[rand(1, 999999)] = $i;
}

$start = microtime(true);
ksort($numbers);
$res = array_keys(array_slice($numbers, 0, 10, true));
echo microtime(true) - $start . "\n";
// 0.9651210308075
print_r($res);

But if the array data is from a database the fastest is probably to just sort it there:

SELECT number_column FROM table_with_numbers ORDER BY number_column LIMIT 10
Ole J. Helgesen
A: 

Create a sorted set (TreeSet in Java, I don't know about PHP), and add the first 10 numbers. Now iterate over the rest of the numbers Iterate over all your numbers, add the new one, then remove the biggest number from the set.

This algorithm is O(n) if n >> 10.

martinus
A: 

I would use a heap with 10 elements and the highest number at the root of the tree. Then start at the beginning of the list of numbers:

  • If the heap has less than 10 elements: add the number to the list
  • Otherwise, if the number is smaller than the highest number in the heap, remove the highest number in the heap, and then add the current number to the list
  • Otherwise, ignore it.

You will end up with the 10 lowest numbers in the heap. If you are using an array as the heap data structure, then you can just use the array directly.

(alternatively: you can slice out the first 10 elements, and heapify them instead of using the first step above, which will be slightly faster).

However, as other people have noted, for 1000 elements, just sort the list and take the first 10 elements.

FryGuy