tags:

views:

305

answers:

11

You have a list of n integers and you want the x smallest. For example,

x_smallest([1, 2, 5, 4, 3], 3) should return [1, 2, 3].

I'll vote up unique runtimes within reason and will give the green check to the best runtime.

I'll start with O(n * x): Create an array of length x. Iterate through the list x times, each time pulling out the next smallest integer.

Edits

  • You have no idea how big or small these numbers are ahead of time.
  • You don't care about the final order, you just want the x smallest.
  • This is already being handled in some solutions, but let's say that while you aren't guaranteed a unique list, you aren't going to get a degenerate list either such as [1, 1, 1, 1, 1] either.
+3  A: 

Add all n numbers to a heap and delete x of them. Complexity is O((n + x) log n). Since x is obviously less than n, it's O(n log n).

Mark Peters
No need to keep all of the numbers in the heap, just the N smallest so far. Let me expand on that. Use a max heap. Add a number. If the count > N, then remove the first item from the heap.
Jim Mischel
Yes, that was covered well by @Aaron so I'll leave this answer independent of that.
Mark Peters
First `O(n log n)` solution gets an upvote.
Dave Aaron Smith
A: 

Psudocode:

def x_smallest(array<int> arr, int limit)
    array<int> ret = new array[limit]

    ret = {INT_MAX}

    for i in arr
        for j in range(0..limit)
            if (i < ret[j])
                ret[j] = i
            endif
        endfor
    endfor

    return ret
enddef
Mentalikryst
Most thorough `O(n * x)`.
Dave Aaron Smith
A: 

In pseudo code:

y = length of list / 2

if (x > y)
  iterate and pop off the (length - x) largest
else
  iterate and pop off the x smallest

O(n/2 * x) ?

zilverdistel
Ah, but the list doesn't come sorted. The smallest integer could appear way at the end.
Dave Aaron Smith
and O(n/2 *x) = O(n*x)
aaronasterling
and sorting a list of x elements is probably kind of fast.
kriss
@erickson, correct, I didn't. @zilverdistel's algorithm depends on the entire list coming presorted.
Dave Aaron Smith
@Dave Aaron Smith - Sorry, I misread your comment as being about the order of results.
erickson
@Dave: I don't think it does, but his complexity is wrong. He's just doing a slight optimization that improves the worst case. Whereas the worst case of your trivial solution is O(n * x), it is O(n * n) as x -> n. This would be O(n * n/2) as x ->n. But that 1/2 is just a constant so it shouldn't be factored into the complexity analysis.
Mark Peters
A: 
sort array
slice array 0 x

Choose the best sort algorithm and you're done: http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms

rjack
Most succinct of the low hanging fruit answers.
Dave Aaron Smith
+6  A: 

Maintain the list of the x highest so far in sorted order in a skip-list. Iterate through the array. For each element, find where it would be inserted in the skip list (log x time). If in the interior of the list, it is one of the smallest x so far, so insert it and remove the element at the end of the list. Otherwise do nothing.

Time O(n*log(x))

Alternative implementation: maintain the collection of x highest so far in a max-heap, compare each new element with top element of the heap, and pop + insert new element only if the new element is less than the top element. Since comparison to top element is O(1) and pop/insert O(log x), this is also O(nlog(x))

Aaron
I would probably use a [self-balancing binary search tree](http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree) instead of a skip list, but otherwise, that's the way I would go.
svick
@svick: The point of the skip list is that removals from the head are O(1). Of course, the list would have to be oriented slightly from Aaron's descriptions so that the maximum value is at the head and the smallest is at the tail instead of the other way around. A removal of the maximum value in a BST would be O(log(x)) which would not change the overall complexity, but would certainly add a higher constant factor. Plus, the rebalancing schemes themselves are sometimes more complex than relinking a node in a list. However, I wonder if there is a clever way to do this with a splay tree?
Brian Gideon
+3  A: 

If the range of numbers (L) is known, you can do a modified counting sort.

given L, x, input[]
counts <- array[0..L]
for each number in input
    increment counts[number]
next

#populate the output
index <- 0
xIndex <- 0
while xIndex < x and index <= L
   if counts[index] > 0 then
       decrement counts[index]
       output[xIndex] = index
       increment xIndex
   else
       increment index
   end if
loop

This has a runtime of O(n + L) (with memory overhead of O(L)) which makes it pretty attractive if the range is small (L < n log n).

Mark Peters
I'll upvote that. However, let me go clarify that the range of integers is not known.
Dave Aaron Smith
If it's not known, you can still do a single pass over the list in O(n) time to determine L, and then decide whether it's worth doing this way or not.
Mark Peters
Another good point. Also, you've described the appropriate range. Kudos.
Dave Aaron Smith
A: 

You can sort then take the first x values?

Java: with QuickSort O(n log n)

import java.util.Arrays;
import java.util.Random;

public class Main {

    public static void main(String[] args) {
        Random random = new Random(); // Random number generator
        int[] list = new int[1000];
        int lenght = 3;

        // Initialize array with positive random values
        for (int i = 0; i < list.length; i++) {
            list[i] = Math.abs(random.nextInt());
        }

        // Solution
        int[] output = findSmallest(list, lenght);

        // Display Results
        for(int x : output)
            System.out.println(x);
    }

    private static int[] findSmallest(int[] list, int lenght) {
        // A tuned quicksort
        Arrays.sort(list);
        // Send back correct lenght
        return Arrays.copyOf(list, lenght);     
    }

}

Its pretty fast.

Bernie Perez
Most thorough of the low hanging fruit answers.
Dave Aaron Smith
A: 
    private static int[] x_smallest(int[] input, int x)
    {
        int[] output = new int[x];
        for (int i = 0; i < x; i++) { // O(x)
            output[i] = input[i];
        }

        for (int i = x; i < input.Length; i++) { // + O(n-x)
            int current = input[i];
            int temp;

            for (int j = 0; j < output.Length; j++) { // * O(x)
                if (current < output[j]) {
                    temp = output[j];
                    output[j] = current;
                    current = temp;
                } 
            }
        }

        return output;
    }

Looking at the complexity: O(x + (n-x) * x) -- assuming x is some constant, O(n)

Peter Leppert
+9  A: 

You can find the k-th smallest element in O(n) time. This has been discussed on StackOverflow before. There are relatively simple randomized algorithms, such as QuickSelect, that run in O(n) expected time and more complicated algorithms that run in O(n) worst-case time.

Given the k-th smallest element you can make one pass over the list to find all elements less than the k-th smallest and you are done. (I assume that the result array does not need to be sorted.)

Overall run-time is O(n).

George Eadon
This assumes elements are unique. It gets a bit more complicated since if the kth-element is not unique, your selection breaks down. You would select any elements less than the kth-smallest, and then fill the rest of the array with the value of the kth-smallest. I believe the complexity remains the same.
Mark Peters
It gets more interesting if you're wanting to do some sort of order-preserving select (e.g., if you've really got compound values and are only comparing part of them, the key, and yet care about the payload). You can still do it in a single pass through the bulk of the data though, giving O(kn) (which tends to O(n) when k≪n).
Donal Fellows
@Mark Peters - agreed.
George Eadon
@Donal, good point about order preserving. I will clarify that you don't care about the order, you just want the x smallest.
Dave Aaron Smith
+1  A: 
def x_smallest(items, x):
    result = sorted(items[:x])
    for i in items[x:]:
        if i < result[-1]:
            result[-1] = i
            j = x - 1
            while j > 0 and result[j] < result[j-1]:
                result[j-1], result[j] = result[j], result[j-1]
                j -= 1
    return result

Worst case is O(x*n), but will typically be closer to O(n).

Mark Ransom
A: 

What about using a splay tree? Because of the splay tree's unique approach to adaptive balancing it makes for a slick implementation of the algorithm with the added benefit of being able to enumerate the x items in order afterwards. Here is some psuedocode.

public SplayTree GetSmallest(int[] array, int x)
{
  var tree = new SplayTree();
  for (int i = 0; i < array.Length; i++)
  {
    int max = tree.GetLargest();
    if (array[i] < max || tree.Count < x)
    {
      if (tree.Count >= x)
      {
        tree.Remove(max);
      }
      tree.Add(array[i]);
    }
  }
  return tree;
}

The GetLargest and Remove operations have an amortized complexity of O(log(n)), but because the last accessed item bubbles to the top it would normally be O(1). So the space complexity is O(x) and the runtime complexity is O(n*log(x)). If the array happens to already be ordered then this algorithm would acheive its best case complexity of O(n) with either an ascending or descending ordered array. However, a very odd or peculiar ordering could result in a O(n^2) complexity. Can you guess how the array would have to be ordered for that to happen?

Brian Gideon
Interesting. I'd never heard of a splay tree. I believe you meant `if (array[i] < max or tree.Count < x)`, yeah? As your pseudo code stands, I believe the splay tree would never get more than one integer if you encountered the smallest integer first.
Dave Aaron Smith
@Dave: Yep, fixed!
Brian Gideon