ansaurus

Question

Find the x smallest integers in a list of length n

Answer 1

+3 A:

Add all n numbers to a heap and delete x of them. Complexity is O((n + x) log n). Since x is obviously less than n, it's O(n log n).

Mark Peters 2010-09-21 20:50:43

No need to keep all of the numbers in the heap, just the N smallest so far. Let me expand on that. Use a max heap. Add a number. If the count > N, then remove the first item from the heap.

Jim Mischel 2010-09-21 21:27:01

Yes, that was covered well by @Aaron so I'll leave this answer independent of that.

Mark Peters 2010-09-21 21:33:09

First `O(n log n)` solution gets an upvote.

Dave Aaron Smith 2010-09-21 21:53:18

Answer 2

A:

Psudocode:

def x_smallest(array<int> arr, int limit)
    array<int> ret = new array[limit]

    ret = {INT_MAX}

    for i in arr
        for j in range(0..limit)
            if (i < ret[j])
                ret[j] = i
            endif
        endfor
    endfor

    return ret
enddef

Mentalikryst 2010-09-21 20:54:21

Most thorough `O(n * x)`.

Dave Aaron Smith 2010-09-21 21:52:35

Answer 3

A:

In pseudo code:

y = length of list / 2

if (x > y)
  iterate and pop off the (length - x) largest
else
  iterate and pop off the x smallest

O(n/2 * x) ?

zilverdistel 2010-09-21 20:55:11

Ah, but the list doesn't come sorted. The smallest integer could appear way at the end.

Dave Aaron Smith 2010-09-21 21:00:54

and O(n/2 *x) = O(n*x)

aaronasterling 2010-09-21 21:09:18

and sorting a list of x elements is probably kind of fast.

kriss 2010-09-21 21:22:29

@erickson, correct, I didn't. @zilverdistel's algorithm depends on the entire list coming presorted.

Dave Aaron Smith 2010-09-21 21:48:55

@Dave Aaron Smith - Sorry, I misread your comment as being about the order of results.

erickson 2010-09-21 22:18:46

@Dave: I don't think it does, but his complexity is wrong. He's just doing a slight optimization that improves the worst case. Whereas the worst case of your trivial solution is O(n * x), it is O(n * n) as x -> n. This would be O(n * n/2) as x ->n. But that 1/2 is just a constant so it shouldn't be factored into the complexity analysis.

Mark Peters 2010-09-22 13:31:41

Answer 4

A:

sort array
slice array 0 x

Choose the best sort algorithm and you're done: http://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms

rjack 2010-09-21 21:15:49

Most succinct of the low hanging fruit answers.

Dave Aaron Smith 2010-09-21 21:56:14

Answer 5

+6 A:

Maintain the list of the x highest so far in sorted order in a skip-list. Iterate through the array. For each element, find where it would be inserted in the skip list (log x time). If in the interior of the list, it is one of the smallest x so far, so insert it and remove the element at the end of the list. Otherwise do nothing.

Time O(n*log(x))

Alternative implementation: maintain the collection of x highest so far in a max-heap, compare each new element with top element of the heap, and pop + insert new element only if the new element is less than the top element. Since comparison to top element is O(1) and pop/insert O(log x), this is also O(nlog(x))

Aaron 2010-09-21 21:18:03

I would probably use a [self-balancing binary search tree](http://en.wikipedia.org/wiki/Self-balancing_binary_search_tree) instead of a skip list, but otherwise, that's the way I would go.

svick 2010-09-21 21:22:40

@svick: The point of the skip list is that removals from the head are O(1). Of course, the list would have to be oriented slightly from Aaron's descriptions so that the maximum value is at the head and the smallest is at the tail instead of the other way around. A removal of the maximum value in a BST would be O(log(x)) which would not change the overall complexity, but would certainly add a higher constant factor. Plus, the rebalancing schemes themselves are sometimes more complex than relinking a node in a list. However, I wonder if there is a clever way to do this with a splay tree?

Brian Gideon 2010-09-22 03:25:38

Answer 6

+3 A:

If the range of numbers (L) is known, you can do a modified counting sort.

given L, x, input[]
counts <- array[0..L]
for each number in input
    increment counts[number]
next

#populate the output
index <- 0
xIndex <- 0
while xIndex < x and index <= L
   if counts[index] > 0 then
       decrement counts[index]
       output[xIndex] = index
       increment xIndex
   else
       increment index
   end if
loop

This has a runtime of O(n + L) (with memory overhead of O(L)) which makes it pretty attractive if the range is small (L < n log n).

Mark Peters 2010-09-21 21:18:42

I'll upvote that. However, let me go clarify that the range of integers is not known.

Dave Aaron Smith 2010-09-21 21:39:44

If it's not known, you can still do a single pass over the list in O(n) time to determine L, and then decide whether it's worth doing this way or not.

Mark Peters 2010-09-21 23:38:35

Another good point. Also, you've described the appropriate range. Kudos.

Dave Aaron Smith 2010-09-22 17:01:03

Answer 7

A:

You can sort then take the first x values?

Java: with QuickSort O(n log n)

import java.util.Arrays;
import java.util.Random;

public class Main {

    public static void main(String[] args) {
        Random random = new Random(); // Random number generator
        int[] list = new int[1000];
        int lenght = 3;

        // Initialize array with positive random values
        for (int i = 0; i < list.length; i++) {
            list[i] = Math.abs(random.nextInt());
        }

        // Solution
        int[] output = findSmallest(list, lenght);

        // Display Results
        for(int x : output)
            System.out.println(x);
    }

    private static int[] findSmallest(int[] list, int lenght) {
        // A tuned quicksort
        Arrays.sort(list);
        // Send back correct lenght
        return Arrays.copyOf(list, lenght);     
    }

}

Its pretty fast.

Bernie Perez 2010-09-21 21:19:40

Most thorough of the low hanging fruit answers.

Dave Aaron Smith 2010-09-21 21:56:40

Answer 8

A:

    private static int[] x_smallest(int[] input, int x)
    {
        int[] output = new int[x];
        for (int i = 0; i < x; i++) { // O(x)
            output[i] = input[i];
        }

        for (int i = x; i < input.Length; i++) { // + O(n-x)
            int current = input[i];
            int temp;

            for (int j = 0; j < output.Length; j++) { // * O(x)
                if (current < output[j]) {
                    temp = output[j];
                    output[j] = current;
                    current = temp;
                } 
            }
        }

        return output;
    }

Looking at the complexity: O(x + (n-x) * x) -- assuming x is some constant, O(n)

Peter Leppert 2010-09-21 21:22:58

Answer 9

+9 A:

You can find the k-th smallest element in O(n) time. This has been discussed on StackOverflow before. There are relatively simple randomized algorithms, such as QuickSelect, that run in O(n) expected time and more complicated algorithms that run in O(n) worst-case time.

Given the k-th smallest element you can make one pass over the list to find all elements less than the k-th smallest and you are done. (I assume that the result array does not need to be sorted.)

Overall run-time is O(n).

George Eadon 2010-09-21 21:22:59

This assumes elements are unique. It gets a bit more complicated since if the kth-element is not unique, your selection breaks down. You would select any elements less than the kth-smallest, and then fill the rest of the array with the value of the kth-smallest. I believe the complexity remains the same.

Mark Peters 2010-09-21 21:30:38

It gets more interesting if you're wanting to do some sort of order-preserving select (e.g., if you've really got compound values and are only comparing part of them, the key, and yet care about the payload). You can still do it in a single pass through the bulk of the data though, giving O(kn) (which tends to O(n) when k≪n).

Donal Fellows 2010-09-21 21:38:58

@Mark Peters - agreed.

George Eadon 2010-09-21 21:39:43

@Donal, good point about order preserving. I will clarify that you don't care about the order, you just want the x smallest.

Dave Aaron Smith 2010-09-21 21:41:47

Answer 10

+1 A:

def x_smallest(items, x):
    result = sorted(items[:x])
    for i in items[x:]:
        if i < result[-1]:
            result[-1] = i
            j = x - 1
            while j > 0 and result[j] < result[j-1]:
                result[j-1], result[j] = result[j], result[j-1]
                j -= 1
    return result

Worst case is O(x*n), but will typically be closer to O(n).

Mark Ransom 2010-09-21 21:25:54

Answer 11

A:

What about using a splay tree? Because of the splay tree's unique approach to adaptive balancing it makes for a slick implementation of the algorithm with the added benefit of being able to enumerate the x items in order afterwards. Here is some psuedocode.

public SplayTree GetSmallest(int[] array, int x)
{
  var tree = new SplayTree();
  for (int i = 0; i < array.Length; i++)
  {
    int max = tree.GetLargest();
    if (array[i] < max || tree.Count < x)
    {
      if (tree.Count >= x)
      {
        tree.Remove(max);
      }
      tree.Add(array[i]);
    }
  }
  return tree;
}

The GetLargest and Remove operations have an amortized complexity of O(log(n)), but because the last accessed item bubbles to the top it would normally be O(1). So the space complexity is O(x) and the runtime complexity is O(n*log(x)). If the array happens to already be ordered then this algorithm would acheive its best case complexity of O(n) with either an ascending or descending ordered array. However, a very odd or peculiar ordering could result in a O(n^2) complexity. Can you guess how the array would have to be ordered for that to happen?

Brian Gideon 2010-09-22 20:55:24

Interesting. I'd never heard of a splay tree. I believe you meant `if (array[i] < max or tree.Count < x)`, yeah? As your pseudo code stands, I believe the splay tree would never get more than one integer if you encountered the smallest integer first.

Dave Aaron Smith 2010-09-22 21:12:41

@Dave: Yep, fixed!

Brian Gideon 2010-09-22 23:22:06

ansaurus

tags:

views:

answers:

Find the x smallest integers in a list of length n

related questions