I was just asked a question in an interview, and I'm curious what the answer ought to be. The problem was, essentially:

Say you have an unsorted list of n integers. How do you find the k minimum values in this list? That is, if you have a list of [10, 11, 24, 12, 13] and are looking for the 2 minimum values, you'd get [10, 11].

I've got an O(n*log(k)) solution, and that's my best, but I'm curious what other people come up with. I'll refrain from polluting folks' brains by posting my solution now and will edit it in a little later.

EDIT #1: For example, a function like: list getMinVals(list &l, int k)

EDIT #2: It looks like this is a selection problem, so I'll toss in my solution as well: iterate over the list and use a priority queue to hold the k smallest values seen so far. The priority queue is ordered so that the maximum value sits at the top. Once the queue holds k values, each remaining element is compared to the top; if the element is smaller, the top gets popped and the smaller element gets pushed. This assumed the priority queue had an O(log n) push and an O(1) pop; since the queue never holds more than k values, each push costs O(log k), which gives the O(n*log(k)) total.
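The priority-queue approach described above can be sketched in Python roughly as follows. This is only an illustration, not the code from the interview: the function name k_smallest_bounded_heap is made up, and because Python's heapq module is a min-heap, values are stored negated to get the "maximum at the top" behaviour. With a binary heap the pop is O(log k) rather than O(1), which leaves the overall O(n*log(k)) bound unchanged.

```python
import heapq


def k_smallest_bounded_heap(values, k):
    """Return the k smallest items of values, in ascending order."""
    if k <= 0:
        return []
    heap = []  # holds negated values, so -heap[0] is the largest value kept
    for x in values:
        if len(heap) < k:
            # Still filling the queue up to k entries.
            heapq.heappush(heap, -x)
        elif x < -heap[0]:
            # x beats the largest kept value: pop it and push x in one step.
            heapq.heapreplace(heap, -x)
    return sorted(-v for v in heap)


print(k_smallest_bounded_heap([10, 11, 24, 12, 13], 2))  # [10, 11]
```

The final sort costs O(k*log(k)) and is only there to make the output predictable; it can be dropped if the order of the k values does not matter.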

+5  A: 

This is the quickSelect algorithm. It's basically a quick sort where you only recurse for one part of the array. Here's a simple implementation in Python, written for brevity and readability rather than efficiency.

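def quickSelect(data, nLeast):
    # Base cases: nothing requested, or everything requested.
    if nLeast <= 0:
        return []
    if nLeast >= len(data):
        return list(data)

    # Partition around the last element. The pivot is kept out of both
    # halves so that every recursive call gets a strictly smaller list.
    pivot = data[-1]
    less = [x for x in data[:-1] if x <= pivot]
    greater = [x for x in data[:-1] if x > pivot]

    if len(less) >= nLeast:
        # The whole answer lies among the elements <= pivot.
        return quickSelect(less, nLeast)
    # Otherwise keep less and the pivot, then fill the rest from greater.
    return less + [pivot] + quickSelect(greater, nLeast - len(less) - 1)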

This will run in O(N) on average, since at each iteration you are expected to reduce the size of data by a constant factor. The result will not be sorted. The worst case is O(N^2), but this is dealt with in essentially the same way as in a quick sort: better pivot choices such as median-of-3 make the bad cases much less likely, although they do not rule them out entirely.
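As a rough sketch of what the median-of-3 idea looks like in practice, here is an iterative variant that partitions a copy of the list in place. The names median_of_three and k_smallest_quickselect are invented for this example, and the Lomuto-style partition is just one of several possible choices. It can still degrade to O(N^2), for instance on inputs with many equal values.

```python
def median_of_three(a, lo, hi):
    """Return the index of the median of a[lo], a[mid] and a[hi]."""
    mid = (lo + hi) // 2
    trio = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return trio[1][1]


def k_smallest_quickselect(data, k):
    """Return the k smallest items of data, in no particular order."""
    a = list(data)  # work on a copy so the caller's list is untouched
    k = max(0, min(k, len(a)))
    lo, hi = 0, len(a) - 1
    while lo < hi:
        # Move the median-of-3 pivot to the end, then partition a[lo..hi].
        p = median_of_three(a, lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot = a[hi]
        store = lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]
        # a[store] now sits at its final sorted position.
        if store in (k - 1, k):
            break  # a[:k] already holds the k smallest values
        if store < k:
            lo = store + 1
        else:
            hi = store - 1
    return a[:k]


print(sorted(k_smallest_quickselect([10, 11, 24, 12, 13], 2)))  # [10, 11]
```

Only one side of each partition is revisited, which is what keeps the expected cost linear rather than O(N*log(N)).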

dsimcha
The question was for efficiency.
starblue
+4  A: 

This is usually in the algorithm books under selection algorithms or "linear selection". Here's the specific section on min/max k values in a list. It's O(n*log(k)).

tgamblin
A: 

A heap might suffice.
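One possible reading of this suggestion, sketched in Python: turn the whole list into a min-heap and pop k times, for O(n + k*log(n)) overall. The function name k_smallest_heapify is hypothetical; heapq.heapify and heapq.heappop are standard library calls, and the standard library also offers heapq.nsmallest(k, values) for the same job.

```python
import heapq


def k_smallest_heapify(values, k):
    """Return the k smallest items of values, in ascending order."""
    heap = list(values)   # copy, so the caller's list is left alone
    heapq.heapify(heap)   # builds a min-heap in O(n)
    count = min(max(k, 0), len(heap))
    # Each pop removes the current minimum in O(log n).
    return [heapq.heappop(heap) for _ in range(count)]


print(k_smallest_heapify([10, 11, 24, 12, 13], 2))  # [10, 11]
```

Unlike a size-k priority queue, this keeps all n values in the heap, so it trades extra memory for a simpler loop.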

dirkgently