ansaurus

Question

How to find the kth largest element in an unsorted array of length n in O(n)?

Answer 1

A:

Iterate through the list. If the current value is larger than the stored largest value, store it as the largest value, bump entries 1-4 down, and let entry 5 drop off the list. If not, compare it to number 2 and do the same thing. Repeat, checking it against all 5 stored values. This should do it in O(n).
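A minimal sketch of this "table of the k largest" idea, generalized from 5 slots to k. The function name `kth_largest_table` is made up for illustration and is not the poster's code. It assumes k=1 means the maximum and that the values are mutually comparable.

```python
def kth_largest_table(values, k):
    """Return the k-th largest value (k=1 is the maximum) using a k-slot table."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    top = []  # at most k values, kept in descending order
    for v in values:
        # Linear scan for the first slot holding something smaller than v: O(k).
        pos = 0
        while pos < len(top) and top[pos] >= v:
            pos += 1
        if pos < k:
            top.insert(pos, v)  # the "bump": lower entries shift down, O(k)
            if len(top) > k:
                top.pop()       # the old k-th entry drops off the table
    return top[-1]
```

Each of the n elements costs up to O(k) work, so the total is O(kn): linear only while k is treated as a constant.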

Kevin 2008-10-30 21:10:13

That "bump" is O(n) if you're using an array, or down to O(log n) (I think) if you use a better structure.

Just Some Guy 2008-10-30 21:11:12

It needn't be O(log k) - if the list is a linked list then adding the new element to the top and dropping the last element is more like O(2)

Alnitak 2008-10-30 21:14:38

The bump would be O(k) for an array-backed list, O(1) for an appropriately-linked list. Either way, this sort of question generally assumes it to be of minimal impact compared to n and it introduces no more factors of n.

bobince 2008-10-30 21:16:24

it would also be O(1) if the bump uses a ring-buffer

Alnitak 2008-10-30 21:18:04

Anyhow, the comment's algorithm is incomplete: it fails to consider an element of n coming in which is the new (e.g.) second-largest. Worst-case behaviour, where each element in n must be compared with each in the high-score table, is O(kn) - but that still probably means O(n) in terms of the question.

bobince 2008-10-30 21:21:34

An algorithm given with time O(kn) actually has a worst case of O(n^2) where k=n. Although in that case it would be faster to look for the smallest item. The algorithm could always be reversed in the case where k>n/2 to look for the kth smallest item.

Elie 2008-10-30 21:26:25

Answer 2

+3 A:

A quick Google on that ('kth largest element array') returned this: http://discuss.joelonsoftware.com/default.asp?interview.11.509587.17

"Make one pass through tracking the three largest values so far." (it was specifically for the 3rd largest)

and..

Build a heap/priority queue.  O(n)
Pop top element.  O(log n)
Pop top element.  O(log n)
Pop top element.  O(log n)

Total = O(n) + 3 O(log n) = O(n)
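The quoted steps, generalized from the 3rd largest to the kth largest, could look roughly like this. It is a sketch only: `kth_largest_heap` is a made-up name, and the negation trick assumes numeric values, because Python's `heapq` is a min-heap.

```python
import heapq

def kth_largest_heap(values, k):
    """Return the k-th largest numeric value (k=1 is the maximum)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    # heapq is a min-heap, so store negated values to get the largest first.
    heap = [-v for v in values]
    heapq.heapify(heap)          # build the heap: O(n)
    for _ in range(k - 1):
        heapq.heappop(heap)      # discard the k-1 larger values: O(log n) each
    return -heap[0]              # the k-th largest is now on top
```

The cost is O(n) for the build plus O(k log n) for the pops, which stays O(n) only while k is small compared with n / log n.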

warren 2008-10-30 21:12:11

Well, it's actually O(n) + O(k log n), which doesn't reduce for significant values of k.

Jimmy 2008-10-30 21:19:56

right - Big-O is all about approximations :)

warren 2008-10-30 21:21:38

also note: I quoted the site :)

warren 2008-10-30 21:22:42

Tracking can be done with a doubly linked list that you keep at fixed length. The last item should then be the kth largest element. Insertion at the front and removal at the back are both O(1), lookup at the back is O(1) too.

Jasper Bekkers 2008-10-30 21:23:55

But finding the insertion point in that doubly-linked list is O(k).

Just Some Guy 2008-10-30 23:25:12

And if k is fixed, O(k) = O(1)

Tyler McHenry 2009-07-31 17:00:05

@warren: Big-O is approximating, but you always over-approximate. Quicksort is actually O(n^2), for example, since that is the worst case. This one is O(n + k log n).

Claudiu 2010-04-18 18:26:04

Answer 3

+2 A:

A Programmer's Companion to Algorithm Analysis gives a version that is O(n), although the author states that the constant factor is so high, you'd probably prefer the naive sort-the-list-then-select method.

I answered the letter of your question :)

Jimmy 2008-10-30 21:17:23

Answer 4

A:

You can do it in O(n + kn) = O(n) (for constant k) for time and O(k) for space, by keeping track of the k largest elements you've seen.

For each element in the array you can scan the list of k largest and replace the smallest element with the new one if it is bigger.

Warren's priority heap solution is neater though.
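For comparison, the scan-and-replace approach might be sketched as follows. The name `kth_largest_scan` is hypothetical, and the sketch assumes `values` is a sequence that supports slicing.

```python
def kth_largest_scan(values, k):
    """Return the k-th largest value by keeping an unsorted buffer of k values."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    largest = list(values[:k])          # O(k) extra space
    for v in values[k:]:
        # Find the smallest kept value with a linear scan: O(k).
        i = min(range(k), key=largest.__getitem__)
        if v > largest[i]:
            largest[i] = v              # replace it with the bigger newcomer
    return min(largest)                 # smallest of the k largest
```

The buffer never needs to be sorted; only its minimum matters, which is why a heap is the natural next step.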

Rob Walker 2008-10-30 21:17:49

This would have a worst case of O(n^2) where you're asked for the smallest item.

Elie 2008-10-30 21:23:15

"Smallest item" means that k=n, so k is no longer constant.

Tyler McHenry 2009-07-31 17:01:30

Answer 5

A:

What I would do is this:

initialize empty doubly linked list l
for each element e in array
    if e larger than head(l)
        make e the new head of l
        if size(l) > k
            remove last element from l

the last element of l should now be the kth largest element

You can simply store pointers to the first and last element in the linked list. They only change when updates to the list are made.

Update:

initialize empty sorted tree l
for each element e in array
    if e between head(l) and tail(l)
        insert e into l // O(log k)
        if size(l) > k
            remove last element from l

the last element of l should now be the kth largest element
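One way to get the O(log k) insertion in runnable form is sketched below. Note the swap: it uses a bounded min-heap from Python's `heapq` instead of a sorted tree, and it inserts whenever fewer than k values are held or the new value beats the current kth largest, rather than testing "between head and tail". The name `kth_largest_bounded` is made up for illustration.

```python
import heapq

def kth_largest_bounded(values, k):
    """Return the k-th largest value using a min-heap that never exceeds k items."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    heap = []                            # min-heap of the k largest seen so far
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)      # still filling up: O(log k)
        elif v > heap[0]:
            heapq.heapreplace(heap, v)   # drop the smallest, add v: O(log k)
    return heap[0]                       # root is the k-th largest overall
```

This runs in O(n log k) time with O(k) space and makes a single pass, so it also works when the input is a stream of unknown length (with the length check removed).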

Jasper Bekkers 2008-10-30 21:19:00

What if e is smaller than head(l)? It could still be larger than the kth largest element, but would never get added to that list. You will need to sort the list of items in order for this to work, in ascending order.

Elie 2008-10-30 21:22:09

You are right, guess I'll need to think this through some more. :-)

Jasper Bekkers 2008-10-30 21:27:33

The solution would be to check if e is between head(l) and tail(l) and insert it at the correct position if it is, making this O(kn). You could make it O(n log k) by using a binary tree that keeps track of the min and max elements.

Jasper Bekkers 2008-10-30 21:30:04

Answer 6

+8 A:

The keywords you are looking for are selection algorithm: Wikipedia lists a number of different ways of doing this.

Adam Rosenfield 2008-10-30 21:19:32

Answer 7

+6 A:

This is called finding the k-th order statistic. There's a very simple randomized algorithm taking O(n) time, and a pretty complicated non-randomized algorithm taking O(n) time. There's some info in Wikipedia but it's not very good. Everything you need is in these PowerPoint slides. Also it's very nicely detailed in the book by Cormen et al (Introduction to Algorithms).

eladv 2008-10-30 21:48:05

Answer 8

+1 A:

Read Chapter 9, "Medians and Order Statistics", from Cormen's "Introduction to Algorithms", 2nd Ed. It has an expected linear time algorithm for selection. It's not something that people would randomly come up with in a few minutes. A heap sort, btw, won't work in O(n); it's O(n lg n).

2008-10-30 21:59:54

Answer 9

+2 A:

The C++ standard library has almost exactly that function, although it does modify your data. It has expected linear run-time, O(N), and it also does a partial sort.

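#include <algorithm>  // std::nth_element

const int N = ...;
double a[N];
// ...
const int m = ...; // m < N
std::nth_element(a, a + m, a + N);
// a[m] now holds the element that would sit at index m (0-based) if a were sorted ascending
// for the largest elements instead, pass std::greater<double>() (from <functional>) as a fourth argument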

David Nehme 2008-10-30 22:53:51

No, it has an expected *average* O(n) runtime. For example, quicksort is O(nlogn) on average with a worst case of O(n^2). Wow, something straight up factually wrong!

Just Some Guy 2008-10-30 23:30:58

No, there's nothing factually wrong with this answer. It works and the C++ standard requires an expected linear run time.

David Nehme 2008-10-31 00:21:19

Answer 10

+3 A:

If you want a true O(n) algorithm, as opposed to O(kn) or something like that, then you should use quickselect (it's basically quicksort where you throw out the partition that you're not interested in). My prof has a great writeup, with the runtime analysis:

http://pine.cs.yale.edu/pinewiki/QuickSelect
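A rough in-place sketch of quickselect as described here, not taken from the linked writeup. The name `quickselect_largest`, the random pivot choice, and the Lomuto-style partition are all assumptions made for illustration.

```python
import random

def quickselect_largest(a, k):
    """Return the k-th largest item of list a (k=1 is the maximum). Reorders a."""
    if not 1 <= k <= len(a):
        raise ValueError("k must be between 1 and len(a)")
    target = len(a) - k              # index of the answer in ascending order
    lo, hi = 0, len(a) - 1
    while True:
        # Move a random pivot to the end, then partition a[lo..hi] around it.
        p = random.randint(lo, hi)
        a[p], a[hi] = a[hi], a[p]
        pivot = a[hi]
        store = lo
        for i in range(lo, hi):
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]   # pivot lands at its sorted index
        if store == target:
            return a[store]
        if store < target:
            lo = store + 1           # throw out the left partition
        else:
            hi = store - 1           # throw out the right partition
```

The running time is O(n) in expectation, not in the worst case: unlucky pivots, or an input made mostly of equal values with this simple partition, can push it to O(n^2).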

Ying Xiao 2008-10-31 22:23:11

+1, very detailed explanation indeed

Matthieu M. 2009-10-19 10:52:23

+1, Jim Aspnes FTW

viksit 2010-03-08 22:38:47

Answer 11

A:

I would like to suggest one answer.

Take the first k elements and sort them into a linked list of k values.

Then, for each of the remaining n-k values, do an insertion-sort step into that list. Even in the worst case the number of comparisons will be k*(n-k), and sorting the first k values takes at most k*(k-1), so the total comes out to k*(n-k) + k*(k-1) = nk-k, which is O(n) for a fixed k.

cheers

2009-07-31 16:51:47

sorting takes nlogn time... the algorithm should run in linear time

MrDatabase 2009-11-13 19:55:48

Answer 12

+1 A:

http://supermanhelp.com/view%5Fques.aspx?data=633924992399932041.txt

super 2009-10-30 17:42:47

Answer 13

+1 A:

Find the median of the array in linear time, then use the partition procedure exactly as in quicksort to divide the array into two parts: values less than the median to its left and values greater than the median to its right. That too can be done in linear time. Now go to the part of the array where the kth element lies. The recurrence becomes T(n) = T(n/2) + cn, which gives O(n) overall.
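To see why that recurrence is linear, it can be unrolled into a geometric series. This is a short derivation, assuming for simplicity that n is a power of two and that T(1) is a constant:

```latex
\begin{align*}
T(n) &= T(n/2) + cn \\
     &= T(n/4) + c\,\frac{n}{2} + cn \\
     &= T(1) + cn\left(1 + \frac{1}{2} + \frac{1}{4} + \dots + \frac{2}{n}\right) \\
     &< T(1) + 2cn = O(n).
\end{align*}
```

The bracketed sum is always below 2, so the total work is at most about twice the cost of the first partitioning pass.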

pranjal 2010-04-18 18:21:39

Answer 14

+1 A:

You do it like quicksort. Pick an element at random and shove everything else either higher or lower. At this point you'll know the rank of the element you picked; if it is the kth element you're done, otherwise you repeat with the bin (higher or lower) that the kth element would fall in. Statistically speaking, the time it takes to find the kth element grows with n, O(n).
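The "bins" description maps quite directly onto a short list-based sketch. The name `kth_largest_random` is made up, and the extra bookkeeping for values equal to the pivot is an addition so that duplicates are handled.

```python
import random

def kth_largest_random(values, k):
    """Return the k-th largest value (k=1 is the maximum) by random pivoting."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    items = list(values)                 # work on a copy; the input is untouched
    while True:
        pivot = random.choice(items)
        higher = [v for v in items if v > pivot]
        equal_count = sum(1 for v in items if v == pivot)
        if k <= len(higher):
            items = higher               # answer is among the larger values
        elif k <= len(higher) + equal_count:
            return pivot                 # the pivot itself has rank k
        else:
            k -= len(higher) + equal_count
            items = [v for v in items if v < pivot]  # look in the lower bin
```

On average each round discards a constant fraction of the remaining items, which is where the expected O(n) total comes from. It uses O(n) extra space for the bins.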

stinky 2010-06-23 02:14:15

Answer 15

+3 A:

This is a problem of "order statistics". A good page describing the possible solutions, along with code and output, is at the link below.

http://www.rawkam.com/?p=870

sunil 2010-07-08 08:40:07
