views:

308

answers:

5

I need a data structure that always holds the n largest items inserted so far (in no particular order).

So, if n is 3, we could have the following session where I insert a few numbers and the content of the container changes:

[]  // now insert 1
[1] // now insert 0
[1,0] // now insert 4
[1,0,4] // now insert 3
[1,4,3] // now insert 0
[1,4,3] // now insert 3
[4,3,3]

You get the idea. What's the name of the data structure? What's the best way to implement this? Or is this in some library?

I am thinking to use a container that has a priority_queue for its elements (delegation), which uses the reverse comparison, so pop will remove the smallest element. So the insert function first checks if the new element to be inserted is greater than the smallest. If so, we throw that smallest out and push the new element.

(I have a C++ implementation in mind, but the question is language-agnostic nevertheless.)

+2  A: 

A bounded priority queue, I think... Java has something like this in its standard library. EDIT: it's called LinkedBlockingQueue. I'm not sure if the C++ STL includes something similar.

David Zaslavsky
It is not LinkedBlockingQueue, which blocks if size is exceeded instead of dropping smaller elements.
starblue
ah well, I thought there was one in the library, but it appears not.
David Zaslavsky
+3  A: 

A priority_queue is the closest thing in C++ with STL. You could wrap it in another class to create your own implementation that trims the size automatically.

Language-agnostically (although maybe not memory-fragmentation-safely):

  1. Insert data
  2. Sort
  3. Delete everything after the nth element

std::priority_queue does step 2 for you.

The PQ also provides for a test to do nothing if the new element is larger than the largest stored.
Henk Holterman
+7  A: 
comingstorm
As far as I can see this _is_ the priority solution proposed by the OP with the added optimization of folding the remove_smallest and insert_new actions.
Henk Holterman
+1  A: 

Isn't it possible to just take the first n elements from a sorted collection?

Rick
Yes, but that's not efficient in terms of space. I don't want to have to store all elements in this data structure. I'm be interested in a very small subset only. But for many cases that would be a fine solution.
+1  A: 

In Java you can use a SortedSet implemented e.g. by a TreeSet. After each insertion check if the set is too large, if yes remove the last element.

This is reasonably efficient, I have used it successfully for solving several Project Euler problems.

starblue