views:

666

answers:

7

I'm accessing the minimum element of a binary tree lots of times. What implementations allow me to access the minimum element in constant time, rather than O(log n)?

+10  A: 

Depending on your other requirements, a min-heap might be what you are looking for. It gives you constant time retrieval of the minimum element.

However you cannot do some other operations with the same ease as with a simple binary search tree, like determining if a value is in the tree or not. You can take a look at splay trees, a kind of self-balancing binary tree that provides improved access time to recently accessed elements.

Martinho Fernandes
It needs to keep the full order of elements for other purposes; I'm accessing the top-element a lot though.
Rudiger
If the element you access frequently is actually the "top" or root of the tree (and not the minimum *value* element) then accessing that directly should be O(1) already since no traversal is required.
quixoto
@Ben: I think Rudiger meant the top-element, if implemented with a heap (my answer originally mentioned only heaps)
Martinho Fernandes
@Rudiger: Just maintaining min-heap in addition to the tree should work for arbitrary insertions/deletions. Is that not an option?
Moron
@Moron: that is a nice idea that can work for some cases if you can spare the extra structural memory overhead.
Martinho Fernandes
Heaps don't inherently take a lot of structural overhead; it's zero for an array-style implementation with inline data. It's only when you want to keep the nodes fixed in memory and link to an external structure that it adds up. Then you're stuck with a 3-pointer tree plus maybe 2 pointers linking it to the actual records, to be able to find nodes to update/delete them.
Potatoswatter
@Potatoswatter that's what I meant, to keep an extra structure (be it an array or something else) you need an extra set of references. The array has to have references in it. I count references as structure, as values as content.
Martinho Fernandes
Another thing we do is (if extra memory is a worry), walk the the tree each time there is an insert/delete to find the min and cache it. FindMin should be O(1), but insert and delete are still O(log n).
Moron
+9  A: 

Find it once in O(log n) and then compare new values which you are going to add with this cached minimum element.

UPD: about how can this work if you delete the minimum element. You'll just need to spend O(log n) one more time and find new one.

Let's imagine that you have 10 000 000 000 000 of integers in your tree. Each element needs 4 bytes in memory. In this case all your tree needs about 40 TB of harddrive space. Time O (log n) which should be spent for searching minimum element in this huge tree is about 43 operations. Of course it's not the simplest operations but anyway it's almost nothing even for 20 years old processors.

Of course this is actual if it's a real-world problem. If for some purposes (maybe academical) you need real O(1) algorithm then I'm not sure that my approach can give you such performance without using additional memory.

Roman
If the minimum element gets removed from the tree, you need to find the new minimum. Does this affect this approach?
Rudiger
If you have control over the tree implementation: you will encounter the new minimum element when deleting the minimum element from the tree, so you can update the minimum element in O(1) additional time. (Of course, the actual removal of the smallest element is still going to cost O(log n).)
meriton
Just update the cached minimum each time there is an insert/delete. FindMin will still be O(1) and inserts/deletes still O(log n), assuming RB or some similar balanced tree implementation. If the tree allows you to walk it (like dfs) there is no need to mess with the internal implementation.
Moron
+3  A: 

Walking the tree will always be O(log n). Did you write the tree implementation yourself? You can always simply stash a reference to the current lowest value element alongside your data structure and keep it updated when adding/removing nodes. (If you didn't write the tree, you could also do this by wrapping the tree implementation in your own wrapper object that does the same thing.)

quixoto
+1  A: 

If by minimum element you mean element with the smallest value then you could use TreeSet with a custom Comparator which sorts the items into correct order to store individual elements and then just call SortedSet#first() or #last() to get the biggest/smallest values as efficiently as possible.

Note that inserting new items to TreeSet is slightly slow compared to other Sets/Lists but if you don't have a huge amount of elements which constantly change then it shouldn't be a problem.

Esko
I'm pretty sure this (a red-black tree) would be `O(log n)`.
Rudiger
Rudiger is correct: a TreeSet is implemented with a TreeMap (is in the docs), and a TreeMap is a red-black tree: http://java.sun.com/javase/6/docs/api/java/util/TreeMap.html
Martinho Fernandes
If the elements are already `Comparable`, you don't even need the `Comparator`.
trashgod
+4  A: 

This may sound silly, but if you mostly access the minimum element, and don't change the tree too much, maintaining a pointer to the minimal element on add/delete (on any tree) may be the best solution.

Ofir
Notice that this will work only if you are the one implementing the tree. If you take a given tree (with a given interface), maintaining a pointer might not be possible (but the value is...).
Anna
Adding/removing nodes in a BST is O(log n), so maintaining the minimum pointer at the same point in time doesn't change the algorithmic complexity. The Linux kernel's CFS scheduler does this for its rbtree of tasks, actually: see `cfs_rq.rb_leftmost` in `/usr/src/linux/kernel/sched_fair.c`.
ephemient
+2  A: 

There is a implementation in the TAOCP that uses the "spare" pointers in non-full nodes to complete a double linked list along the nodes in order (I don't recall the detail right now, but I image you have to have a "has_child" flag for each direction to make it work).

With that and a start pointer, you could have the starting element in O(1) time.

This solution is not faster or more efficient that caching the minimum.

dmckee
Do you mean this: http://en.wikipedia.org/wiki/Threaded_binary_tree ?
Martinho Fernandes
@Martinho: That would be the one. Good link.
dmckee
A: 

If you can use a little memory it sounds like a combined collection might work for you.

For instance, what you are looking for sounds a lot like a linked list. You can always get to the minimum element but an insert or lookup of an arbitrary node might take longer because you have to do a lookup O(n)

If you combine a linked list and a tree you might get the best of both worlds. To look up an item for get/insert/delete operations you would use the tree to find the element. The element's "holder" node would have to have ways to cross over from the tree to the linked list for delete operations. Also the linked list would have to be a doubly linked list.

So I think getting the smallest item would be O(1), any arbitrary lookup/delete would be O(logN)--I think even an insert would be O(logN) because you could find where to put it in the tree, look at the previous element and cross over to your linked list node from there, then add "next".

Hmm, this is starting to seem like a really useful data structure, perhaps a little wasteful in memory, but I don't think any operation would be worse than O(logN) unless you have to re balance the tree.

Bill K