views:

41

answers:

2

I'm thinking of experimenting with using a tree-structure for indexing as I want to test whether it is faster than my current indexing implementation which is in essence a hash based lookup.

I've read up on various questions and articles about performance of B-Trees, AVL-Trees and Red-Black Trees and can't really see much difference between them performance wise.

What tree structure would people recommend and why? Ideally it should have an existing .Net implementation available though I'm not averse to implementing my own if necessary

+5  A: 

A good hashtable is almost always faster than a tree. The great advantage of the tree is that you can use it to query ranges and for ordering. So if you don't need these features I'd rather look into optimizing your hash based solution.

And AFAIK SortedDictionary<K,V> is tree based.

CodeInChaos
I did suspect that, I've done some testing where my data is in MongoDB which uses B-Tree indices and even accounting for the overhead of communicating with MongoDB performance is worse. In some situations it would be useful to have the data in order but the problem is that the order is not defined in advance so maybe a tree won't gain me anything :-S
RobV
The great advantage of a tree is that it does not require a hashing algorithm. Hashing algorithms are almost invariably untested and unexamined against real data.
Blank Xavier
@blank this is a misnomer, auto-generated hashing algorithms based off properties in .NET are tried and true.
Lucas B
Surely also if the things I want to store are objects then I need a hashing/key generating function anyway as relying on value comparison would make it far too slow
RobV
+1  A: 

A binary tree is not balanced. Do not use it.

An AVL tree is better balanced than a red-black tree; it has faster lookup.

Much more importantly, the AVL algorithm is straightforward and comprehendable. This is, IME, not the case for the red-black algorithm. You cannot implement algorithms you do not understand, or rather, you can implement them blindly, which is properly tantamount in my view to -not- being able to implement them.

Blank Xavier
Isn't any tree where each node has up to 2 children a binary tree? I'd view an AVL tree as a special case of a binary tree.
CodeInChaos
Interesting. I find red-black trees easier to understand than AVL trees. Perhaps it's because I learned them first.
Ken
When you say understand, do you -truely- understand them, e.g. the underlying reasons the rules work? or do you mean that you know the set of rules that must be applied? I have -never- seen an explanation of why the RB rules -work-.
Blank Xavier
Blank: When you say "rules", do you mean "procedures for insert/delete", or "invariants"? If I had to think about procedures, I wouldn't even be able to make an ordinary binary search tree. I just remember the invariants, which do make perfect sense to me.
Ken
Ken: by rules, I mean the procedures for insert/delete - the five or so steps that are followed, as you move up the tree from the inserted node. I don't mean remembering the function code itself. But those procedures - yes, they in and of themselves are perfectly sensible - check this is red, move that if not, etc. But the question is *why do they work to give you a balanced tree?* With the AVL algorithm, I *know* why. I can *comprehend* the underlying reasons it must all work. I cannot comprehend the RB algorithm and I have never seen an explanation of it - only a listing of the rules.
Blank Xavier
Blank: Then no, I don't. I can't remember procedures for *any* data structure, no matter how simple, even if I've been using it for decades, and got an A in it back in college. :-) I always just remember the invariants, from which it's easy to write out the procedures. It's obvious to me why the red-black invariants yield a balanced tree (WP has a decent description), and how to maintain them. Maybe that's why AVL trees are so hard for me: I only learned them as a set of procedures. Red-black trees I can picture in my mind (red! and black!), and also exactly what I need to do with them.
Ken