views:

536

answers:

1

Among the known limitations of Joe Celko's nested sets (modified pre-order traversal) is marked degredation in performance as the tree grows to a large size.

Vadim Tropashko proposed nested intervals, and provides examples and theory explanation in this paper: http://arxiv.org/html/cs.DB/0401014

Is this a viable solution, are there any viable examples (in any language) abstracted away from the native DB layer?

+2  A: 

While I've seen examples for nested sets, I haven't seen much for nested intervals, although in theory it shouldn't be difficult to convert from one to the other. Instead of doing pre-order traversal to label the nodes, do a breadth-first recursion. The trick is to work out the most efficient way of labelling n children of a node. Since the node between a/b and c/d is (a+c)/(b+d), an ill-conditioned insert (for instance, inserting the children left to right), runs the risk of creating the same exponential growth in the index values as, for instance, using a full materialized path. It is not difficult to counteract this effect - create the new indexes one at a time, inserting each at the location that produces the lowest resulting denominator.

As far as performance degradation goes, much depends on the operations you intend to do. There are still some operations that will require a complete relabeling of the entire tree - the nested set or nested interval methods both work best for structures that seldom change. If you are doing a lot of structure changes to the hierarchy, the 'standard' parent-child table structure may be easier to work with. remember too that some operations (such as number of descendants) are far easier with the integer labeling of nested sets than the interval methods.

Chris