For example, I have a large set of bulletins organized in a category tree. Categories are inclusive: if I am viewing items in category A, I also see all items in the subcategories of A, and if I am viewing the root, I see all the bulletins.

I want to achieve horizontal scalability here, because the set is so large.

The only approach I know is so-called scatter-gather: I send the query to all nodes (or to a subset of them, if I know exactly where the data lives) and, once the data nodes have responded, I gather the results. The problem with this method is that I cannot do paging on the data nodes. This leads to heavy network traffic, because sorting and paging must be done on the coordinator node (the node that initiates the scatter-gather).
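To make the traffic problem concrete, here is a minimal sketch of scatter-gather paging. It assumes each data node can at least return its own sorted prefix of `offset + limit` items; the function names and the in-memory lists standing in for nodes are hypothetical, not part of any real system.

```python
import heapq
from itertools import islice

def node_top(items, offset, limit):
    """Data node: it cannot know which of its items fall on the global page,
    so it returns its own first offset+limit items in sort order."""
    return sorted(items)[: offset + limit]

def scatter_gather_page(nodes, offset, limit):
    """Coordinator: merge the per-node sorted prefixes, then cut the page."""
    partials = [node_top(items, offset, limit) for items in nodes]  # scatter
    merged = heapq.merge(*partials)                                 # gather
    page = list(islice(merged, offset, offset + limit))
    transferred = sum(len(p) for p in partials)  # items sent over the network
    return page, transferred

nodes = [[5, 1, 9], [2, 8, 3], [7, 4, 6]]
page, transferred = scatter_gather_page(nodes, offset=3, limit=3)
```

In this sketch every node ships up to `offset + limit` items for a page of only `limit` items, so the cost grows with both the page depth and the number of nodes.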

Are there any other ways to distribute a tree structure over a set of servers to achieve horizontal scalability?

+1  A: 

You might want to look into a tree modeled as nested distributed hash tables (DHTs). Just an idea.

Alex Miller
Yeah, I saw it. DHTs are very promising in the presence of node failures, but they work much like scatter-gather in terms of network bandwidth. By the way, is there any implementation of a DHT on top of Terracotta? ;)
dotsid
I don't know of a DHT on Terracotta but you might find the open source Terrastore project interesting: http://code.google.com/p/terrastore/
Alex Miller
+1  A: 

You might consider using some sort of B-tree, where adjacent values will tend to be in the same block, and hence on the same server.
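One possible reading of this idea, sketched under assumptions: if each bulletin is keyed by its category path plus an ID (for example `/cars/sedans/000123`) and servers own contiguous key ranges, then a whole subtree is a single key range and often lands on one server, where paging can be done locally. The key scheme, the `SPLITS` boundaries, and the function names below are all hypothetical.

```python
import bisect

# Hypothetical split points: server i holds keys k with
# SPLITS[i-1] <= k < SPLITS[i], so three splits define four servers.
SPLITS = ["/cars", "/jobs", "/realty"]

def server_for(key):
    """Index of the server that owns a single key."""
    return bisect.bisect_right(SPLITS, key)

def servers_for_subtree(category_path):
    """Servers that may hold bulletins of a category and its subcategories.

    Every key in the subtree starts with category_path + "/", so the subtree
    is the key range [category_path + "/", category_path + "0"), because "0"
    is the character immediately after "/" in ASCII.
    """
    lo = category_path + "/"
    hi = category_path + "0"
    first = server_for(lo)
    last = bisect.bisect_left(SPLITS, hi)  # server of the largest key below hi
    return list(range(first, last + 1))

one_server = servers_for_subtree("/jobs/it")  # subtree within one range
all_servers = servers_for_subtree("")         # the root spans every range
```

Queries at or near the root still touch many servers, so this narrows the scatter rather than eliminating it.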

Joe Mabel