views:

148

answers:

5
+3  Q: 

Tree-like queues

I'm implementing a interpreter-like project for which I need a strange little scheduling queue. Since I'd like to try and avoid wheel-reinvention I was hoping someone could give me references to a similar structure or existing work. I know I can simply instantiate multiple queues as I go along, I'm just looking for some perspective by other people who might have better ideas than me ;)

I envision that it might work something like this: The structure is a tree with a single root. You get a kind of "insert_iterator" to the root and then push elements onto it (e.g. a and b in the example below). However, at any point you can also split the iterator into multiple iterators, effectively creating branches. The branches cannot merge into a single queue again, but you can start popping elements from the front of the queue (again, using a kind of "visitor_iterator") until empty branches can be discarded (at your discretion).

            x -> y -> z
a -> b -> { g -> h -> i -> j }
            f -> b

Any ideas? Seems like a relatively simple structure to implement myself using a pool of queues but I'm following the "think first, code later" strategy :)

Thanks

EDIT: I thought I'd add some extra background info. It's not relevant for the problem, but I thought it might help to clarify my objectives a bit. Very roughly the idea behind this structure is that it is basically used to scheduling computations... A branch can end in either a COMMIT or a ROLLBACK. If any of x -> ..., g -> ... or f -> ... ends in a COMMIT, then

a -> b

is executed in sequence as well as the branch that ended in COMMIT. E.g.

x -> y -> z -> COMMIT

However, a -> b will only be executed once when at least one of the branches are committed. If all three of the branches end in ROLLBACK then the entire tree is discarded including the initial events a -> b.

Thanks for the great answers so far! I'll review them in detail as soon as I get home again.

A: 

It looks to me like the structure is a Tree, but not the balanced or binary kind. If you want full control over how new nodes are added, then you'll have to specify how that is done, e.g. a.addSibling(b).

As this is for scheduling, I am guessing siblings should be visited roughly at the same time. Your visitor, instead of backtracking, will have to spawn other visitors for where you have branching. So the third element to pop would be x, g and f.

It might help you to look at JGraphT.

kiwicptn
A: 

To me, it looks more like a tree (a general one, not binary), than a queue. The semantics of deleting a node need to be well-defined though.

BTW, the mention of "scheduling queue" rings the bell for Priority Queue too.

ArunSaha
+2  A: 

The Boost Graph Library contains a data structure called a disjoint set which models the structure you need here (an interlinked group of sets).

Another way to think of this data structure is as a Forest. A forest is a disjoint union of trees.

Quicksilver
A: 

K, I looked at all the given answers quite seriously, especially the boost implementation of disjoint set suggested by Quicksilver. (Found some additional references here: http://en.wikipedia.org/wiki/Disjoint-set_data_structure and here: http://www.boost.org/doc/libs/1_42_0/libs/disjoint_sets/disjoint_sets.html).

However the conclusion I've come to is that while my data structure is indeed a tree as suggested by others, the algorithmic requirements of the structure matches that of a queue much better. The primary operations I will be using is push_front and pop_back while I don't need search, merge and such. Hence, I feel a pool of queues with a "indexing tree" defined over them will serve me better in this case.

Supposing I insert all elements into the queue in left-to-right, top-down order then a single queue could work well because I'll be popping from the "root" of the tree and pushing onto the end of the buffer. However, once I've popped the root string of elements then the next branch I start "popping" from will not necessarily be next in memory. Similarly, if I roll back a branch then it will not necessarily be at the end of the buffer. Both of these situations will obviously leave gaps in memory. So it's possible that a more sophisticated indexing structure over a simple queue could work, but I suspect it probably wouldn't be worth the effort.

So for now I'm favoring the simple idea of a pool of queues. Once a new queue is needed, select it from the pool or create a new one if none is available and then add it to a tree. If the queue is empty, return it to the pool and delete the tree node. The pool will itself be a priority queue with the largest allocated buffers at the front and the smallest at the back. After a while little or no new memory will be allocated (assuming that the amount of "popping" that occurs is more or less the same as the amount of "pushing" ;))

Thanks for all the suggestions, if you have additional comments on this strategy, feel free to comment - I'll vote up anything that seems useful.

Rehno Lindeque
A: 

You say,

x -> y -> z -> COMMIT

and it may also refer to a tree-like queue, that includes a multiple N of y, set of state/data checks and is executed from the top (z), but receives state changes from the bottom (x's).

x[N]("data check") -> x[N-1]("data check") -> x[N-2]("data

check") -> check -> y->check -> y->state -> z -> check -> z->state -> z->commit.

it seems hard to support this structure, if the queue is coded from smaller to bigger tasks, from x to z, but in the end, did you implemented it exactly this way?

For me (mine's similar question http://stackoverflow.com/questions/3929679/what-are-patterns-types-of-task-queues-can-the-multi-level-task-queue-exist-in-f) is seems to be a N-level structure that traverses child nodes with three kind of state machines available at each level: , $me->tryCommit( tryAdvanceChildsBelow( tryAdvanceToNextSiblingStep( getNextSibling())) and a messy implementation to be rewritten.

mhambra