views:

60

answers:

1

I have a directed acyclic graph, an example of which may look like:

                     | 
                    (1)
                   / | \
                  /  |  \
                (3) (2) (4)
                /  / | \  \
               /  /  |  \  \
              /  /   |   \  \
            (6)(7)  (5)  (8)(9)
             |  |    |    |  |
           (10)(11) (12)(13)(14)
             \   \   |    /  /
              \   \ (15)_/  /
               \   \ |     /
                \___(16)__/
                     |  
                     |  
  • Execution of each node happens downwards.
  • Where a node has more than one outgoing branch, a copy of the graph is made and the chosen branch executes in a different process.
  • Where a node has more than one incoming branch the copies of the graph are routed back to the master process so they can be merged, so the master copy has all of the changes made to the branches (in the separate processes).
  • Processes are long running and transient, so I can't/don't want to route every node back to the master after it has been executed - I only want to merge at the master when a large chunk of work (a branch) has been completed remotely.

So for example, at node (1) there are three branches that must synchronise at node (16). In itself, this is relatively simple - mark node (16) as the synch for node (1) and merge from node (1) to (16) down the branch being synched when execution hits (16).

However, node (2) has two branches that also must synchronise at node (16) (it also has two that must synch at node (15)).

The problem is, in this example, when execution hits node (16) it doesn't know how far back up the tree to start synchronising from (i.e. node (1) or (2)).

I need something like a graph colouring scheme whereby different paths of execution provide their own pointers to the node from which they originated, so when the path (11) -> (16) is activated, the execution knows that the portion of the graph to be merged starts at node (2).

Is there some bit of theory or an algorithm that can help here? Or am I approaching the problem in the wrong way?

+1  A: 

Topological sorting is what you are looking for. You can use the algorithm to partition the nodes of the graph into three classes of nodes for every fixed node X - predecessor nodes of X, successor nodes of X, and nodes independent from X.

Note that your graph must be acyclic for the algorithm while you stated your graphs are cyclic (but I can see no cycle in your example).

ALGORITHM

  1. Take the set of direct predecessor nodes DP of node X.
  2. For each direct predecessor node Ni in DP find the set of predecessor nodes Pi.
  3. Find the set of common predecessor nodes CP by intersecting all Pi.
  4. Find the unique successorless node in CP (if CP is non-empty).

EXAMPLE

Let's look at node 15. There arew two direct predecessors 12 and 13. Now find all predecessors of this both nodes - for 12 they are 5, 2, and 1. For 13 they are 8, 2, and 1. The intersection of this sets is 2 and 1 therefore this two nodes are the common predecessors and node 2 is the common predecessor without successor (while node 1 is a common predecessor but has node 2 as successor). Therefore at node 15 two branches originating from node 2 join.

Daniel Brückner
No you're right, they are acyclic, sorry.
MalcomTucker
Thank you for the suggestion. I can see how this might be useful but I'm not certain of the right way to use it. So if I have understood, the idea is that you would flatten/sort the master and copy graphs, and work backwards from the current node pushing changes into the master for any node with, say, a greater timestamp in the copy than the master? Does that make sense?
MalcomTucker
I added an example - it is all about finding predecessors and successors and looking for common subsets. I hope this helps.
Daniel Brückner
Thanks for your help, I think I can use this, the example of node 15 is clear, however I am still uncertain what I would do at 16 where there are two different independent parent nodes (1) and (2). Would if execution came through the path 11->16 would I still merge everything from (1) downwards? If that makes sense ..
MalcomTucker
I don't fully unterstand for what and how you will use the algorithm in the end, but to me it seems reasonable that at node 16 branches from node 1 join. Okay, there was some further splitting at node 2 and some joining at node 15 inbetween but ultimatly all branches joining at node 16 orginated from node 1.
Daniel Brückner
Yeah that's what I figured, and of course that aligns with the 'common predecessors' approach you outline. I think the topological sort may well work for me :)
MalcomTucker