views:

306

answers:

4

I have a DAG storing the relation between certain objects in my application. When this structure is updated by adding a new vertex below an existing one (i. e., implicitly creating a new edge into the new vertex) and then (at any later time) a new edge from there to other vertices, I want to ensure that the graph stays a DAG, i. e. that my code does not create cycles.

Do I have to add a cycle detection to each inserting and connecting operation, or are there rules that I can follow on insertion which will guarantee that I'm not producing cycles?

One approach that I can think of is to store the topological level of each node and only allow new edges that point to higher levels (further away from the source nodes). However, it seems that this will actually rob me of a lot of the flexibility I was hoping to achieve by using a DAG instead of a set of ordinary trees.

+1  A: 

What you can do is to keep the nodes ordered in a topological ordering (search for "topological sort"). When you add an arc from a lower-order to higher-order node you know that no cycle was created. In the reverse case, you need incrementally update your topological ordering and at the same time run cycle detection.

antti.huima
If I topologically sort the graph, and the sort works, it means there is no cycle, right?
Hanno Fietz
True; if you maintain the ordering as a tree, you get O(log N) time complexity per edge insert.
jpalecek
+2  A: 

You could also store reverse links and just check that the terminus node of the edge being added does not appear in any of the parent nodes of the origin node. This would be faster than doing a full cycle detection. Essentially this would be a shortest path algorithm on the reverse links, which for a DAG ought to be a linear operation.

As @Markus says, though, if you aren't creating links both to and from the new node to existing nodes, you shouldn't be able to create a cycle by introducing a new node to the graph.

tvanfosson
That sounds like the most pragmatic option, but I'll have to check what that's going to do to my space requirements and whether that's going to matter.
Hanno Fietz
I doubt the efficiency of this in general cases (imagine a five level graph where most of the nodes in each level point to most of the nodes in the next). The reverse pointer idea may work in some cases, but partial ordering should be much faster in general.
MarkusQ
@MarkusQ -- I understood that he wanted to avoid the partial ordering, if possible.
tvanfosson
@tvanfossom, @MarkusQ - I can't really say I was trying to avoid partial ordering -- in fact, I'm just now trying to understand what it is about and how it helps me. The reverse pointers were easy to grok, which made them seem more obvious to me.
Hanno Fietz
My mistake. Using a partial ordering you would have a marker for every node pair (u,v) such that there is a path from u to v in the graph. To add an edge (a,b), you'd check if (b,a) is in your ordering. If not you can't introduce a cycle with (a,b).
tvanfosson
...in a sense, it is the same thing, except you potentially trade space for time since you precompute whether an inverse path exists. I don't have a feel for how complex updating the ordering would be. Worst case it may be O(N^2) if you connect one half of the graph to the other.
tvanfosson
+3  A: 

When this structure is updated by adding a new vertex and then a new edge from there to other vertices

If all of the new edges are from the new vertex you won't ever create cycles.

If you're also going to be adding edges to the new vertex from older nodes, your options depend on the expected shape of the graph. They all boil down to variations on the partial ordering, but there are hacks that give better performance for trees, forests, diamond-meshes, etc. What do you know about the expected overall graph shape?

MarkusQ
Thanks for pointing this out, I clarified my question accordingly. As for the shape, I don't know yet. A lot of the shape is going to depend on how people are going to use the application. I am expecting patterns, but I can't say which these are going to be, yet.
Hanno Fietz
I'm looking at partial ordering right now, but I can't quite seem to grok how that helps me. Sure, there's order in my graph, and I can sort it, but how is that preventing cycles? Is this the same road as the topological sorting antti.huima suggests?
Hanno Fietz
Also, as for the shape I'm expecting, it's probably safe to say that it's going to be trees, but they share many branches (I was trying to figure out how to keep multiple trees synchronized when I came upon idea of using a single DAG instead)
Hanno Fietz
+1  A: 

In this article, there is an online algorithm for maintaining topological ordering (page 4).

jpalecek
+1, looks interesting.
j_random_hacker