views:

2448

answers:

5

Hi there, I have been looking for C# examples to transform a DAG into a Tree.

Does anyone have an examples or pointers in the right direction?

Clarification Update

I have a graph that contains a list of modules that my application is required to load. Each module has a list of modules it depends on. For example here are my modules, A, B C, D and E

  • A has no dependencies
  • B depends on A, C and E
  • C depends on A
  • D depends on A
  • E depends on C and A

I want resolve dependencies and generate a tree that looks like this...

--A

--+--B

-----+--C

---------+--D

--+--E

Topological Sort

Thanks for the information, if I perform a Topological sort and reverse the output i will have the following order

  • A
  • B
  • C
  • D
  • E

I want to maintain the hierarchical structure so that my modules are loaded into the correct context, for example... module E should be in the same container as B

Thanks

Rohan

A: 

You can only do that if all subtrees have a single root node.

Mitch Wheat
You could always start with a fake root node.
ypnos
+1  A: 

It depends a lot on how you are representing your DAG. For example, it could be an adjacency matrix (A[i,j] = 1 if there's an edge from node i to node j, else 0), or as a system of pointers, or as an array of nodes and an array of edges....

Further, it's not clear what transformation you're trying to apply. A connected DAG is a tree, so I'm afraid you need to clarify your question a bit.

MarkusQ
To be a pedant: Any connected graph with no cycles is a tree.
Mitch Wheat
I see your pedantry and raise you: the set of all connected directed acyclic graphs is a proper subset of the set of all connected acyclic graphs, therefore your statement ("Any connected graph with no cycles is a tree") formally implies mine ("A connected DAG is a tree").
MarkusQ
Shouldn't it be "Any directed connected graph with no cycles CONTAINS a tree". If A depends on B and C; B depends on D and C depends on D, you don't have a tree, you have a graph that contains two trees.
Eclipse
+3  A: 

A DAG and a tree are not the same thing mathematically. Thus, any conversion introduces ambiguity. A tree by definition has no cycles, period.

dsimcha
Neither does a DAG (hence the Acyclic part of the name). A DAG does allow diamonds and similarly topologies where two divergent branches re-converge. (But in no case can you get back to a parent). A tree does not have any nodes with more than one parent.
Eclipse
Exactly: A DAG has no _directed_ cycles. A tree has no cycles at all.
dsimcha
+2  A: 

What you're looking for, in order to find the order to load your modules in, is the Topological sort of your DAG. If the edges go from a module to the modules it depends on (which I think is the most likely), you'll have to load the modules in the reverse order of the topological sort because a module will appear -before- all the modules on which it depends.

If you represent the DAG such that the edges go from the depended on modules to the modules that depend on them (you can get this by reversing all the edges in the graph above), you can just load the modules in the order of the topological sort.

arnsholt
I have updated the description regarding the Topological Sort, thanks
Rohan West
+2  A: 

There's the graph theoretical answer and the programmer's answer to this. I assume you can handle the programmers part yourself. For the graph theorethical answer:

  • A DAG is a set of modules where it never happens that A needs B, and at the same time, B (or one of the modules B needs) needs A, in modules-speak: no circular dependency. I've seen circular dependencies happen (search the Gentoo forums for examples), so you can't even be 100% sure you have a DAG, but let's assume you have. It it not very hard to do a check on circular dependencies, so I'd recommend you do so somewhere in your module loader.
  • In a tree, something that never can happen is that A depends on B and C and that both B and C depend on D (a diamond), but this can happen in a DAG.
  • Also, a tree has exactly one root node, but a DAG can have multiple "root" nodes (i.e. modules that nothing depends on). For example a program like the GIMP, the GIMP program will be the root node of the set of modules, but for GENTOO, almost any program with a GUI is a "root" node, while the libraries etc are dependencies of them. (I.E. both Konqueror and Kmail depend on Qtlib, but nothing depends on Konqueror, and nothing depends on Kmail)

The Graph theorethical answer to your question, as others pointed out, is that a DAG can't be converted to a tree, while every tree is a DAG.

However, (high-level programmers answer) if you want the tree for graphical representations, you're only interested in the dependencies of a specific module, not what's depending on that module. Let me give an example:

A depends on B and C
B depends on D and E
C depends on D and F

I can't show this as an ASCII-art tree, for the simple reason that this can't be converted into a tree. However, if you want to show what A depends on, you can show this:

A
+--B
|  +--D
|  +--E
+--C
   +--D
   +--F

As you see, you get double entries in your tree - in this case "only" D but if you do an "expand all" on the Gentoo tree, I guarantee you that your tree will have at least 1000 times the amount of nodes as there are modules. (there are at least 100 packages that depend on Qt, so everything Qt depends on will be present at least 100 times in the tree).

If you have a "large" or "complex" tree, It might be best to expand the tree dynamically, not in advance, otherwise you might have a very memory-intensive process.

The disadvantage of the tree above is that if you click open B, then D, you see that A and B depend on D, but not that also C depends on D. However, depending on your situation, this might not be important at all - if you maintain a list of loaded modules, on loading C you see that you have loaded D already, and it doesn't matter it wasn't loaded for C, but for B. It is loaded, that's all that matters. If you dynamically maintain what directly depends on a certain module, you can handle the opposite problem (unloading) as well.

However, what you can't do with a tree is what's in your final sentence: preserve topological order, i.e. if B gets loaded in the same container as C, you'll never get to load C in the same container as well. Or you might have to be put up with putting everything in one container (not that I fully understand what you mean with "loading into the same container")

Good luck!