Hi,

What is the best persistence approach/tool/library for a directed graph in C#? That is, assuming I have a class model for a directed graph (e.g. Nodes and Relationships, or Vertices and Edges if you like), what would you recommend for persisting it to a SQL database? (A second question, if you wish, is the same one without the SQL database requirement.)

For example, I was thinking I would simply go with a Relationships table and a Nodes table.

+1  A: 

A Nodes table and a Relationships (link) table seems the most natural to me since that's generally (always?) how you implement many-to-many relationships in a relational database, and a directed graph is basically a many-to-many relationship mapping for nodes (right?). I suppose you could store it in an XML field, or some binary value, but that really ignores the fact that you have a database.
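To make the two-table idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module only so that it runs without a database server; the SQL itself should carry over to a C# data layer with little change. The table names, column names, and the `save_graph`/`load_graph` helpers are my own assumptions for illustration, not anything from the question.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE nodes (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE relationships (
    from_id INTEGER NOT NULL REFERENCES nodes(id),
    to_id   INTEGER NOT NULL REFERENCES nodes(id),
    PRIMARY KEY (from_id, to_id)  -- at most one edge per ordered pair
);
"""

def save_graph(conn, nodes, edges):
    """Persist nodes {id: name} and directed edges [(from_id, to_id), ...]."""
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO nodes (id, name) VALUES (?, ?)", nodes.items())
    conn.executemany(
        "INSERT INTO relationships (from_id, to_id) VALUES (?, ?)", edges
    )
    conn.commit()

def load_graph(conn):
    """Read the graph back as ({id: name}, [(from_id, to_id), ...])."""
    nodes = dict(conn.execute("SELECT id, name FROM nodes"))
    edges = conn.execute(
        "SELECT from_id, to_id FROM relationships ORDER BY from_id, to_id"
    ).fetchall()
    return nodes, edges

conn = sqlite3.connect(":memory:")
# The edge (3, 1) closes a loop; the link table handles that without trouble.
save_graph(conn, {1: "A", 2: "B", 3: "C"}, [(1, 2), (1, 3), (3, 1)])
nodes, edges = load_graph(conn)
```

Direction is carried by the column order (from_id, to_id), so A -> B and B -> A are two distinct rows.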

If you didn't have a database, and were writing directly to some type of file, I would probably still use a similar mechanism of uniquely identifying each node with a key and separately defining how the key values are related.
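As a rough sketch of that file-based variant, the same split (keyed nodes, plus a separate list of key pairs) could be written out as JSON. The file layout and the `save_graph_file`/`load_graph_file` names are assumptions made up for this example.

```python
import json
import os
import tempfile

def save_graph_file(path, nodes, edges):
    """Write nodes {key: name} and edges [(from_key, to_key), ...] to a JSON file."""
    doc = {
        "nodes": [{"id": key, "name": name} for key, name in nodes.items()],
        "edges": [{"from": a, "to": b} for a, b in edges],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(doc, f, indent=2)

def load_graph_file(path):
    """Read the file back into the same (nodes, edges) shape."""
    with open(path, encoding="utf-8") as f:
        doc = json.load(f)
    nodes = {n["id"]: n["name"] for n in doc["nodes"]}
    edges = [(e["from"], e["to"]) for e in doc["edges"]]
    return nodes, edges

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "graph.json")
    save_graph_file(path, {"a": "A", "b": "B"}, [("a", "b"), ("b", "a")])
    loaded_nodes, loaded_edges = load_graph_file(path)
```

Because edges refer to nodes only by key, loops and shared targets need no special handling.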

This all assumes that your directed graph is many-to-many and not strictly one-to-many. If it were strictly one-to-many, without loops, I might use XML if writing to a file (utilizing the fact that child elements in the XML are related to their parent), or foreign keys if writing to a database (which would work well even if there were loops).
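For the strictly one-to-many case, a small sketch of the XML nesting idea is below. The `<node>` element, its `name` attribute, and the `to_xml` helper are hypothetical choices for illustration; it only works when there are no loops, since a loop would recurse forever.

```python
import xml.etree.ElementTree as ET

def to_xml(name, children):
    """Build a nested <node> element from a {parent: [child, ...]} mapping."""
    elem = ET.Element("node", name=name)
    for child in children.get(name, []):
        elem.append(to_xml(child, children))  # nesting encodes parent -> child
    return elem

# A tree: A has children B and C, and B has child D.
tree = {"A": ["B", "C"], "B": ["D"]}
xml_text = ET.tostring(to_xml("A", tree), encoding="unicode")
```

The database counterpart would be a single nullable parent_id column on the nodes table pointing back at the same table.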

BlueMonkMN
+1  A: 

I'm not familiar with the tools available in C#, but your question seems to be mostly one about storage rather than C#.

How you store a directed graph (a DAG or otherwise) will no doubt depend on the use cases you have in mind. You can represent it in two tables in a relational database, for example, where one table holds information about the nodes (A, B, C, etc.) and another holds information about the edges between the nodes (A -> B, A -> C, etc.).
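One use case worth checking early is traversal, for example "which nodes can be reached from A?". With an edges table this can be done in SQL using a recursive common table expression. The sketch below uses SQLite through Python so it is runnable as-is; the `edges` table, its columns, and the `descendants` helper are assumptions for the example, and other databases spell recursive queries slightly differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edges (from_node TEXT NOT NULL, to_node TEXT NOT NULL);
INSERT INTO edges VALUES ('A','B'), ('A','C'), ('B','D'), ('C','D'), ('E','A');
""")

def descendants(conn, start):
    """Return every node reachable from `start` by following edges forward."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT to_node FROM edges WHERE from_node = ?
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT e.to_node FROM edges e JOIN reach r ON e.from_node = r.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]

from_a = descendants(conn, "A")
```

Here D is reachable from A by two paths but appears only once in the result.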

A graph database such as Neo4j might also be a good way to go.

Your mileage will vary when it comes to things like horizontal scaling and concurrent access, depending on the approach you adopt. You may want to keep a denormalized representation around for speeding up some kinds of queries, but such a strategy involves tradeoffs you'll want to understand before going with it.
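One possible denormalized representation, offered only as an illustration, is a precomputed reachability table (often called a closure table) kept next to the edges table. Reachability checks become a single indexed lookup, but the table has to be rebuilt or patched whenever an edge changes, which is the kind of tradeoff meant above. The `edges` and `closure` tables and the helper names are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edges (from_node TEXT NOT NULL, to_node TEXT NOT NULL);
CREATE TABLE closure (
    ancestor   TEXT NOT NULL,
    descendant TEXT NOT NULL,
    PRIMARY KEY (ancestor, descendant)
);
INSERT INTO edges VALUES ('A','B'), ('B','C');
""")

def rebuild_closure(conn):
    """Recompute the denormalized reachability table from the edges table."""
    conn.execute("DELETE FROM closure")
    conn.execute("""
        WITH RECURSIVE reach(ancestor, descendant) AS (
            SELECT from_node, to_node FROM edges
            UNION
            SELECT r.ancestor, e.to_node
            FROM reach r JOIN edges e ON e.from_node = r.descendant
        )
        INSERT INTO closure (ancestor, descendant)
        SELECT ancestor, descendant FROM reach
    """)
    conn.commit()

def is_reachable(conn, src, dst):
    """Single lookup against the precomputed table, no traversal at query time."""
    row = conn.execute(
        "SELECT 1 FROM closure WHERE ancestor = ? AND descendant = ?", (src, dst)
    ).fetchone()
    return row is not None

rebuild_closure(conn)
a_to_c = is_reachable(conn, "A", "C")

# Writes are where the cost shows up: the closure is stale until rebuilt.
conn.execute("INSERT INTO edges VALUES ('C','D')")
stale = is_reachable(conn, "A", "D")
rebuild_closure(conn)
fresh = is_reachable(conn, "A", "D")
```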

Eric W.