views:

61

answers:

2

I have a directed graph data structure, where I am trying to implement individual version control for each vertex. This creates some interesting scenarios, and I would much appreciate any ideas that you guys have. Specifically, I am looking to resolve the default behavior of the system when encountering the stated scenarios.

See the following image: Graph versions

Scenario 1: "The Null Pointer Paradox"

Vertex A is rolled back to version 1.0. Since this rollback will cascade down its subgraph, C will no longer be pointing to D. This could create a hazard. Should the behavior be to:

  • 1.1: Delete the edge C -> D, creating a broken graph
  • 1.2: Delete D, leaving E orphaned
  • 1.3: Delete D and E
  • 1.4: Refuse to perform the rollback before all edges pointing to D (which would be E -> D in this case) are deleted
  • 1.X: Alternative solutions?

Scenario 2: "Indirect Effects"

Vertex D is updated, so that the following holds:

  • D is now version 1.2
  • E is now version 1.1
  • C is now version 1.3
  • A is now version 1.3

Vertex A is now rolled back to version 1.2, so that the following holds:

  • A is now version 1.2
  • C is now version 1.2
  • D is now version 1.1

Should the default behavior be to:

  • 2.1: Roll back E to 1.0
  • 2.2: Refuse to roll back due to version hazard, in effect impairing functionality
  • 2.X: Alternative solutions?
+2  A: 

What you are handling is a very complex problem, and although I am not aware of focalized research projects specifically on this regard, I heard some attempts to manage the issue. it's basically impossible to pull out a quick and dirty solution. I can point you to previous questions I asked on this regard (graphs and version control):

What I ended up doing was refusing to perform any kind of revision control, but instead created new nodes with new identities every time. I lose rollback, I lose any kind of tracking, but I kept the thing manageable.

In practice, the only thing you could do is to have revision control for the graph as a whole. It's very hard to have it for individual nodes. All the issues you describe arise from the fact that you are crossing-through transaction layers, where in each layer you have a consistent graph as formed in a specific moment in time. If you cross-through these layers, allowing vertex revision control, you are dealing with a can of worms, Dune sized.

Stefano Borini
+1  A: 

It seems to me that there is some confusion about the granularity here. If you only version individual vertices but not the graph, then rolling back an individual vertex should not affect the rest of the graph. If, OTOH, you want the whole graph to be rolled back, then you should also version the whole graph.

The problem is that if you only version individual vertices, then you only have integrity guarantees for individual vertices, but not for the graph as a whole. So, if, as you describe it, rolling back an individual vertex "ripples through" the whole graph (or at least the connected subgraph), then you are not guaranteed to end up in a consistent state.

The research that seems to be closest to what you are trying, is about version control for XML, which, however, only deals with strongly typed trees (IOW degenerate graphs), not general graphs.

Jörg W Mittag
Apologies for not being more explicit.
Martin Källman
It is assumed here that edges belong to and are versioned with the vertex. Hence, by extension, a subgraph root node, such as A, is versioned according to the changes in its entire subgraph.As pointed out, this in effect creates not only nested, but multi-dimensional integrity planes (or layers), which can intersect. It is clear at this point that any possible solution to this problem is beyond my personal ability in mathematics.
Martin Källman
The only viable solutions I can see at this point are: A) Only version vertexes disregarding it's sub- and supergraphs. Essentially, value but not structure B) Downgrade to a tree structure C) A compromise, wherein version control contexts are explicitly set for any subgraph, but does not allow nested context; or, where a nested context is ignored by a higher-level context, thus somewhat decoupling the data structure and the version control structureWould you be able to point me to the research you mentioned?
Martin Källman