I have a DAG. I have this operation to add a edge between two nodes.
If A is reachable from B, then B is A's parent. If A is reachable from B without going though another node, then B is A's direct parent.
Requirements for this graph are:
- No cycles.
- For any node, there is a list of direct parents P[1],P[2],P[3]... P[i] is not a parent of P[j] for any i and j.
If adding a edge, requirement 1 is not met, the edge is not constructed. If adding a edge, requirement 2 is not met, the edge is constructed, but the direct parents will be modified in a way such that requirement 2 is met.
For example, there are 3 nodes
- A, direct parents: none
- B, direct parents: A
- C, direct parents: A
now, if I add a edge between B and C, we have
- C, direct parents: A,B
but A is a parent of B, doesn't meet requirement 2, thus A is removed from C's direct parent, and we have
- C, direct parents: B
Currently here is what I did: Add edge from A to B (this A become B's parent)
- Check if B is A's parent by BFS. If so, do not add the edge.(this make sure there is no cycles)
- Check if A is already B's parent by BFS. If so, do not add the edge.
- Find the intersection of A's parent with B's direct parent. This is done by find every parent of A though BFS. Remove the intersection from B's direct parent, and add A as a direct parent of B.(2 and 3 make sure it meet requirement 2)
This is slow. It breaks down at 5k node level(I'm looking for this to handle any graph with less than 100k nodes), the speed become not acceptable, 0.02 second for adding a node edge.
I have a feeling step 1 and 2 can be done in 1 step with some other algorithm.
I thought of using topological ordering, but it have to transverse the entire graph, which is the worst case for my step 1&2. The ordering will be disrupted when the new node is added. so I have to run a topological ordering every time for a insert, so it doesn't create any benefit. For step 3, the entire set of A's parents have to be found. The process is fairly slow, as on average it transverse a decent part of the graph.
How can I make this more efficient?