views:

373

answers:

4

I am using depth-first search to identify paths in a directed weighted graph, while revisiting nodes that belong to a cycle, and setting cutoff conditions based on total distance traveled, or stops from the source node.

As I understand, with recursion an explicit stack structure is not required for depth first search, so I was wondering if I could further simplify my code below by somehow doing without the explicit stack:

public class DFSonWeightedDirectedGraph {

    private static final String START = "A";
    private static final String END = "E";
    private int pathLength = 0;
    private int stops = 0;

    public static void main(String[] args) {
        //this is a directed weighted graph
        WeightedDirectedGraph graph = new WeightedDirectedGraph();
        graph.addEdge("A", "B", 15);
        graph.addEdge("A", "D", 15);
        graph.addEdge("A", "E", 27);
        //(...) more edges added


        Stack<String> visited = new Stack<String>();        
        visited.push(START);

        new DFSonWeightedDirectedGraph().depthFirst(graph, visited);
    }

    private void depthFirst(WeightedDirectedGraph graph, Stack<String> visited) {

        Collection<Map.Entry<String, Integer>> tree_of_children 
            = graph.get_tree_of_children(visited.peek());

        for (Map.Entry<String, Integer> child : tree_of_children) {


            if(pathLength + child.getValue()>= 20){
                continue;
            }       


            visited.push(child.getKey());
            pathLength += child.getValue();
            stops += 1;         

            if (child.getKey().equals(END)) {
                printPath(visited);
            }

            depthFirst(graph, visited); 
            visited.pop();  
            pathLength -= child.getValue();
            stops -= 1; 
        }
    }

    private void printPath(Stack<String> visited) {
        for (String node : visited) {
            System.out.print(node);
            System.out.print(" ");
        }
        System.out.println("[path length: "+pathLength +
                " stops made: " + stops +"]");
    }
}

However, other recursive implementations without an explicit stack structure usually take into account already visited nodes, by coloring them white, gray or black. So, in my case where revisiting is allowed, and the path needs to be recorded, is an explicit stack absolutely required? Thanks for any suggestions of simpler alternatives.

+1  A: 

If you have to save the path, you need a data structure for this. Your stack is OK; you could replace it with another data structure, but not get rid of it.

If it would be OK to directly print the path (and not record it), you do not need a stack. Then you can change the method signature to just get the graph and the actual node (and perhaps the actual path length and the "stops").

Arne
Thanks that makes a lot of sense. So basically the reasoning is that if storage is required to save the path for further use, then regardless of the implementation extra space is required
denchr
A: 

Edit: This answer is completely off topic and was posted based on misinterpreting the question.

There are several things wrong with your DFS implementation. Yes, it visits all nodes in a depth-first manner and it does eventually manage to find a path between START and END, but it does not attempt to check for already visited nodes and keeps a stack for no real reason. The only reason you don't fall into infinite recursion on cycles is because you limit the maximum path length, and you will still take a long time on graphs that have multiple distinct paths between all pairs of vertices.

The only thing you are using the stack for is to pass the node to be visited next to the dfs function. You can simply get rid of the stack and pass the node directly.

So, instead of

private void depthFirst(WeightedDirectedGraph graph, Stack<String> visited) {
   ...
   visited.push(child);
   ...
   depthFirst(graph, visited);

You can simply write this as

private void depthFirst(WeightedDirectedGraph graph, String node) {
   ...
   //visited.push(child); <-- No longer needed
   ...
   depthFirst(graph, child);

You are using a data structure (stack) that you have named 'visited' and yet you do not use that to store/mark which nodes have been already visited to avoid revisiting.

You can modify your existing code to have a Set called visited (make it a global/class variable or pass it along recursive calls as you did with your stack) where you keep all nodes already visited and only call depthFirst() on those nodes that are not already in that Set.

That should make your code look something like this

private void depthFirst(WeightedDirectedGraph graph, String node, Set<String> visited) {
   visited.add(node); // mark current node as visited
   ...
   //visited.push(child); <-- No longer needed
   ...
   if (!visited.contains(child)){ // don't visit nodes we have worked on already
      depthFirst(graph, child);
   }

So far my answer has been to try to modify your code to make it work. But it appears to me that you need to get a better grasp of what a DFS actually is and how it really works. Reading up the relevant chapter on any good Algorithm/Graph Theory book would help you greatly. I would recommend CLRS (it has a very nice chapter on simple graph traversals), but any good book should do. A simple and correct recursive DFS can be implemented in a much simpler manner using arrays, without having to resort to stacks or sets.

Edit: I did not mention how you could retrieve the path after replacing the stack. This can be easily done by using a Map that stores the parent of each node as it is explored. The path (if any is found) can be obtained using a recursive printPath(String node) function, that prints the node passed to it and calls itself again on its parent.

MAK
The problem context in which I am using this approach is mentioned throughout my post. I do *not* have a requirement for keeping track of visited nodes.
denchr
Keeping track of visited nodes is an vital part of DFS (unless you have a graph with too many nodes to manage), regardless of whether the rest of the program needs it or not. Otherwise you end up following lots of paths to the same nodes that are not really needed, unless you need to check all possible paths. That is the reason why I suggested going through CLRS. Right now, you are just avoiding cycles by limiting maximum path length. That's not really a solution, you can still run into problems with large graphs that have lots of paths between the same nodes. If you are happy with it, great.
MAK
Thanks for clarifying. By following the requirements I have to find the number of different paths from node X to X with a distance of less than a given number. So, the solutions are looking something like: XDX, XEBX, XEBXDX, XDXEBX, XDEBX, XEBXEBX, XEBXEBXEBX. Which means I have to dive into a cycle, for instance XEBX, but also be able to jump out of it, based on the cut-off condition. Would be very interested to know better approaches to the one I am taking now, i.e. allowing revisiting of nodes.
denchr
I see, in that case the solution you now have is pretty much what you need given your requirements. I missed that it is part of your requirements and not an artifact of your implementation. Sorry about that.
MAK
Sure, no worries, and thanks again. Actually, my initial question was about storing paths and if extra space in unavoidable. I guess my confusion was between storing a path to a data structure and replacing a stack with recursion for the body of the algorithm. So the core concept and storage of the paths are different from each other and can both use stacks explicitly. The former can be replaced by recursion, but the latter is needed if I need to save paths rather than just print the sequence of visited nodes to the console. Any further thoughts on that?
denchr
Nope, I really don't see any way getting rid of the stack.There does not seem to be any way of getting around the fact that you need all nodes in the current path in order to print it. A stack is the best data structure to use for the that. Actually, if you are really into saving space, I suggest using an explicit stack based DFS. By using recursion you are effectively using 2 stacks (one explicit for the path, the other used in the background for recursion).
MAK
This sounds promising and makes sense. Since I need to store the paths, I might as well use only one explicit stack with an iterative rather than recursive implementation, which will also keep track of the path. Will give that a try.
denchr
A: 

Just add an extra field to the node structure, that is a "visited" field. This will be fastest. You do have to unmark all nodes afterwards (or before you do the search).

Or, just hash the id of the node in a hashtable. This will be faster to check than a stack. If you don't have an id for the node, it is a good idea to create one, to help with debugging, output, etc.

You do need extra space, but adding a boolean field to each node will require the least space, since it will be 1 bit per node, vs. 1 pointer per node for a stack.

You don't really need a distance cut-off, since you are searching a finite graph and you only visit each node once, so you will visit at most N nodes in an N-node graph. You would need a depth cutoff if you were searching an infinite space, such as when doing a state-space search (an example is a prolog interpreter searching for a proof).

Larry Watanabe
A: 

You don't need the visited nodes. Just pass your current child node to the recursive method instead of the visited nodes parameter, and use the return value for carrying the path.

If you can process the path element by element, i.e. rewrite printPath() so that it can be called once per element just the key type is required as return type. If you want to receive the whole path you need a list of key values as return type.

Actually, you're relatively close to the solution. Just use the call stack of the recursive method calls to represent the path.

Wolfgang