views:

143

answers:

2

I am searching for a non-recursive depth first search algorithm on graphs in Pascal (Delphi).

I need DFS for computing strongly or bi-connected components of large graphs. Currently I am using a recursive variant of the algorithm: http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm

The problem is that for such algorithm I must define a large amount of memory to be used for a stack and that makes later problems in Windows 7, where Open and Save Dialogs do not work because of several threads generated....

So again: I do not see how to rewrite the Tarjan DFS algorithm to work without recursion. Do you have any suggestion - or point to a non recursice algorithm for depth first search on graphs?

Thanks.

A: 

While I don't have access to your data set, it's rather common to have accidental incorrect recursion that doesn't find the base case in all the correct circumstances, or else there may be a cycle in your graph. I would check these two things before moving on. For instance, are there more function descents than you have nodes in the tree?

Barring that, your data set might just be so large as to be overflowing the process stack. If this is the case, I would recommend writing an iterative version of the algorithm that uses the stack that you provide to it. The stack should live in heap-space, not stack space. You'll need to keep the context of the search on your own rather than letting the algorithm do it.

This algorithm is a recursive algorithm. Period. You don't need to write a function that calls itself, but in the end you'll still need to keep track of where you've been, and the order your visited nodes.

San Jacinto
He needs to make it non-recursive to avoid excessive stack usage. I.e. he needs to use an explicit stack. That is, his question is *how* to do exactly what you suggest he does.
Barry Kelly
@Barry I assume that if a person understands recursion that they understand how to make a stack complete with push/pop operations. I don't think he may have understood that you usually run out of process stack space before heap space, in addition to the fact that mathematically, this IS A RECURSIVE ALGORITHM. There is no escaping it. To me, this was the missing key. And since he hasn't given correspondence in 4 days, it's a little tough to give more information based off of his feedback. I'm sorry you disagree, and I hope the downvote made you feel better about things.
San Jacinto
Barry, thanks. That is actually what I needed,
Petra
@Petra Then why not ask specifically for it, at least after you have an "answer" posted? :)
San Jacinto
He was running out of virtual address space, because he had (in another question) dialed up the stack reservation towards 60M, and was then hitting problems with threads created by the shell controls in the FileOpen dialog etc.
Barry Kelly
Also, the term **recursion** is ambiguous here; it may refer to the algorithmic notion (which may or may not be inherent to the problem), or it may refer to the implementation-level notion of using the implicit call stack to store state. "large amount of memory" etc. should be a clue that he's looking to eliminate the second notion of recursion, not the first. Pointing out the first is just pedantry.
Barry Kelly
@Petra - you should accept answers by selecting the big checkmark to the left of the answer, below the score. I'd advise you to read the FAQ.
Barry Kelly
@Barry You are correct, it is pedantry in some cases. However, I didn't see the original question that you refer to, and I don't know his programming skills. Also, he never shares _how_ he is delcaring is stack: on the process stack or on the heap? Coupling this unknown with the fact that an iterative version of the same algorithm will consume the same amount of memory (asymptotically speaking), I don't agree with how you expect me to be clairvoyant in regard to knowing exactly what he was searching for. Oh well, to each his own. Enjoy your day.
San Jacinto
+1  A: 

The algorithm as described on Wikipedia looks to reasonably easily be made non-recursive with an explicit stack. Starting out with that (included here for reference, in case Wikipedia changes):

Input: Graph G = (V, E)

index = 0                                         // DFS node number counter 
S = empty                                         // An empty stack of node
for all v in V do
  if (v.index is undefined)                       // Start a DFS at each node
    tarjan(v)                                     // we haven't visited yet

procedure tarjan(v)
  v.index = index                                 // Set the depth index for v
  v.lowlink = index                               // SHOULD BE v.lowlink = MAX_INT?
  index = index + 1
  S.push(v)                                       // Push v on the stack
  for all (v, v') in E do                         // Consider successors of v
    if (v'.index is undefined)                    // Was successor v' visited?
        tarjan(v')                                // Recurse
        v.lowlink = min(v.lowlink, v'.lowlink)
    else if (v' is in S)                          // Was successor v' in stack S? 
        v.lowlink = min(v.lowlink, v'.index )     // v' is in stack but it isn't in the dfs tree
  if (v.lowlink == v.index)                       // Is v the root of an SCC?
    print "SCC:"
    repeat
      v' = S.pop                                  
      print v'
    until (v' == v)

Step 1: Remove loops containing recursion, adding labels and gotos. This is necessary to make loop variables explicit, savable and restorable (needed during recursion-simulation with stacks). A label needs to be added after tarjan()'s return, as we'll jump to it in a moment.

procedure tarjan(v)
  v.index = index                                 // Set the depth index for v
  v.lowlink = index                               // SHOULD BE v.lowlink = MAX_INT?
  index = index + 1
  S.push(v)                                       // Push v on the stack
  succ = all (v, v') in E      // Consider successors of v
  succIndex = 0                // presume succ is 0-based
loop_top:
  if succIndex >= Length(succ) goto skip_loop
  v' = succ[succIndex]
  if (v'.index is undefined)                    // Was successor v' visited?
      tarjan(v')                                // Recurse
recursion_returned:
      v.lowlink = min(v.lowlink, v'.lowlink)
  else if (v' is in S)                          // Was successor v' in stack S? 
      v.lowlink = min(v.lowlink, v'.index )     // v' is in stack but it isn't in the dfs tree
  succIndex = succIndex + 1
  goto loop_top
skip_loop:
  if (v.lowlink == v.index)                       // Is v the root of an SCC?
    print "SCC:"
    repeat
      v' = S.pop                                  
      print v'
    until (v' == v)

Step 2: Introduce a stack which contains all the relevant state for storing our position and computation in the loop at any point where we may be returning from recursion, or starting out at the top of the loop.

The stack:

T = empty // T will be our stack, storing (v, v', succ, succIndex, state)

state is an enumeration (TopState, ReturnedState) encoding the location in the procedure. Here's the procedure rewritten to use this stack and state rather than recursion:

procedure tarjan(v)
  while (T is not empty) do
    (v, v', succ, succIndex, state) = T.pop
    case state of
      TopState: goto top
      ReturnedState: goto recursion_returned
    end case
top:
    v.index = index                                 // Set the depth index for v
    v.lowlink = index                               // SHOULD BE v.lowlink = MAX_INT?
    index = index + 1
    S.push(v)                                       // Push v on the stack
    succ = all (v, v') in E      // Consider successors of v
    succIndex = 0                // presume succ is 0-based
loop_top:
    if succIndex >= Length(succ) goto skip_loop
    v' = succ[succIndex]
    if (v'.index is undefined)                    // Was successor v' visited?
      // instead of recursing, set up state for return and top and iterate
      T.push(v, v', succ, succIndex, ReturnedState) // this is where we return to
      T.push(v', empty, empty, empty, TopState) // but this is where we go first
      continue // continue the while loop at top
recursion_returned:
      v.lowlink = min(v.lowlink, v'.lowlink)
    else if (v' is in S)                          // Was successor v' in stack S? 
      v.lowlink = min(v.lowlink, v'.index )     // v' is in stack but it isn't in the dfs tree
    succIndex = succIndex + 1
    goto loop_top
skip_loop:
    if (v.lowlink == v.index)                       // Is v the root of an SCC?
    print "SCC:"
    repeat
      v' = S.pop                                  
      print v'
    until (v' == v)

Step 3: Finally, we need to make sure the entry conditions are correct, for the top-level code which calls tarjan. That can easily be done by an initial push:

procedure tarjan(v)
  T.push(v, empty, empty, empty, TopState)
  while (T is not empty) do
    (v, v', succ, succIndex, state) = T.pop
    case state of
      TopState: goto top
      ReturnedState: goto recursion_returned
    end case
top:
    v.index = index                                 // Set the depth index for v
    v.lowlink = index                               // SHOULD BE v.lowlink = MAX_INT?
    index = index + 1
    S.push(v)                                       // Push v on the stack
    succ = all (v, v') in E      // Consider successors of v
    succIndex = 0                // presume succ is 0-based
loop_top:
    if succIndex >= Length(succ) goto skip_loop
    v' = succ[succIndex]
    if (v'.index is undefined)                    // Was successor v' visited?
      // instead of recursing, set up state for return and top and iterate
      T.push(v, v', succ, succIndex, ReturnedState) // this is where we return to
      T.push(v', empty, empty, empty, TopState) // but this is where we go first
      continue // continue the while loop at top
recursion_returned:
      v.lowlink = min(v.lowlink, v'.lowlink)
    else if (v' is in S)                          // Was successor v' in stack S? 
      v.lowlink = min(v.lowlink, v'.index )     // v' is in stack but it isn't in the dfs tree
    succIndex = succIndex + 1
    goto loop_top
skip_loop:
    if (v.lowlink == v.index)                       // Is v the root of an SCC?
    print "SCC:"
    repeat
      v' = S.pop                                  
      print v'
    until (v' == v)

It could also be done by a jump, jumping immediately to top. The code can be further cleaned up, perhaps converted to use a while or repeat loop to eliminate some of the gotos, etc., but the above should be at least functionally equivalent, eliminating the explicit recursion.

Barry Kelly