



I'm trying to implement an iterative version of Tarjan's strongly connected components (SCCs), reproduced here for your convenience (source:

Input: Graph G = (V, E)

index = 0                         // DFS node number counter 
S = empty                         // An empty stack of nodes
forall v in V do
  if (v.index is undefined)       // Start a DFS at each node
    tarjan(v)                     // we haven't visited yet

procedure tarjan(v)
  v.index = index                 // Set the depth index for v
  v.lowlink = index
  index = index + 1
  S.push(v)                       // Push v on the stack
  forall (v, v') in E do          // Consider successors of v
    if (v'.index is undefined)    // Was successor v' visited?
        tarjan(v')                // Recurse
        v.lowlink = min(v.lowlink, v'.lowlink)
    else if (v' is in S)          // Was successor v' in stack S? 
        v.lowlink = min(v.lowlink, v'.lowlink )
  if (v.lowlink == v.index)       // Is v the root of an SCC?
    print "SCC:"
    repeat
      v' = S.pop
      print v'
    until (v' == v)
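
For reference, here is a minimal C++ sketch of the same recursive procedure. It assumes 0-based integer node ids and a graph stored as an adjacency list; the names RecursiveTarjan and tarjanRecursive are illustrative and are not taken from the project code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper type: bundles the global state (index, S) of the pseudocode.
struct RecursiveTarjan {
    const std::vector<std::vector<int>>& adj;  // adj[v] = successors of v
    std::vector<int> index, lowlink;           // -1 means "undefined"
    std::vector<bool> onStack;                 // O(1) answer to "is v' in S?"
    std::vector<int> S;                        // the node stack
    std::vector<std::vector<int>> sccs;        // result: one vector per SCC
    int counter = 0;                           // DFS node number counter

    explicit RecursiveTarjan(const std::vector<std::vector<int>>& g)
        : adj(g), index(g.size(), -1), lowlink(g.size(), -1),
          onStack(g.size(), false) {}

    void visit(int v) {
        index[v] = lowlink[v] = counter++;
        S.push_back(v);
        onStack[v] = true;
        for (int w : adj[v]) {
            assert(w >= 0 && w < static_cast<int>(adj.size()));
            if (index[w] == -1) {              // successor not visited yet: recurse
                visit(w);
                lowlink[v] = std::min(lowlink[v], lowlink[w]);
            } else if (onStack[w]) {           // successor is on the stack
                // Uses index[w] here, as the iterative code in the question does.
                lowlink[v] = std::min(lowlink[v], index[w]);
            }
        }
        if (lowlink[v] == index[v]) {          // v is the root of an SCC
            std::vector<int> scc;
            int w;
            do {
                w = S.back();
                S.pop_back();
                onStack[w] = false;
                scc.push_back(w);
            } while (w != v);
            sccs.push_back(scc);
        }
    }
};

std::vector<std::vector<int>> tarjanRecursive(const std::vector<std::vector<int>>& adj) {
    RecursiveTarjan t(adj);
    for (int v = 0; v < static_cast<int>(adj.size()); ++v)
        if (t.index[v] == -1) t.visit(v);      // start a DFS at each unvisited node
    return t.sccs;
}
```

The recursion depth equals the length of the longest DFS path, which is why very large graphs can exhaust the call stack.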

My iterative version uses the following Node struct.

struct Node {
    int id; //Signed int up to 2^31 - 1 = 2,147,483,647
    int index;
    int lowlink;        
    Node *caller;                    //If you were looking at the recursive version, this is the node before the recursive call
    unsigned int vindex;             //Equivalent to the iterator in the for-loop in tarjan
    vector<Node *> *nodeVector;      //Vector of adjacent Nodes 
};

Here's what I did for the iterative version:

void Graph::runTarjan(int out[]) {  //You can ignore out. It's a 5-element array that keeps track of the largest 5 SCCs
    int index = 0;
    tarStack = new stack<Node *>();
    onStack = new bool[numNodes]();  //The trailing () value-initializes every entry to false
    for (int n = 0; n < numNodes; n++) {
        if (nodes[n].index == unvisited) {
            tarjan_iter(&nodes[n], index);
        }
    }
}

void Graph::tarjan_iter(Node *u, int &index) {
    u->index = index;
    u->lowlink = index;
    index++;
    u->vindex = 0;
    tarStack->push(u);
    u->caller = NULL;           //Equivalent to the node from which the recursive call would spawn.
    onStack[u->id - 1] = true;
    Node *last = u;
    while(true) {
        if(last->vindex < last->nodeVector->size()) {       //Equivalent to the check in the for-loop in the recursive version
            Node *w = (*(last->nodeVector))[last->vindex];
            last->vindex++;                                   //Equivalent to incrementing the iterator in the for-loop in the recursive version
            if(w->index == unvisited) {
                w->caller = last;
                w->vindex = 0;
                w->index = index;
                w->lowlink = index;
                index++;
                tarStack->push(w);
                onStack[w->id - 1] = true;
                last = w;
            } else if(onStack[w->id - 1] == true) {
                last->lowlink = min(last->lowlink, w->index);
            }
        } else {  //Equivalent to the nodeSet iterator pointing to end()
            if(last->lowlink == last->index) {
                Node *top = tarStack->top();
                tarStack->pop();
                onStack[top->id - 1] = false;
                int size = 1;

                while(top->id != last->id) {
                    top = tarStack->top();
                    tarStack->pop();
                    onStack[top->id - 1] = false;
                    size++;
                }
                insertNewSCC(size);  //Ranks the size among array of 5 elements
            }

            Node *newLast = last->caller;   //Go up one recursive call
            if(newLast != NULL) {
                newLast->lowlink = min(newLast->lowlink, last->lowlink);
                last = newLast;
            } else {   //We've seen all the nodes
                break;
            }
        }
    }
}

My iterative version runs and gives me the same output as the recursive version. The problem is that the iterative version is slower, and I'm not sure why. Can anyone give me some insight on my implementation? Is there a better way to implement the recursive algorithm iteratively?


PS Please let me know whether this question is appropriate for this forum.

+3  A: 

A recursive algorithm uses the call stack as its storage area. In the iterative version, you use some vectors, which themselves rely on heap allocation. Stack-based allocation is known to be very fast, since it is only a matter of moving an end-of-stack pointer, whereas heap allocation may be substantially slower. That the iterative version is slower is therefore not entirely surprising.

Generally speaking, if the problem at hand fits well within a stack-only recursive model, then, by all means, recurse.
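
If recursion is not an option, here is a rough sketch of an iterative version that keeps heap traffic low: every working array is allocated once, up front, so the main loop never allocates. It assumes 0-based node ids and a vector-of-vectors adjacency list; sccSizesIterative, callStack and next are illustrative names, not part of the code in the question. Note that std::stack is backed by std::deque by default, which allocates in chunks as it grows. Whether that matters in this case would have to be measured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the size of every SCC. adj[v] lists the successors of v (0-based ids).
std::vector<int> sccSizesIterative(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    const int UNVISITED = -1;
    std::vector<int> index(n, UNVISITED), lowlink(n, 0);
    std::vector<std::size_t> next(n, 0);   // per-node edge cursor (plays the role of vindex)
    std::vector<char> onStack(n, 0);
    std::vector<int> tarStack, callStack, sizes;
    tarStack.reserve(n);                   // one allocation each, done up front
    callStack.reserve(n);
    int counter = 0;

    for (int root = 0; root < n; ++root) {
        if (index[root] != UNVISITED) continue;
        callStack.push_back(root);
        while (!callStack.empty()) {
            int v = callStack.back();
            if (index[v] == UNVISITED) {           // first time we "enter" v
                index[v] = lowlink[v] = counter++;
                tarStack.push_back(v);
                onStack[v] = 1;
            }
            bool descended = false;
            while (next[v] < adj[v].size()) {
                int w = adj[v][next[v]++];
                assert(w >= 0 && w < n);
                if (index[w] == UNVISITED) {       // simulate the recursive call
                    callStack.push_back(w);
                    descended = true;
                    break;
                }
                if (onStack[w]) lowlink[v] = std::min(lowlink[v], index[w]);
            }
            if (descended) continue;

            // All successors handled: this is the code after the for-loop.
            if (lowlink[v] == index[v]) {          // v is the root of an SCC
                int size = 0, w;
                do {
                    w = tarStack.back();
                    tarStack.pop_back();
                    onStack[w] = 0;
                    ++size;
                } while (w != v);
                sizes.push_back(size);
            }
            callStack.pop_back();                  // "return" to the caller
            if (!callStack.empty()) {
                int parent = callStack.back();
                lowlink[parent] = std::min(lowlink[parent], lowlink[v]);
            }
        }
    }
    return sizes;
}
```

The explicit callStack replaces the caller pointer stored in each node, and both stacks are contiguous arrays that never hold more than one entry per node.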

Thomas Pornin
The issue I have is that the recursive version causes a stack overflow on very large graphs (900,000 nodes, 5,000,000 edges). I know that I can change the stack size, and when I set it to unlimited, it works on these large graphs; however, I'm not allowed to change the stack size as this is being written for a project in a class.
Teef L
+1. Just make sure it fits on the stack :)
Billy ONeal
I agree that it's not necessarily surprising that the iterative version is slower. But apparently it's surprising to the teaching assistants in this class, and they have suggested that my implementation might be wrong? I know it's not wrong because it correctly outputs the right SCCs for the 10 sample graphs that they provided. I guess I should have written that my question is basically, is there something inherently inefficient in my implementation? If so, how should I change it?
Teef L
Also, the stack for this project is limited to 10MB. My implementation will be run on arbitrarily large graphs, sizes of which are not known to me. Won't I hit a limit on how much I can reduce what's put on the stack? >_>
Teef L
@mathee, If you want to support arbitrary data sizes, you will hit the limit eventually. Thus the comments a la *"if it fits on the stack"*.
Georg Fritzsche
@gf, Ah, I see!
Teef L
It looks like your iterative code uses an "artificial stack". As such it should have the same computational complexity and memory coherency as the recursive algorithm. To optimise it, I would use a conventional profiling approach to find out if there are any unexpected hotspots.
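
Before reaching for a full profiler, a coarse wall-clock comparison of the two versions is a cheap first step. Here is a minimal sketch using std::chrono; bestOfMs is a hypothetical helper name, and taking the fastest of several runs is only one of several reasonable ways to reduce noise.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls f() `repeats` times and returns the fastest run in milliseconds.
template <typename F>
double bestOfMs(F&& f, int repeats = 5) {
    assert(repeats > 0);
    double best = -1.0;
    for (int i = 0; i < repeats; ++i) {
        auto start = std::chrono::steady_clock::now();
        f();
        auto stop = std::chrono::steady_clock::now();
        double ms = std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) best = ms;   // keep the fastest run
    }
    return best;
}
```

The callable passed in has to reset the graph's per-node state (index, lowlink, vindex) before each run, otherwise every run after the first finds all nodes already visited and measures nothing.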
This algo is supposed to be linear in terms of # of nodes and edges. I would call that fast. Are you doing something wrong perhaps?
Hamish Grubijan