views:

489

answers:

2

I have written a code that solves MST using Prim method. I read that this kind of implementation(using priority queue) should have O(E + VlogV) = O(VlogV) where E is the number of edges and V number of Edges but when I look at my code it simply doesn't look that way.I would appreciate it if someone could clear this up for me.

To me it seems the running time is this:

The while loop takes O(E) times(until we go through all the edges) Inside that loop we extract an element from the Q which takes O(logE) time. And the second inner loop takes O(V) time(although we dont run this loop everytime it is clear that it will be ran V times since we have to add all the vertices )

My conclusion would be that the running time is: O( E(logE+V) ) = O( E*V ).

This is my code:

#define p_int pair < int, int >

int N, M; //N - nmb of vertices, M - nmb of edges
int graph[100][100] = { 0 }; //adj. matrix
bool in_tree[100] = { false }; //if a node if in the mst
priority_queue< p_int, vector < p_int >, greater < p_int > > Q; 
/*
keeps track of what is the smallest edge connecting a node in the mst tree and
a node outside the tree. First part of pair is the weight of the edge and the 
second is the node. We dont remember the parent node beaceuse we dont need it :-)
*/

int mst_prim()
{
 Q.push( make_pair( 0, 0 ) );

 int nconnected = 0;
 int mst_cost = 0;
 while( nconnected < N )
 {
      p_int node = Q.top(); Q.pop();

      if( in_tree[ node.second ] == false )
      {
           mst_cost += node.first;
           in_tree[ node.second ] = true;

           for( int i = 0; i < N; ++i )
                if( graph[ node.second ][i] > 0 && in_tree[i]== false )
                     Q.push( make_pair( graph[ node.second ][i], i ) );

           nconnected++;
      }
 }

 return mst_cost;
}
A: 

I did not have to deal with the algorithm before, but what you have implemented does not match the algorithm as explained on Wikipedia. The algorithm there works as follows.

  1. But all vertices into the queue. O(V)
  2. While the queue is not empty... O(V)
    1. Take the edge with the minimum weight from the queue. O(log(V))
    2. Update the weights of adjacent vertices. O(E / V), this is the average number of adjacent vertices.
    3. Reestablish the queue structure. O(log(V))

This gives

        O(V) + O(V) * (O(log(V)) + O(V/E))
      = O(V) + O(V) * O(log(V)) + O(V) * O(E / V)
      = O(V) + O(V * log(V)) + O(E)
      = O(V * log(V)) + O(E)

exactly what one expects.

Daniel Brückner
I'm not quite sure about inserting all the vertices at the begining...How would one extract the vertex which is of the smallest weight and connects to the current mst(which we are building currently)?
sklitzz
The priority of each vertex is the cost of connecting the vertex with the current spanning tree - this is the minimum weight of all edges connecting the vertex with the current tree or infinity if there is no edge. All this values are initialized with infinity and updated each time a vertex is moved from the queue to the tree.
Daniel Brückner
+2  A: 

You can use adjacency lists to speed your solution up (but not for dense graphs), but even then, you are not going to get O(V log V) without a Fibonacci heap..

Maybe the Kruskal algorithm would be simpler for you to understand. It features no priority queue, you only have to sort an array of edges once. It goes like this basically:

  • Insert all edges into an array and sort them by weight
  • Iterate over the sorted edges, and for each edge connecting nodes i and j, check if i and j are connected. If they are, skip the edge, else add the edge into the MST.

The only catch is to be quickly able to say if two nodes are connected. For this you use the Union-Find data structure, which goes like this:

int T[MAX_#_OF_NODES]

int getParent(int a)
{
  if (T[a]==-1)return a;
  return T[a]=getParent(T[a]);
}
void Unite(int a,int b)
{
  if (rand()&1)
     T[a]=b;
  else
     T[b]=a;
}

In the beginning, just initialize T to all -1, and then every time you want to find out if nodes A and B are connected, just compare their parents - if they are the same, they are connected (like this getParent(A)==getParent(B)). When you are inserting the edge to MST, make sure to update the Union-Find with Unite(getParent(A),getParent(B)).

The analysis is simple, you sort the edges and iterate over using the UF that takes O(1). So it is O(E logE + E ) which equals O(E log E).

That is it ;-)

supo
I know Kruskals algorithm, but I wanted to understand this one also :-). If I use adjacency lists for Prim then I get: The while + for loop loops over all edges and inserts them into a heap. This should be then E*log(E). Is this the best complexity you can get with this approach(using a heap, not Fibbonaci heap)?
sklitzz
Yeah, you check each edge at most twice (from both nodes) and the queue has E edges at maximum, which results in O(E log E). But we do not write it like that, because constant factors are irrelevant in O() notation. So O(E log E) = O(E log (V^2)) = O(E * 2 log V) = O(E log V).
supo
This(the comment above) is the answer that I wanted, thank you I understand now :-)
sklitzz