views:

77

answers:

1
 1 Begin with a connected graph G containing edges of distinct weights, and an empty set of edges T
 2 While the vertices of G connected by T are disjoint:
 3   Begin with an empty set of edges E
 4   For each component:
 5     Begin with an empty set of edges S
 6     For each vertex in the component:
 7       Add the cheapest edge from the vertex in the component to another vertex in a disjoint component to S
 8     Add the cheapest edge in S to E
 9   Add the resulting set of edges E to T.
10 The resulting set of edges T is the minimum spanning tree of G.

From Wikipedia. I understand the outer loop is logV since you're joining sets. But then comes the internal loop.

If you use equivalence relations to keep track of the sets, that means you're only getting the element representing the set, so you can't determine the edge with the smallest weight between the two sets because you don't have all the elements. If you modify the structure to hold references to the children you still have to get all the children of each set. That means, worse case scenario, O(V/2) = O(V) for each set.

Afterwards, you still have to find the smallest edge connecting the two, which means going over all the edges connecting the two components. So you need to iterate over each node and see if its edge connects to an element in the other component, and if it does, if it's smaller than the minimum edge you currently have.

Meaning, an outer loop to iterate over the nodes and an inner loop to iterate over that nodes' edges - O(V*E). Since it's inside an O(logV) loop, you get O(logV*V*E).

Now, it seems as if you just have to iterate through all the edges, but how would you choose the minimum edge between the 2 components? I can tell if a given edge connects nodes in different components, but I can't tell which one connecting them has minimum weight. And if I get the one with the minimum weight, it might not connect them.

+1  A: 

If hash tables are allowed, then I see how it can be an O(Elog N) algorithm. Every component is stored as different hash set. Initially, each set contains a single node. The step of finding the minimum "bridges" for all components takes O(E) time, since we examine each edge at most twice, and we assume constant time lookup in the hash sets. Then we proceed by merging the sets, which takes O(N) time. Since the graph is connected, E>=N-1, so we have a total cost of O(E) per iteration.

--EDIT--

Following throwawayacct's comment,there is no need for hash structures at all. At each iteration we have a forest graph resulting from the previous iteration, so we can re-compute its connected components in O(E) time. This can be done for example by a simple DFS traversal from all nodes, that sets a unique "color" for each component. Then, when scanning the edges in order to find bridges, we only consider edges connecting nodes of different color.

Eyal Schneider
With O(|E|) time per phase, we can recompute the connected components from scratch every time – no hash table required.
@throwawayacct: thanks, you are right! I edited my response.
Eyal Schneider