Today, I had discussion with someone about Kruskal Minimum Spanning Tree algorithm because of page 13 of this slide.
The author of the presentation said that if we implement disjoint sets using (doubly) linked list, the performance for Make and Find will be O(1) and O(1) respectively. The time for operation Union(u,v) is min(nu,nv), where nu and nv are the sizes of the sets storing u and v.
I said that we can improve the time for the Union(u,v) to be O(1) by making the representation pointer of each member pointing a locator that contains the pointer to the real representation of the set.
In Java, the data structure would look like this :
class DisjointSet {
LinkedList<Vertex> list = new LinkedList<Vertex>(); // for holding the members, we might need it for print
static Member makeSet(Vertex v) {
Member m = new Member();
DisjointSet set = new DisjointSet();
m.set = set;
set.list.add(m);
m.vertex = v;
Locator loc = new Locator();
loc.representation = m;
m.locator = loc;
return m;
}
}
class Member {
DisjointSet set;
Locator locator;
Vertex vertex;
Member find() {
return locator.representation;
}
void union(Member u, Member v) { // assume nv is less than nu
u.set.list.append(v.set.list); // hypothetical method, append a list in O(1)
v.set = u.set;
v.locator.representation = u.locator.representation;
}
}
class Locator {
Member representation;
}
Sorry for the minimalistic code. If it can be made this way, than running time for every disjoint set operation (Make,Find,Union) will be O(1). But the one whom I had discussion with can't see the improvement. I would like to know your opinion on this.
And also what is the fastest performance of Find/Union in various implementations? I'm not an expert in data structure, but by quick browsing on the internet I found out there is no constant time data structure or algorithm to do this.