views:

317

answers:

3

I just studied the disjoint set data structure and I know that it is also called "union-find data structures", union and find are two main operations of this data structure. We can can perform union on disjoint sets, similarly we can perform find operations; I want to know what other operations we can perform on disjoint sets except union and find.

+1  A: 

Disjoint sets structure is also called "union-find structure". So union, find and MakeSet operations should be supported anyway. Other operations are not what this structure is all about, and whether they're supported depends on implementation and the aims you have. Sometimes you'll need to choose particular implementation specifically to fit your project's needs of additional opertations.

Other than that it would be nice if we supported the other basic set-related operations. Let's enumerate them:

  • intersection of two sets. Since sets are disjoint, it's always empty unless these two sets coincide.
  • union of two sets -- supported out of the box.
  • get an element from the set -- supported, it's most likely the result of find.
  • delete an element from the set -- depends on implementation. When sets are implemented as forests, it's tricky and requires slower additional operations. When sets are implemented as linked lists, it's simple.
  • enumerate the set, i.e. iterate each element in the given set. This one depends on implementation again: for linked lists it's simple, for forest-like implementation it requires additional structures to support.
Pavel Shved
+2  A: 

With the vanilla union-find data structure, you cannot enumerate the actual set so many of the set operations are not available. This is because in the vanilla version you only have pointers in one direction --- in the picture below, every diagonal line corresponds to an arrow upwards, and the roots (top line) are the root objects for the sets:

     o [set1]         o [Set2]
    / \                \
   o   o                o
  /
 o

So there is no way to find all the objects of, say, set 1; you can't trace paths to them from the root node, for instance. You could have pointers downwards also, but this complicates the data structure significantly because an object can have an arbitrary number of parents in the data structure.

So it's really mostly for the following operations:

  • Answer to: Do my objects A and B belong to the same set or not?
  • Merge sets S1 and S2 so that all objects in the sets are now considered to belong to the same set

To elaborate this, the union-set data structure has very low time complexity for the operations it supports; both merging sets and answering the same-set query take amortized constant time (O(1)); because of this very time complexity, you can't support all the set operations. For example, a standard set representation is by a [binary] search tree, where most operations have O(log n) complexities at least.

antti.huima
A: 

The point of the union-find disjoint set structure is not so much about performing elementary set operations, as your question and other respondents seem to be suggesting. Instead, it's about being a highly efficient implementation of the structure that certain algorithms need. In particular, finding connected components and minimum spanning trees build their most efficient implementations on top of union-find disjoint sets.

Novelocrat