I'm looking for a good data structure to build equivalence classes on nodes of a tree. In an ideal structure, the following operations should be fast (O(1)/O(n) as appropriate) and easy (no paragraphs of mystery code):
- (A) Walk the tree from the root; on each node --> child transition enumerate all the equivalent versions of the child node
- (B) Merge two equivalence classes
- (C) Create new nodes from a list of existing nodes (the children) and other data
- (D) Find any nodes structurally equivalent to node (i.e. they have the same number of children, corresponding children belong to the same equivalence class, and their "other data" is equal) so that new (or newly modified) nodes may be put in the right equivalence class (via a merge)
So far I've considered (some of these could be used in combination):
- A parfait, where the children are references to collections of nodes instead of to nodes. (A) is fast, (B) requires walking the tree and updating nodes to point to the merged collection, (C) requires finding the collection containing each child of the new node, (D) requires walking the tree
- Maintaining a hash of nodes by their characteristics. This makes (D) much faster but (B) slower (since the hash would have to be updated when equivalence classes were merged)
- String the nodes together into a circular linked list. (A) is fast, (B) would be fast but for the fact that that "merging" part of a circular list with itself actually splits the list (C) would be fast, (D) would require walking the tree
- Like above, but with an additional "up" pointer in each node, which could be used to find a canonical member of the circular list.
Am I missing a sweet alternative?