views:

217

answers:

4

I was just looking at Eric Lippert's simple implementation of an immutable binary tree, and I have a question about it. After showing the implementation, Eric states that

Note that another nice feature of immutable data structures is that it is impossible to accidentally (or deliberately!) create a tree which contains a cycle.

It seems that this feature of Eric's implementation does not come from the immutability alone, but also from the fact that the tree is built up from the leaves. This naturally prevents a node from having any of its ancestors as children. It seems that if you built the tree in the other direction, you'd introduce the possibility of cycles.

Am I right in my thinking, or does the impossibility of cycles in this case come from the immutability alone? Considering the source, I wonder whether I'm missing something.

EDIT: After thinking it over a bit more, it seems that building up from the leaves might be the only way to create an immutable tree. Am I right?

+7  A: 

If you're using an immutable data structure, in a strict (as opposed to lazy) language, it's impossible to create a cycle; as you must create the elements in some order, and once an element is created, you cannot mutate it to point at an element created later. So if you created node n, and then created node m which pointed at n (perhaps indirectly), you could never complete the cycle by causing n to point at m as you are not allowed to mutate n, nor anything that n already points to.

Yes, you are correct that you can only ever create an immutable tree by building up from the leaves; if you started from the root, you would have to modify the root to point at its children as you create them. Only by starting from the leaves, and creating each node to point to its children, can you construct a tree from immutable nodes.

Brian Campbell
@Brian, You could build an immutable tree starting from the root if each node pointed to its parent, instead of pointing to its children. (And each node would have to know if it was its parent's L or R child, in the case of a binary tree.)
LarsH
@LarsH I suppose you could do that, but you would never be able to traverse that tree in any useful sort of way.
Brian Campbell
@Brian on the contrary... Just keep a separate list of all nodes. (If your list must be immutable, build it up with cons as you go.) Granted you would find top-down traversal inefficient... but sometimes you don't care about that sort of traversal. Real-life example: your leaf nodes are indexed by value, and you need to be able to query their ancestry. Regardless of the application or usefulness, the point is, it's not true that you can only ever create an immutable tree by building up from the leaves. The question was a theory question, and in theory you can.
LarsH
+1 both Brian and Lars, well explained and well amended.
blesh
+1  A: 

You can't build it from the root, it requires you to mutate nodes you already added.

ergosys
@ergosys depends how you design your data structures... If each node has info about its parent instead of vice versa, you can build the tree from the root.
LarsH
@LarsH, true that, but it's a perfectly useless tree.
ergosys
@ergosys Not true. See comment to Brian.
LarsH
+1  A: 

When you say "built up from the leaves", I guess you're including the fact that the constructor takes children but never takes a parent.

It seems that if you built the tree in the other direction, you'd introduce the possibility of cycles.

No, because then you'd have the opposite constraint: the constructor would have to take a parent but never a child. Therefore you can never create a descendant until all its ancestors are created. Therefore no cycles are possible.

After thinking it over a bit more, it seems that building up from the leaves might be the only way to create an immutable tree. Am I right?

No... see my comments to Brian and ergosys.

For many applications, a tree whose child nodes point to their parents is not very useful. I grant that. If you need to traverse the tree in an order determined by its hierarchy, an upward-pointing tree makes that hard.

However for other applications, that sort of tree is exactly the sort we want. For example, we have a database of articles. Each article can have one or more translations. Each translation can have translations. We create this data structure as a relational database table, where each record has a "foreign key" (pointer) to its parent. None of these records need ever change its pointer to its parent. When a new article or translation is added, the record is created with a pointer to the appropriate parent.

A common use case is to query the table of translations, looking for translations for a particular article, or translations in a particular language. Ah, you say, the table of translations is a mutable data structure.

Sure it is. But it's separate from the tree. We use the (immutable) tree to record the hierarchical relationships, and the mutable table for iteration over the items. In a non-database situation, you could have a hash table pointing to the nodes of the tree. Either way, the tree itself (i.e. the nodes) never get modified.

Here's another example of this data structure, including how to usefully access the nodes.

My point is that the answer to the OP's question is "yes", I agree with the rest of you, that the prevention of cycles does come from immutability alone. While you can build a tree in the other direction (top-down), if you do, and it's immutable, it still cannot have cycles.

When you're talking about powerful theoretical guarantees like

another nice feature of immutable data structures is that it is impossible to accidentally (or deliberately!) create a tree which contains a cycle [emphasis in original]

"such a tree wouldn't be very useful" pales in comparison -- even if it were true. People create un-useful data structures by accident all the time, let alone creating supposedly-useless ones on purpose. The putative uselessness doesn't protect the program from the pitfalls of cycles in your data structures. A theoretical guarantee does (assuming you really meet the criteria it states).

P.S. one nice feature of upward-pointing trees is that you can guarantee one aspect of the definition of trees that downward-pointing tree data structures (like Eric Lippert's) don't: that every node has at most one parent. (See David's comment and my response.)

LarsH
I see your point, but this doesn't seem like a very useful form for a binary tree.
Odrade
@Brian, http://en.wikipedia.org/wiki/Tree_%28data_structure%29#Tree_representations says "There are many different ways to represent trees; common representations represent the nodes as records allocated on the heap (not to be confused with the heap data structure) with pointers to their children, *their parents*, or both, or as items in an array, with relationships between them determined by their positions in the array (e.g., binary heap)."
LarsH
+2  A: 

If you really want to try hard at it you could create a tree with cycles in it that is immutable. For example, you could define an immutable graph class and then say:

Graph g = Graph.Empty
  .AddNode("A")
  .AddNode("B")
  .AddNode("C")
  .AddEdge("A", "B")
  .AddEdge("B", "C")
  .AddEdge("C", "A");

And hey, you've got a "tree" with "cycles" in it - because of course you haven't got a tree in the first place, you've got a directed graph.

But with a data type that actually uses a traditional "left and right sub trees" implementation of a binary tree then there is no way to make a cyclic tree (modulo of course sneaky tricks like using reflection or unsafe code.)

Eric Lippert