views:

398

answers:

6

This is not homework, and I don't need to answer it, but now I have become obsessed :)

The problem is:

  • Design an algorithm to destructively flatten a binary tree to a linked list, breadth-first. Okay, easy enough. Just build a queue, and do what you have to.
  • That was the warm-up. Now, implement it with constant storage (recursion, if you can figure out an answer using it, is logarithmic storage, not constant).

I found a solution to this problem on the Internet about a year back, but now I've forgotten it, and I want to know :)

The trick, as far as I remember, involved using the tree to implement the queue, taking advantage of the destructive nature of the algorithm. When you are linking the list, you are also pushing an item into the queue.

Each time I try to solve this, I lose nodes (such as each time I link the next node/add to the queue), I require extra storage, or I can't figure out the convoluted method I need to get back to a node that has the pointer I need.

Even the link to that original article/post would be useful to me :) Google is giving me no joy.

Edit:

Jérémie pointed out that there is a fairly simple (and well known answer) if you have a parent pointer. While I now think he is correct about the original solution containing a parent pointer, I really wanted to solve the problem without it :)

The refined requirements use this definition for the node:

struct tree_node
{
  int value;
  tree_node* left;
  tree_node* right;
};
+2  A: 

First of all, I'll assume your nodes have a "parent" field that points to their parent. It's either that, or you need a stack to be able to move back up in the tree (and thus can't achieve that O(1) auxiliary memory requirement).

There is a well-known in-order iteration which is both iterative and in O(1) space. Suppose for instance you'd want to print items in order. Basically, when you traverse a binary tree, you have to decide at any given moment, in any given node, whether you want to move UP (to the parent), LEFT (to the left child) or RIGHT. The idea for this algorithm is to base this decision on where you come from:

  1. if you come down from the parent, then clearly you are visiting the node for the first time, so you go LEFT;
  2. if you come up from the left child, then you have visited all nodes that are smaller than the current node, so you can print (remember we want to print nodes in order here) the current node, and then go RIGHT;
  3. finally, if you come up from the right child, that means you have visited the entire subtree rooted at this particular node, and so you can back UP to the parent.

OK: so this is the algorithm that you take for a base. Now, clearly, if you are destructively modifying this into a linked list, you should only modify a node once you aren't going to visit it anymore. That is when you are coming up from the right. At that point, you have to decide what the successor of that node is going to be... ?

Well, you need to keep track, at all times, of two pointers: one to the smallest node you have visited, and one to the largest node you have visited in the current rooted subtree. You use the smallest as the successor for nodes you last visit when you come up from the right child, and use the largest as the successor for nodes you last visit coming up from somewhere else, because they happen to not have a right child!

EDIT 1: I forgot to say that I implicitly consider that the "parent" field in the binary tree becomes the "next" field in the linked list---that is what I modify in the last step.

EDIT 3: The following part of my answer has turned out to not quite answer the original question (but what preceded is still pertinent).


EDIT 2: Following Svante's understandable wish that I explicit my suggestion of using left rotations into some code:

#include <stdlib.h>
#include <stdio.h>

typedef struct node *node;

struct node
{
  int value;
  node left;
  node right;
};

node new_node(int value, node left, node right)
{
    node n = (node) malloc(sizeof(struct node));
    n->value = value; n->left = left; n->right = right;
    return n;
}

int rotate_right(node tree)
{
    if(tree != NULL && tree->left != NULL)
    {
        node
            a = tree->left->left,
            b = tree->left->right,
            c = tree->right;
        int tmp = tree->value;
        tree->value = tree->left->value;
        tree->left->value = tmp;

        tree->left->left = b;
        tree->left->right = c;
        tree->right = tree->left;

        tree->left = a;
        return 1;
    }
    return 0;
}

The rotation function is not elegant, but as it is easy to get mixed up, I tried to follow the exact same naming as used in Wikipedia's article on rotations. The nodes A, B, C are named as such in my code; the nodes P and Q are not, and since I chose not to use pointers of pointers, I instead resorted to the trick of switching values of P and Q---in lieu of switching places. With "rotation_right" at our disposal, the conversion algorithm is quite simple:

void convert_to_list(node tree)
{
    if(tree != NULL) {
        while(rotate_right(tree) != 0);
        convert_to_list(tree->right);
    }
}

The resulting tree is a sorted linked-list, where the next pointer of the list is what used to be the right pointer in the tree. Finally, here is a test program:

int main()
{
    node t =
        new_node(4,
              new_node(2, NULL, new_node(3, NULL, NULL)),
              new_node(8, new_node(5, NULL, NULL), NULL));
    convert_to_list(t);
    for(; t != NULL; t = t->right)
        printf("%d, ", t->value);
    return 0;
}
Jérémie
@Jérémie: I was attempting to solve this problem without the parent pointer. I swear there was a way to do it! :) If I can't get an answer that satisfies this extra requirement, I'll award you the answer.
Merlyn Morgan-Graham
@Jérémie: You get an upvote either way. Thanks for the help so far!
Merlyn Morgan-Graham
@Merlyn: Thanks! It didn't even occur to me that this could be done without the parent pointer, because it's not possible to do in-order traversal without a parent pointer or a stack. But that's because you can't destroy the tree. An idea could be to do a right rotation at every node from the root down until there is no left child to the current node you are processing. In the end you'd get a tree that's a list---something you'd normally try to use rotations to avoid. Haven't tested this or looked at it any closer, so it might not work.
Jérémie
You seem to describe things that look like depth-first search, and I do not see your algorithm. Finally, the parent pointer means _O(n)_ additional storage, compared with the specification of the problem.
Svante
When I try this with the complete tree of 15 nodes, the result is `8, 4, 9, 2, 10, 5, 11, 1, 12, 6, 13, 3, 14, 7, 15, `.
Svante
So it is in-order (depth-first), not level-order.
Svante
Svante, I'm not sure how you created that complete tree of 15 nodes, but I get 1, 2, 3, ..., 15, and not the result that you gave. But you are absolutely right, I somehow skipped that Merlyn was looking for level-order!!!!!! This is embarrassing, and proves you were right to ask for a specific algorithm :), though I would like to maintain that my first solution was valid, although it required the parent pointer, and an overall additional cost of O(n), as you state.
Jérémie
I numbered the nodes in level-order, of course.
Svante
In your previous comment? 8, 4, 9, ... isn't level order, and you say "when I try this with the complete tree", though it is not what you'd get with my algorithm either.
Jérémie
@Jérémie: After some self-deliberation, I accepted Svante's answer. It solves the problem with constant storage, with a representation that doesn't include parent pointers, which was my biggest requirement. The deliberation between your answers was because of what I remember about the simplicity of the original algorithm. The fact that it mutated and re-threaded the tree makes me believe that it must have had parent pointers.
Merlyn Morgan-Graham
+1  A: 

Well, I can't quite figure right now how this helps in this situation but it might give you a lead. There's a technique called "pointer reversal" used for iteratively traversing a tree without the need to use a stack/queue for storing pointers - it was mainly used for low memory overhead garbage collectors. The idea behind this is that when you traverse to a node's child you link that pointer to the child back to the parent so you know where to go back when you finish with that node. That way the traceback information you would generally keep in a stack/queue is now embedded in the tree itself.

I have found the following slideshow with an example (unfortunately there isn't something better on google). The example here shows how to traverse a binary tree without extra storage.

smichak
@smichak: Though this is not a complete answer, it still is along the lines of what I was looking for. Essentially you re-write the tree's child pointers to suit whatever traversal needs you have, and avoid extra storage overhead.
Merlyn Morgan-Graham
A: 

I don't think we need parent pointers. Suppose inductively that levels 0 through k-1 plus a sentinel node have been converted to a singly-linked list on the left child pointers whose right children point to the nodes of level k. Iterate through the list, grabbing each "right child" (level k node) in turn and inserting it at the end of the list, overwriting the right pointer from which it came with its soon-to-be-overwritten left child. When we reach the initial end of the list, we have extended the inductive hypothesis to k+1.

EDIT: code

#include <cstdio>

struct TreeNode {
  int value;
  TreeNode *left;
  TreeNode *right;
};

// for simplicity, complete binary trees with 2^k - 1 nodes only
void Flatten(TreeNode *root) {
  TreeNode sentinel;
  sentinel.right = root;
  TreeNode *tail = &sentinel;
  while (true) {
    TreeNode *p = &sentinel;
    TreeNode *old_tail = tail;
    while (true) {
      if ((tail->left = p->right) == NULL) {
        return;
      }
      tail = p->right;
      p->right = p->right->left;
      if (p == old_tail) {
        break;
      }
      p = p->left;
    }
  }
}

int main() {
  const int n = 31;
  static TreeNode node[1 + n];
  for (int i = 1; i <= n; ++i) {
    node[i].value = i;
    if (i % 2 == 0) {
      node[i / 2].left = &node[i];
    } else {
      node[i / 2].right = &node[i];
    }
  }
  Flatten(&node[1]);
  for (TreeNode *p = &node[1]; p != NULL; p = p->left) {
    printf("%3d", p->value);
  }
  printf("\n");
}
Note that this does just BFS, not level order like the naive queue solution. My guess is we could get level order by doing an in-place transpose, but I don't care to work out the details.
To clarify what he is saying, the output of this is `1 2 3 4 6 5 7 8 12 10 14 9 13 11 15 16 24 20 28 18 26 22 30 17 25 21 29 19 27 23 31`.
Svante
+5  A: 

The Idea:

You can build the linked list along the left children of the tree. At the same time, the right children of that list are used to hold a queue of the next subtrees to insert at the tail.


The algorithm in pseudo code:

(edit: rewritten for clarity)

  • A node has three components: a value, a reference to its left child, and a reference to its right child. The references may be null.
  • The function to transform a binary tree of such nodes into a single linked list
    • takes a reference to the root node as parameter root,
    • loops, with tail starting from the left child of root:
      • swap the left child of tail with the right child of root,
      • loop (O(n) queue insertion), with bubble-to starting from root and bubble-from always the left child of bubble-to,
        • swap the right child of bubble-to with the right child of ´bubble-from`,
        • advance bubble-to and bubble-from to their left children,
        • until bubble-from reaches tail,
      • advance tail to its left child,
      • while tail is not null.
    • Finally, return head. The single linked list now runs along the left children.

Illustration

The starting tree (I believe that it should be a complete tree; if nodes are missing, they should do so from the end. You could try to give meaning to other cases, but there are several different possibilities for that, and it involves a lot of fiddling):

              1
          /       \
      2               3
    /   \           /   \
  4       5       6       7
 / \     / \     / \     / \
8   9   A   B   C   D   E   F

Note that these are not necessarily the values of the nodes, they are just numbered for display purposes.

The tree after 1 iteration (note the list forming from 1 to 3 and the queue of the subtrees rooted in 4 and 5):

                head
                  v
                  1
               /    \
    tail    2         4
      v  /    \      / \
      3         5   8   9
    /   \      / \
  6       7   A   B
 / \     / \
C   D   E   F

after 3 iterations (the list is now 1 to 5, and the queue holds the subtrees rooted in 6 to 9):

                   head
                     v
                     1
                  /     \
               2          6
             /   \       / \
           3       7    C   D
          / \     / \
         4   8   E   F
        / \
tail > 5   9
      / \
     A   B

and after 8 iterations (almost done):

                       head
                         v
                         1
                        / \
                       2   B
                      / \
                     3   C
                    / \
                   4   D
                  / \
                 5   E
                / \
               6   F
              /
             7
            /
           8
          /
         9
        /
tail > A

Real Code (Lisp)

This is the node class:

(defclass node ()
  ((left :accessor left)
   (right :accessor right)
   (value :accessor value)))

A useful helper:

(defmacro swap (a b)
  `(psetf ,a ,b
          ,b ,a))

The conversion function (edit: rewritten for clarity):

(defun flatten-complete-tree (root)
  (loop for tail = (and root (left root)) then (left tail)
        while tail
        do (swap (left tail) (right root))
           (loop for bubble-to = root then (left bubble-to)
                 for bubble-from = (left bubble-to)
                 until (eq bubble-from tail)
                 do (swap (right bubble-to) (right bubble-from))))
  root)

Irreversible Operation for Jagged Trees:

I have rewritten the above to allow for arbitrary jagged trees. You cannot reconstruct the original tree from this anymore, though.

(defun flatten-tree (root)

;; The inner loop here keeps head at the root of the as-yet unflattened sub-tree,
;; i.e. the node of the first branch,
;; while at the same time ironing all from the root unbranched levels to the left.
;; It ends up nil when the tree is completely flattened.

  (loop for head = (loop for x = (or head root) then (left x)
                         do (when (and x (null (left x)))
                              (swap (left x) (right x)))
                         until (or (null x) (right x))
                         finally (return x))
        for tail = (and head (left head)) then (left tail)
        while head
        do (swap (left tail) (right head))

;; This inner loop is the O(n) queue insertion

           (loop for bubble-to = head then (left bubble-to)
                 for bubble-from = (left bubble-to)
                 until (eq bubble-from tail)
                 do (swap (right bubble-to) (right bubble-from))))

;; Finally, return the original root.

  root)
Svante
@Stante: I love LISP, conceptually. Simple compiler, and fully integrated macro support, etc. But I still can't read it :( I will try to wade through this later today. FYI, the algorithm I had before made no assumptions about input or balancing, and was still very simple. Efficiency isn't necessarily important.
Merlyn Morgan-Graham
@Svante: BFS may make a little less sense for a very jagged tree, but it still has the definition that each depth should come before the next depth. If you feed the list back into the algorithm, it should have the same result.
Merlyn Morgan-Graham
@Merlyn Morgan-Graham: I added pseudo code that corresponds almost line for line to the Lisp code.
Svante
@Svante: Awesome! :) Did a technical interview today, so the LISP would have probably finished off my mind for tonight, anyhow.
Merlyn Morgan-Graham
By keeping a pointer to the root at all times, you are essentially substituting the need for parent pointers by going through the tree from the top down at every iteration. Still, this algorithm is very nice, though it's LISP implementation is not at all flattering (this coming from an SICP fan). Kudos and up-vote :)
Jérémie
I do not need to find the parent of anything, the tree edge traversal is simply necessary for the queue insertion.
Svante
@Svante: Not abandoning the question. Will check out your answer soon
Merlyn Morgan-Graham
@Svante: Does this algorithm require the tree to be balanced? I generated a random unbalanced tree as input (see the values in my response to Rafe), and I am getting invalid results out of this solution.
Merlyn Morgan-Graham
As I have written, I think the tree should be complete (if nodes are missing, they should miss starting at F, E, D ...). If you want incomplete trees, please specify how you want null nodes to be treated. Are they to be skipped? Are they to be inserted, but their (nonexistent) children be skipped? Are they to be inserted at any level where a non-null node exists?
Svante
@Svante: Ah, good point. I've been operating under the assumption that a null node would be skipped, and that we can have very ragged/incomplete trees. The binary tree algorithms I've been working with are very simple: in-order recursive insertion, with duplicates to the right. BFS (marking/printing) done with a queue: `queue root` `while queue is not empty` { `dequeue as current` `queue current's children` `visit current` }
Merlyn Morgan-Graham
OK, but that means that the operation is not inversible. This requires a change in the queue insertion.
Svante
@Svante: My "requirements" are basically just along the lines of my current understanding. I bow to your data structure knowledge, since mine is very basic :) For example, I have little to no experience with tree balancing, node rotation, red/black and AVL. Out of curiosity, is there a way to handle missing nodes and make the operation reversible? How would you handle "[how null nodes should be treated]. Are they to be inserted ..."? What would you use to represent a null node?
Merlyn Morgan-Graham
Haha. You surely overestimate me. If you want to keep track of missing nodes, you need a secondary value to indicate "missing". You could use the right child in the flattened tree for that; however, you would create the missing nodes for the flattened tree, so it would need additional storage.
Svante
I have rewritten the code. It is funny how you seem to be able to code much clearer when entering it into a real editor.
Svante
@Svante: What is this "community wiki" business? I understand it for a question, but not for an answer. Since I didn't see it until just now, I am guessing it wasn't set manually. Does that somehow limit the number of points you can receive?
Merlyn Morgan-Graham
@Merlyn Morgan-Graham: I don't understand it either, I guess that it is an automatic thing because I edited so often.
Svante
Yes, I looked it up. It is a misinformed and ill-directed attempt to prevent bumping questions by editing one's own answer (you are supposed to bump them by adding comments or further answers, apparently). Another reason not to like Stackoverflow.
Svante
+2  A: 

Just for the record, the recursive version is beautiful (this is in C#):

[Edited: as st0le points out, my first version contains 'new's! Forgive me, I've spent the last twenty years coding in declarative languages. Here is the corrected version.]

[Edit: double rats. This isn't breadth first.]

public static Tree<T> Flatten(Tree<T> t, Tree<T> u = null)
{
    if (t == null) return u;
    t.R = Flatten(t.L, Flatten(t.R, u));
    t.L = null;
    return t;
}
Rafe
beautiful indeed! :) although i don't like the fact that the snippet creates `new` nodes... :(
st0le
Could you break that line up a little?
Svante
@Rafe: I think this is the most basic algorithm that provides a strong use case for recursion. I find most other simple recursion examples to be a cute way for professors and nerds to flex their math muscles at the reader.
Merlyn Morgan-Graham
Well, as a (very recently) ex-academic computer scientist, I'm going to go out on a limb and say that *recursion* is fundamental, it's *iteration* that's harder to think about :-)That said, I think you're probably right that many teachers don't teach functional programming well, but I think students have to shoulder some of the blame, too. I would put money on you not wanting to go back to the mainstream once you spend a month seriously coming to grips with functional programming (F# would be a good place to start).
Rafe
A: 

This is my answer which works. I now realise this is the same approach as Svante's solution(!), although I build the tree to the right.

For the record, here is the C# code:

// Flatten a tree into place in BFS order using O(1) space and O(n) time.
// Here is an example of the transformation (the top-row indicates the
// flattened parts of the tree.
//  
//  a
//  |---.
//  b   c
//  |-. |-.
//  d e f g
//  
//  becomes
//  
//  a-b-c
//  | | |-.
//  d e f g
//  
//  becomes
//  
//  a-b-c-d-e-f-g
//  
public static void FlattenBFS(Tree<T> t)
{
    var a = t; // We "append" newly flattened vertices after 'a'.
    var done = (t == null);
    while (!done)
    {
        done = true;
        var z = a; // This is the last vertex in the flattened part of the tree.
        var i = t;
        while (true)
        {
            if (i.L != null)
            {
                var iL = i.L;
                var iLL = iL.L;
                var iLR = iL.R;
                var aR = a.R;
                i.L = iLL;
                a.R = iL;
                iL.L = iLR;
                iL.R = aR;
                a = iL;
                done &= (iLL == null) & (iLR == null);
            }
            if (i == z)
            {
                break; // We've flattened this level of the tree.
            }
            i = i.R;
        }
        a = (a.R ?? a); // The a.R item should also be considered flattened.
    }
}
Rafe
if you insert these values into the tree (in order) - `11928` `9507` `12589` `8202` `11597` `24561` `5453` `8754` `13838` `32508` `1332` `6367` `8851` `12720` `18234` `26462` `5311` `7956` `12607` `13721` `14799` `20420` `25563` `26884` `1769` `7065` `8127` `12857` `13725` `14773` `16861` `19935` `21343` `25145` `27775` `7060` `7502` `18366` `21101` `23423` `27669` `31597` `18720` `21687` `30049` `31971` `18668` `19273` `32229` 32197`
Merlyn Morgan-Graham
index them breadth first, flatten with your method, and print the indicies, you will get this output - `0` `1` `2` `3` `4` `5` `6` `7` `8` `9` `10` `13` `11` `12` `14` `15` `18` `20` `22` `16` `19` `17` `21` `23` `29` `33` `24` `27` `25` `31` `30` `28` `26` `32` `34` `35` `37` `38` `40` `36` `39` `41` `43` `44` `42` `45` `46` `47` `48` `49`
Merlyn Morgan-Graham
I'm don't understand why one would expect the flattened indices to correspond to the order of insertion. If I insert {2, 3, 1} into an ordered binary tree, then flatten it, I expect to see {2, 1, 3} as the result.Have I misunderstood you?
Rafe