I want to have a tree in memory, where each node can have multiple children. I would also need to reference this tree as a flat structure by index. For example:

    a1
       b1
       b2
       b3
    c1
       d1
          e1
          e2
       d2
          f1

This would be represented as a flat structure in the order I laid it out (i.e., a1=0, b1=1, d1=5, etc.).

Ideally I would want lookup by index to be O(1), with support for insert, add, remove, etc., and with the bonus of it being thread-safe. If that is not possible, let me know.

A: 

If you have a defined number of children for each node (for instance, a binary tree), then it's not too difficult (although you potentially waste a lot of space).
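To make the fixed-fan-out case concrete, here is a minimal sketch of the usual index arithmetic, assuming every node reserves exactly `k` child slots and the array is laid out level by level (so the numbering is breadth-first, not the pre-order numbering in the question). The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Index of child number j (0-based) of the node stored at index n,
// in a tree where every node reserves exactly k child slots.
std::size_t childIndex(std::size_t n, std::size_t k, std::size_t j) {
    assert(k > 0 && j < k);
    return k * n + 1 + j;
}

// Index of the parent of the node stored at index n (n must not be the root).
std::size_t parentIndex(std::size_t n, std::size_t k) {
    assert(k > 0 && n > 0);
    return (n - 1) / k;
}
```

With `k = 2` this reduces to the familiar 2n+1 / 2n+2 layout. Unused child slots still take up space in the array, which is the waste mentioned above.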

If it has a variable number of children, you'd probably have to come up with some convoluted manner of storing the index of a node's first child.

I'm not seeing how it would be useful to do this, though. The point of trees is to be able to store and retrieve items below a particular node. If you want constant-time look-up by index, it doesn't sound like you want a tree at all. By storing it in an array, you have to consider the fact that if you add an element into the middle, all of the indexes you had originally stored will be invalid.

However, if you do want a tree and still want constant-time insertion/lookup, just keep a reference to the parent node in a variable and insert the child below it. That is constant time.

Smashery
No defined number of children :(
esac
+2  A: 

If you have a reasonably balanced tree, you can get indexed references in O(log n) time - just store in each node a count of the number of nodes under it, and update the counts along the path to a modified leaf when you do inserts, deletions, etc. Then you can compute an indexed access by looking at the node counts on each child when you descend from the root. How important is it to you that indexed references be O(1) instead of O(log n)?
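A rough sketch of the counting idea, assuming pre-order numbering as in the question. `CountedNode`, `addChild` and `nodeAt` are hypothetical names, and a sentinel root is assumed so that a1 and c1 can share one tree (the question's index i is therefore at position i + 1 here). Note that this version scans the children linearly, so the cost is proportional to depth times fan-out; you only get O(log n) if the tree is balanced and the fan-out is bounded.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: each node caches the size of its own subtree.
struct CountedNode {
    std::string name;
    std::size_t count = 1;          // this node plus all of its descendants
    CountedNode* parent = nullptr;
    std::vector<std::unique_ptr<CountedNode>> children;

    explicit CountedNode(std::string n) : name(std::move(n)) {}
};

// Append a new leaf under `parent` and bump the count of every ancestor.
CountedNode* addChild(CountedNode& parent, const std::string& name) {
    parent.children.push_back(std::unique_ptr<CountedNode>(new CountedNode(name)));
    CountedNode* child = parent.children.back().get();
    child->parent = &parent;
    for (CountedNode* p = &parent; p != nullptr; p = p->parent) {
        ++p->count;
    }
    return child;
}

// Return the node at pre-order position `index` inside the subtree rooted at
// `root` (position 0 is `root` itself), or nullptr if the index is out of range.
CountedNode* nodeAt(CountedNode& root, std::size_t index) {
    if (index >= root.count) return nullptr;
    CountedNode* node = &root;
    while (index != 0) {
        --index;                    // step past `node` itself
        CountedNode* next = nullptr;
        for (const auto& child : node->children) {
            if (index < child->count) { next = child.get(); break; }
            index -= child->count;  // skip this child's whole subtree
        }
        assert(next != nullptr && "subtree counts are out of date");
        node = next;
    }
    return node;
}
```

Removal would work the same way in reverse: subtract the removed subtree's count from every ancestor.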

If modifications are infrequent with respect to accesses, you could compute a side vector of pointers to nodes when you are finished with a set of modifications, by doing a tree traversal. Then you could get O(1) access to individual nodes by referencing the side vector, until the next time you modify the tree. The cost is that you have to do an O(n) tree traversal after doing modifications before you can get back to O(1) node lookups. Is your access pattern such that this would be a good tradeoff for you?
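The side-vector idea could look roughly like this. It is only a sketch: `Node` and `IndexedForest` are invented names, only insertion is shown, and the index is rebuilt lazily on the first lookup after a change rather than explicitly "when you are finished with a set of modifications".

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(std::string n) : name(std::move(n)) {}
};

// Hypothetical wrapper: owns the top-level nodes plus a flat pre-order index.
class IndexedForest {
public:
    // Add a node under `parent`, or at the top level when parent is nullptr.
    Node* add(Node* parent, const std::string& name) {
        auto& siblings = parent ? parent->children : roots_;
        siblings.push_back(std::unique_ptr<Node>(new Node(name)));
        dirty_ = true;  // the cached index no longer matches the tree
        return siblings.back().get();
    }

    // O(1) while the index is current; one O(n) rebuild after a modification.
    Node* at(std::size_t index) {
        if (dirty_) rebuild();
        assert(index < flat_.size());
        return flat_[index];
    }

    std::size_t size() {
        if (dirty_) rebuild();
        return flat_.size();
    }

private:
    void rebuild() {
        flat_.clear();
        for (const auto& r : roots_) visit(r.get());
        dirty_ = false;
    }

    // Pre-order traversal: record the node, then each of its subtrees in order.
    void visit(Node* n) {
        flat_.push_back(n);
        for (const auto& c : n->children) visit(c.get());
    }

    std::vector<std::unique_ptr<Node>> roots_;
    std::vector<Node*> flat_;
    bool dirty_ = false;
};
```

Built from the tree in the question, `at(0)` is a1, `at(1)` is b1 and `at(5)` is d1. This class is not thread-safe as written; concurrent readers would need a lock around the rebuild at the very least.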

dewtell
My first suggestion is generalized in Mark Chu-Carroll's article on finger trees: http://scienceblogs.com/goodmath/2009/05/finally_finger_trees.php
dewtell
A: 

This is possible with a little work, but your insert and remove methods will become much more costly. To keep the array properly ordered, you will need to shift large chunks of data to create or fill space. The only apparent advantage is very fast traversal (minimal cache misses).

Anyhow, one solution is to store the number of children in each node, like so:

struct TreeNode
{
    int numChildren;
    /* whatever data you like */
};

Here's an example of how to traverse the tree...

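// Visits the subtree rooted at p (nodes are stored in pre-order) and
// returns a pointer to the first node after that subtree.
TreeNode* example(TreeNode* p)
{
    /* do something interesting with p */

    int numChildren = p->numChildren;
    ++p;  // the first child, if any, immediately follows its parent
    for (int child = 0; child < numChildren; ++child)
        p = example(p);  // each call returns the start of the next sibling
    return p;
}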

Hopefully you can derive insert, remove, etc... on your own.

:)
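For what it's worth, here is one way insertion might be derived for this layout. It is a sketch under assumptions: the nodes live in a `std::vector` in pre-order, `FlatNode` mirrors the `TreeNode` struct above with a name added for illustration, and `subtreeEnd` and `appendChild` are made-up helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative stand-in for TreeNode, stored in pre-order in one vector.
struct FlatNode {
    std::string name;
    int numChildren;
};

// Index one past the last node of the subtree rooted at index i.
std::size_t subtreeEnd(const std::vector<FlatNode>& t, std::size_t i) {
    std::size_t pending = 1;  // nodes known to belong to the subtree but not yet visited
    while (pending > 0) {
        assert(i < t.size());
        pending += static_cast<std::size_t>(t[i].numChildren);
        --pending;
        ++i;
    }
    return i;
}

// Insert a new leaf as the last child of the node at index p.
// Returns the index of the new node; every later node shifts up by one.
std::size_t appendChild(std::vector<FlatNode>& t, std::size_t p, const std::string& name) {
    const std::size_t pos = subtreeEnd(t, p);
    t.insert(t.begin() + static_cast<std::ptrdiff_t>(pos), FlatNode{name, 0});
    ++t[p].numChildren;
    return pos;
}
```

Lookup by index is a plain `t[i]`, but `appendChild` is O(n) because of the shift, and any index that was saved for a node after the insertion point is stale afterwards.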

Evan Rogers
+1  A: 

I use something similar to this in a Generic Red-Black tree I use. Essentially to start you need a wrapper class like Tree, which contains the actual nodes.

This is based on being able to reference the tree by index.

So you can do something like the following to set up a tree with a key and a value:

class Tree<K, V>
{
    // constructors and any methods you need

    // Access the Tree like an array
    public V this[K key]
    {
        get {
            // This works just like a property getter
            return SearchForValue(key);
        }
        set {
            // Like a property setter, `value` holds the value being assigned
            if (SearchForValue(key) == null)
            {
                // node for index doesn't exist, add it
                AddValue(key, value);
            } else { /* node at index already exists... do something */ }
        }
    }
}

This works on the assumption that you already know how to create a tree, but want to be able to do stuff like access the tree by index. Now you can do something like so:

Tree<string,string> t = new Tree<string,string>();
t["a"] = "Hello World";
t["b"] = "Something else";
Console.WriteLine("t at a is: {0}", t["a"]);

Finally, for thread safety, you can add an object to your Tree class and, in any method exposed to the outside world, simply call

lock (threadSafetyObject) { /* code you're protecting */ }

If you want something cooler for thread safety, I use an object in my tree called a ReaderWriterLockSlim that allows multiple reads but locks down when you want to do a write. That is especially important if you're changing the tree's structure, like doing a rotation, whilst another thread is trying to do a read.

One last thing: I rewrote the code to do this from memory, so it may not compile, but it should be close :)

Kevin Nisbet
A: 

You could always look at using Joe Celko's technique of using Nested Sets to represent trees. He's SQL-focused, but the parallels are there between nested sets and representing a tree as a flat array, and it may be useful for your ultimate reason for wanting to use an array in the first place.
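As a rough in-memory illustration of nested sets (not Celko's SQL, just the same numbering idea), each node gets a left and a right bound from one depth-first walk, and ancestry checks become two integer comparisons. `SetNode`, `assignBounds`, `isDescendant` and `descendantCount` are names invented for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct SetNode {
    std::string name;
    int left = 0;   // number handed out when the walk enters the node
    int right = 0;  // number handed out when the walk leaves the node
    std::vector<std::unique_ptr<SetNode>> children;
    explicit SetNode(std::string n) : name(std::move(n)) {}
};

SetNode* addChild(SetNode& parent, const std::string& name) {
    parent.children.push_back(std::unique_ptr<SetNode>(new SetNode(name)));
    return parent.children.back().get();
}

// Number the subtree rooted at `node`, starting from `next`.
// Returns the next unused number.
int assignBounds(SetNode& node, int next) {
    node.left = next++;
    for (auto& child : node.children) next = assignBounds(*child, next);
    node.right = next++;
    assert(node.left < node.right);
    return next;
}

// True if `a` lies strictly inside the subtree of `b`.
bool isDescendant(const SetNode& a, const SetNode& b) {
    return b.left < a.left && a.right < b.right;
}

// Every descendant consumes exactly two numbers between left and right.
int descendantCount(const SetNode& n) {
    return (n.right - n.left - 1) / 2;
}
```

Like the array layouts, the bounds have to be renumbered after an insert or delete, so this suits read-heavy trees.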

As others note though, most of the time it's easier just to traverse the tree directly as linked nodes. The oft-cited array implementation of a tree is for a binary tree (a binary heap is the classic case), because for a node at index n, the parent is at (n-1)/2, the left child is at 2n+1 and the right child is at 2n+2.

The downside of using arrays is that insertions, deletions, pruning and grafting all (usually) require the array to be modified when the tree changes.

You could also read up on B-trees.

Robert Paulson
A: 

Not sure if this is good, but a flat array can be addressed as a binary tree by calculating the start of each tree level as a power of two and then adding the offset within that level. This only works if the tree is a binary one.
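One reading of the level-plus-offset idea, assuming a complete binary tree stored level by level with the root at level 0 and index 0 (`indexAt` is a made-up name):

```cpp
#include <cassert>
#include <cstddef>

// Array index of the node at `offset` (0-based, left to right) within `level`
// of a complete binary tree. Level L starts at index 2^L - 1 and holds 2^L nodes.
std::size_t indexAt(unsigned level, std::size_t offset) {
    assert(level < sizeof(std::size_t) * 8);  // keep the shift well-defined
    const std::size_t width = static_cast<std::size_t>(1) << level;
    assert(offset < width);
    return (width - 1) + offset;
}
```

So level 2 occupies indices 3 through 6, and `indexAt(2, 0)` is the leftmost grandchild of the root.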

sybreon