views:

234

answers:

4

Edited for a larger tree, for more examples.

I have a tree structure that I need to generate all of the possible permutations of, with some restrictions. Given a tree like this:

    A1----(B1, B2)
     \    
      \___(C1)
             \__(E1, E2)
/       
-  A2----(B3, B4)
\     \     \
       \     \__(D1)
        \
         \_(F1, F2)
                |
                (D4)   

    A3----(B5, B6)
                \__(D2, D3)

Or if this is a little vague here's the same structure done with a Perl notation:

my $nodes = [
{
    name => 'A1',
    children => [
        [ { name => 'B1', children => []  }, 
          { name => 'B2', children => []  }
        ],
        [
          { name => 'C1', 
                    children => [
                        [
                            { name => 'E1', children => [] },
                            { name => 'E2', children => [] }
                        ]
                    ]
            }
        ]
    ]
},
{
    name => 'A2',
    children => [
        [ { name => 'B3', 
                children => [
                    [
                        { name => 'D1', children => [] }
                    ]
                ] 
          },
          { name => 'B4', children => [] }
        ],
        [
          { name => 'F1', children => [] },
          { name => 'F2', children => [
                        [ { name => 'D4', children => [] } ]
                    ]
            }
        ]
    ]
},
{
    name => 'A3',
    children => [
        [ { name => 'B5', children => [] },
          { name => 'B6', children => [
                    [ { name => 'D2', children => [] },
                      { name => 'D3', children => [] }
                    ] 
                ]                 
            }
        ]
    ]
}

];

(Frankly, if you can figure this out in readable Perl I'll take that too.)

I'm looking to traverse the tree and retrieve all of the possible "paths" from the top level downward. All of the descendant groups of a node have to be represented by exactly 1 member in the "path". For example, in A1 one of (B1, B2) and one of (C1) needs to be represented. So each path descending from A1 will begin with either:

A1 B1 C1

or

A1 B2 C1

If B1, B2, or C1 have children, those will need to be represented as well.

Working this out by hand for the above tree I get these possibilities:

A1 B1 C1 E1
A1 B1 C1 E2
A1 B2 C1 E1
A1 B2 C1 E2

A2 B3 D1 F1
A2 B3 D1 F2 D4
A2 B4 F1
A2 B4 F2 D4

A3 B5
A3 B6 D2
A3 B6 D3

Each node here is a DataRow object:

internal class DataRow
{
    internal string tag = "";
    internal int id = 0;
    internal Dictionary<string, List<DataRow>> children = null;

    internal DataRow(int id, string tagname)
    {
        this.tag = tagname;
        this.id = id;
    }        internal void AddChildren(string type, List<DataRow> drList)
    {
        if (children == null)
            children = new Dictionary<string, List<DataRow>>();
        if (!children.ContainsKey(type))
            children[type] = new List<DataRow>();
        children[type].AddRange(drList);
    }
    internal void AddChild(string type, DataRow dr)
    {
        List<DataRow> drList = new List<DataRow>();
        drList.Add(dr);
        AddChildren(type, drList);
    }
    public override string ToString()
    {
        return this.tag + " " + this.id;
    }
}

To build the sample tree above (except for the E and F levels, added later):

        DataRow fullList = new DataRow(null, "");
        DataRow dr, dr2, dr3;

        // First node above
        dr = new DataRow(1, "A");
        List<DataRow> drList = new List<DataRow>();
        drList.Add(new DataRow(1, "B"));
        drList.Add(new DataRow(2, "B"));
        dr.AddChildren("B", drList);
        drList.Clear();
        dr2 = new DataRow(1, "C");
        dr2.AddChild("C", new DataRow(1, "E"));
        dr2.AddChild("C", new DataRow(2, "E"));
        drList.Add(dr2);
        dr.AddChildren("C", drList);
        fullList.AddChild("A", dr);


        // Second Node above (without the "F" stuff)
        drList.Clear();
        dr = new DataRow(3, "B");
        dr.AddChild("D", new DataRow(1, "D"));
        drList.Add(dr);
        drList.Add(new DataRow(4, "B"));
        dr = new DataRow(2, "A");
        dr.AddChildren("B", drList);
        fullList.AddChild("A", dr);

        // Third node above
        drList.Clear();
        dr3 = new DataRow(6, "B");
        dr3.AddChild("B", new DataRow(2, "D"));
        dr3.AddChild("B", new DataRow(3, "D"));
        dr2 = new DataRow(3, "A");
        dr2.AddChild("B", new DataRow(5, "B"));
        dr2.AddChild("B", dr3);
        fullList.AddChild("A", dr2);

Walking the entire tree is trivial:

    internal void PermutedList()
    {
        if ( children == null ) return;
        foreach (string kidType in children.Keys)
        {
            foreach (DataRow dr in children[kidType])
            {
                dr.PermutedList();
            }
        }
    }

But that's not what I need. This problem is a full tree walk, but in a specific order. What am I not getting? What kind of walk is this?

I've got a messy & slow implementation of this I wrote in Perl 10 years ago, but I can't decipher my own code anymore (shame on me!).

Edit: The graph and the lists below have been expanded, the code has not.

If I could describe the graph, I could program it. If I knew what it was called, I could look it up. But I can't. So let me explain a little further.

The bucket names aren't significant!

Each node has "buckets of children". A1 has two buckets one containing "B" and another containing "C". If that's all it had (and C didn't have buckets under it) I would have "A1 B1 C1" and "A1 B2 C1" -- at least one representative from all of the child buckets present.

So I think each bucket needs the cross-product of its children (all the way down).

+1  A: 

Each node should know its parent (GetParentRow) so you could pass the parent as an argument to the recursive method. This way, when you reach a 'leaf' you can recursively track back to the root.

I'm not sure if its the most efficient way, but I think it should give you the results you want.

MattK
A: 

First I thought you wanted all spanning trees. http://en.wikipedia.org/wiki/Spanning_tree

But then I realized what you wanted was sort of a 'spanning walk' from the root of a tree.

Then I realized it was(relatively) simple.

Foreach leaf

Walk from the leaf to the root

end for

You will need a true datastructure of course, I don't think Perl's hash of hashes will work; you need a 'parent' pointer in each node.

Paul Nathan
The problem is that this will only get you back to the root up through a single leaf and it's branches back. In the example, if you start with node "E1" and walk back to the root you get E1, C1, A1 -- this ignores the other top-level leaf (B1, B2) completely. You may be on to something though.If I take the cross-product of each "leaf-cluster" under the root node and walk backwards to the root I get that path. However, that means "walking" the tree twice. Once to find the leaves, once to track back. Sounds like it would work, but clunky.
clintp
On second thought, strike that. I don't know what I was thinking. Some "leaves" are only partially leaves (B3, B4).
clintp
+1  A: 

A tree walk can be performed in any order desired as follows. For the root node, place all the children into a DATA_STRUCTURE (described below). Then take a node out of the DATA_STRUCTURE and put all of its children into the DATA_STRUCTURE. Continue until DATA_STRUCTURE is empty.

The trick is to pick the right DATA_STRUCTURE. For an in-order (depth-first) traversal, a Stack may be used. (The children must be pushed onto the stack in reverse order.) To do a depth-first traversal, a Queue may be used.

For more complicated orderings, a Priority Queue is the ticket. Just set the priorities according to the order in which you wish to traverse the tree based on whatever criteria you are using. In fact, setting the priorities correctly will also behave as a stack or a queue as well, causing the afore-mentioned depth-first and breadth-first sequences, respectively.

Edited to Add:

The tree walking algorithm will work for a data structure of this type just fine, since it has no cycles. Just put a new node in the data structure for each item in the set. I guess the only thing extra is a way to represent the path.

The path is pretty easy. You just need something like this:

class Path<T>
{
    T lastNode;
    Path<T> previousNodes;
    public Path(T lastNode, Path<T> previousNodes)
    {
        this.lastNode = lastNode; this.previousNodes = previousNodes;
    }
    public IEnumerable<T> AllNodes()
    {
        return AllNodesBackwards().Reverse();
    }
    private IEnumerable<T> AllNodesBackwards()
    {
        Path<T> currentPath = this;
        while (currentPath != null)
        {
            yield return currentPath.lastNode;
            currentPath = currentPath.previousNodes;
        }
    }
    public T Node { get { return lastNode; } }
}

So then all we have to do is walk the path. Something similar to this ought to do the trick:

[ Incorrect solution deleted ]

Edited again:

OK, I finally figured out what you want. You want to travel horizontally across children of different "kidTypes" in each path before traveling down the tree.

It's a big mess, but I solved it:

public void WalkPath2()
{
    Queue<Path<DataRow>> queue = new Queue<Path<DataRow>>();
    queue.Enqueue(new Path<DataRow>(this, null));

    while (queue.Count > 0)
    {
        Path<DataRow> currentPath = queue.Dequeue();
        DataRow currentNode = currentPath.Node;

        if (currentNode.children != null)
        {
            foreach (Path<DataRow> nextPath in currentNode.GetChildPermutations(currentPath))
                queue.Enqueue(nextPath);
        }
        else
        {
            foreach (DataRow node in currentPath.AllNodes())
            {
                Console.Write(node.ToString());
                Console.Write("      ");
            }
            Console.WriteLine();
        }

    }

}

private IEnumerable<Path<DataRow>> GetChildPermutations(Path<DataRow> currentPath)
{
    string firstLevelKidType = null;
    foreach (string kidType in children.Keys)
    {
        firstLevelKidType = kidType;
        break;
    }
    foreach (Path<DataRow> pathPermutation in GetNextLevelPermutations(currentPath, firstLevelKidType))
        yield return pathPermutation;
}

private IEnumerable<Path<DataRow>> GetNextLevelPermutations(Path<DataRow> currentPath, string thisLevelKidType)
{
    string nextLevelKidType = null;
    bool nextKidTypeIsTheOne = false;
    foreach (string kidType in children.Keys)
    {
        if (kidType == thisLevelKidType)
            nextKidTypeIsTheOne = true;
        else
        {
            if (nextKidTypeIsTheOne)
            {
                nextLevelKidType = kidType;
                break;
            }
        }
    }

    foreach (DataRow node in children[thisLevelKidType])
    {
        Path<DataRow> nextLevelPath = new Path<DataRow>(node, currentPath);
        if (nextLevelKidType != null)
        {
            foreach (Path<DataRow> pathPermutation in GetNextLevelPermutations(nextLevelPath, nextLevelKidType))
                yield return pathPermutation;
        }
        else
        {
            yield return new Path<DataRow>(node, currentPath);
        }
    }


}
Jeffrey L Whitledge
A little simplistic. The problem isn't just walking the tree -- tree walking algorithms are a dime a dozen. The problem is that the branches are sets themselves (and not nodes) and the cross-product of them needs to be traversed.
clintp
OK. I did some of the work for you. Yeah, I didn't necessarily need to code up the linked-list Path, but I didn't think copying the whole path for each node would be very effecient.
Jeffrey L Whitledge
currentNode.Node? What is this intended to be?
clintp
I forgot a line of code in the Path class. It's fixed now.
Jeffrey L Whitledge
This code is having the same problem as mine. It's traversals go: A1 B1, A1 B2, A1 C1 E1, A1 C1 E2 which is still a straightforward tree walk. (To see this, change the "// Or whatever" line to a Console.Write and then add a Console.WriteLine after the foreach loop). I will keep looking at it though.
clintp
I think I see the problem. The DataRow object associates a list of children with each string, but it sounds like what you really need is a list of strings and a list of children each stored separately on the same DataRow. It could probably be made to work as-is, though.
Jeffrey L Whitledge
OK, I fixed it in the answer.
Jeffrey L Whitledge
Still not working, and I'm not sure how it can. Also a `if currentPath.Node.Value.children != null` is needed before the foreach in the middle of walkpath. In a couple of hours I should be able to offer a bounty on this question too.
clintp
OK. It's fixed again. I think it's what you want this time.
Jeffrey L Whitledge
I think this is close enough. In reality the "kid types" aren't there, it's just a parent-child relationship and nothing is really "keyed" or "typed" in any way, but it's a fair description of what I need to do.
clintp
+1  A: 

Use the following sub:

use 5.10.0;  # for // (aka defined-or)

use subs 'enumerate';
sub enumerate {
  my($root) = @_;

  my $kids = $root->{children};
  return [ $root->{name} // () ]
    unless @$kids;

  my @results;
  foreach my $node (@{ $kids->[0] }) {
    my @fronts = map [ $root->{name} // (), @$_ ],
                     enumerate $node;

    my @below = enumerate {
      name => undef,
      children => [ @{$kids}[1 .. $#$kids ] ],
    };

    foreach my $a (@fronts) {
      foreach my $b (@below) {
        push @results => [ @$a, @$b ];
      }
    }
  }

  @results;
}

You might print it with

foreach my $tree (@$nodes) {
  foreach my $path (enumerate $tree) {
    print "@$path\n";
  }

  print "\n";
}

to get the following output:

A1 B1 C1 E1
A1 B1 C1 E2
A1 B2 C1 E1
A1 B2 C1 E2

A2 B3 D1 F1
A2 B3 D1 F2 D4
A2 B4 F1
A2 B4 F2 D4

A3 B5
A3 B6 D2
A3 B6 D3

I used $path above, but it will seriously confuse anyone who maintains your code because path has a well-understood meaning. You could skirt the naming issue entirely with a bit of cleverness:

print join "\n" =>
      map { join "" => map "@$_\n", @$_ }
      map [ enumerate($_) ] => @$nodes;
Greg Bacon
Nicely done. I was going along these lines but couldn't get mine to work properly. I think the key difference is the way that you walk the tree: enumerate() over*both the children **and** over the node itself, but with "undef" as the name (so as not to repeat it). The nested foreach over both sets of results gives the "cross-product" of the children the way the OP was asking for.
Adam Bellaire
This is fairly awesome. I just need to grok it, port it back to C# and fudge about the "name" a bit (it's not in the original C# structures, in fact they don't have an identifier at all)... but, cool.
clintp
I'm glad it helps!
Greg Bacon
And, BTW, howdy Greg. Long time, no see. :)
clintp
Good to hear from you again! I wondered whether you were the same clintp. Get your autobiographer badge already! :-)
Greg Bacon