views:

1442

answers:

3

I have a list of items in a hierarchy, and I'm attempting to parse this list out into an actual hierarchy of objects. I'm using modified pre-order tree traversal to store/iterate through this list, and so what I have is a subset of the tree, including all children, ordered by their "left" value.

For example, given the tree:

  • Item A
    • Item A.1
    • Item A.2
      • Item A.2.2
  • Item B
    • Item B.1
  • Item C

I get the list:

  • Item A, Item A.1, Item A.2, Item A.2.2, Item B, Item B.1, Item C

(This is in order of the "left" value from the modified pre-order tree setup).

What I want to do is parse this into objects that contain the actual structure of the tree, eg:

Class TreeObject {
    String Name;
    Guid ID; 
    Guid ParentID;
    List<TreeObject> Children;
}

The flat list is returned as a List of TreeObjects - and each TreeObject has properties for ID, ParentID, Left, and Right. What I'm looking for is a function:

List<TreeObject> FlatToHeirarchy(List<TreeObject> list);

which takes the flat list in, and returns a nested list.

In other words:

List<TreeObject> flatSet = LoadTreeObjectsFromDatabase(); 
// flatSet.count == 7; flatSet(0).Children == null
List<TreeObject> nestedSet = FlatToHeirarchy(flatSet);
// nestedSet.count == 3; nestedSet(0).Children.count == 2

I'm at a loss how to do this - keeping track of parents, and being able to deal with a bigger jump (eg, Item A.2.2 -> Item B).


Edit: I'm looking for a non-brute-force solution here (eg, not looping several times, moving items into child nodes, until there are only the top-level parents left). I'm guessing there is an elegant method that can loop once, and just place items as needed.

Remember, they are always in a hierarchal order (since I'm using MPTT), so a given item is going to always be a child or sibling of the previous item, or at least share a parent with the previous item. It is never going to come somewhere else in the tree.

+1  A: 

I assume you already know the parent of all items.

All you need to do is to iterate through all item of the list once and add the current item to its parent's children list. Only keep the items without parents in the target nested list.

Here is some pseudo code:

foreach Item item in flatlist
   if item.Parent != null
      Add item to item.Parent.ChildrenList
      Remove item from flatlist
   end if
end for

As for getting the parents, from what I can see in your example, you might need to parse the name and build a stack as you advance in the list.

This problems looks hard but it really isn't. A lot of people see this problem from the wrong angle; you must not try to populate every children list but rather get rid of the children items from the flat list, then it becomes easy.

Coincoin
That's the part I'm not getting. I'm guessing there is a nice way to have a stack that contains all the parents, and keep pushing into it as an item is deeper than its parent, or popping the last one out as you go up. What I want to avoid is looping over the list a bunch of times removing items.
gregmac
Once you know every node's parent, you can go through the list once. I updated my answer with some pseudo code.
Coincoin
I know the ID of the parent, but I don't have a reference to the parent object itself (without searching through the list, of course).
gregmac
A: 

Here is an example, hope this helps

class Program
{
    static void Main(string[] args)
    {
        TreeObject a  = new TreeObject() { Name = "Item A" };
        a.Children.Add( new TreeObject() { Name = "Item A.1" });
        a.Children.Add( new TreeObject() { Name = "Item A.2" });

        TreeObject b = new TreeObject() { Name = "Item B" };
        b.Children.Add(new TreeObject() { Name = "Item B.1" });
        b.Children.Add(new TreeObject() { Name = "Item B.2" });

        TreeObject c = new TreeObject() { Name = "Item C" };

        List<TreeObject> nodes = new List<TreeObject>(new[] { a, b, c });

        string list = BuildList(nodes);
        Console.WriteLine(list); // Item A,Item A.1,Item A.2,Item B,Item B.1,Item B.2,Item C

        List<TreeObject> newlist = new List<TreeObject>();
        TreeObject temp = null;

        foreach (string s in list.Split(','))
        {
            if (temp == null || !s.Contains(temp.Name) || temp.Name.Length != s.Length)
            {
                temp = new TreeObject() { Name = s };
                newlist.Add(temp);
            }
            else
            {
                temp.Children.Add(new TreeObject() { Name = s });
            }                              
        }

        Console.WriteLine(BuildList(newlist)); // Item A,Item A.1,Item A.2,Item B,Item B.1,Item B.2,Item C
    }

    static string BuildList(List<TreeObject> nodes)
    {
        StringBuilder output = new StringBuilder();
        BuildList(output, nodes);
        return output.Remove(output.Length - 1, 1).ToString();
    }

    static void BuildList(StringBuilder output, List<TreeObject> nodes)
    {
        foreach (var node in nodes)
        {
            output.AppendFormat("{0},", node.Name);
            BuildList(output, node.Children);
        }
    }
}

public class TreeObject
{
    private List<TreeObject> _children = new List<TreeObject>();

    public string Name { get; set; }
    public Guid Id { get; set; }
    public List<TreeObject> Children { get { return _children; } }
}

}

Rohan West
This is almost the opposite of what I want to do. What I want to end up with is the structure in the 'nodes' variable you show on line 15. I start with a List<TreeObject> with ALL objects directly in it, as siblings (eg, no parent/child relationships are present).
gregmac
+2  A: 

Here's the function I ended up writing. I'm using MPTT to store objects, so the list is in order of the 'left' value, which basically means the parent always comes before any given item in the list. In other words, the item referenced by item.ParentID has always already been added (except in the case of top-level or root nodes).

List<TreeObject> FlatToHierarchy(List<TreeObject> list) {
  // hashtable lookup that allows us to grab references to the parent containers, based on id
  Dictionary<Guid, TreeObject> lookup = new Dictionary<Guid, List<TreeObject>>;
  // actual nested collection to return
  List<TreeObject> nested = new List<TreeObject>;

  foreach(TreeObject item in list) {
    if (lookup.ContainsKey(item.ParentID)) {
      // add to the parent's child list 
      lookup(item.ParentID).Item(item.ParentID).Add(item);
      lookup.Add(item.ID, lookup(item.ParentID));
    } else {
      // no parent added yet (or this is the first time)
      nested.Add(item);
      lookup.Add(item.ID, nested);
    }
  }

  return nested;
}
gregmac
What's the deal with this line: lookup(item.ParentID).Item(item.ParentID).Add(item); ? Is this a typo? You can only access the item by index at that level, no?
ScottE