tags:

views:

189

answers:

2

I'm using LINQ to build up a tree structure of objects from a collection of objects which were retrieved from a stored procedure call.

I want to know if
a) there is any way to remove elements from one collection to a new collection
b) if there is actually any point performance wise in doing this

my code looks something like this:

class MyEntity
{
    int ID { get; set; }
    int? ParentID { get; set; }
    string Name { get; set; }
    List<MyEntity> children = new List<MyEntity>();
    List<MyEntity> Children { get { return children; } }
}

List<MyEntity> initialCollection = //get stuff from DB

List<MyEntity> rootElements = (from e in initialCollection
                               where e.ParentID == null
                               select e).ToList();

List<MyEntity> childElements = (from e in initialCollection
                                where e.ParentID != null
                                select e).ToList();

foreach(MyElement e in rootElements)
    e.Children.AddRange((from c in childElements
                        where c.ParentID == e.ID
                        select c).ToList());
//do some more recursion

So basically; is there a way to do the select statement, where I actually remove the elements from initialCollection as I select them. The idea being to reduce the number of elements to search through while recursively building up my tree. Would there actually be any benefit in doing this, or is the overhead of removing elements from one collection and adding to another too great?

+3  A: 

A much better idea would be to create a lookup:

var childElements = initialCollection.Where(e => e.ParentID != null)
                                     .ToLookup(e => e.ParentID);

foreach (MyElement e in rootElements)
{
    if (childElements.Contains(e.ID))
    {
        e.Children.AddRange(childElements[e.ID]);
    }
}

A lookup is a bit like a Dictionary<TKey, IEnumerable<TValue>> - so basically you work out which children belong to which parent, then add them all in.

I think the call to Contains is necessary in case you have any root elements with no children - I would expect the indexer to throw an exception if the specified key doesn't exist. The docs aren't very clear though - it may return an empty sequence instead.

Jon Skeet
Brilliant, a really nice clean solution there. Cheers.
mdresser
+2  A: 

a) I don't think you can do this for a couple of reasons. Firstly, the linq operators are used to evaluate expressions and so don't affect the source collection(s) (or have any other side-effects). Secondly, iterators don't allow the source collection to be modified during iteration so there isn't any opportunity to remove elements during the select anyway.

b) It's unlikely this would give any performance benefit - removing items from a List is O(n) so removing m items would be O(mn) which could be quite slow if there are a lot of items to remove. Unless you're really pushed for space it's better just to create a copy and use that, and in that case you'd want a different data structure anyway.

Lee