tags:

views:

58

answers:

2

I have a data table that contains a one-to-many self referential relationship:

Plant
{ 
  ID,
  Name,
  ParentID
}

I'm trying to create a linq query that will tell me the total number of descendants stemming from one Plant.

Example:

I have 5 plants:

One {ID = 1, Name = Pine, Parent = null};// This is the root
    Two {ID = 2, Name = Evergreen, Parent = 1}
        Three {ID = 3, Name = Winter Evergreen, Parent = 2}
    Four {ID = 4, Name = Christmas Tree, Parent = 1}
Five {ID = 5, Name = Maple, Parent = null};// This is the root

When I call my LINQ query with an input of ID = 1, I want it to return 3, because there are 3 descendants of One; Two, Three and Four. Five is not a decedent of One.

The only way I can think about doing this involves nested recursive linq queries on the inner results. I have no idea how to do this and I feel as though there is an easier way.

I'm using C# 4.0 and LINQ if that matters.

A: 

EDIT adding code that supports LINQ to SQL thanks to @Kirk Woll comment.

class Program
{
    private static Table<Plant> plants;

    static void Main(string[] args)
    {
        using (var dataContext = new DataClasses1DataContext())
        {
            plants = dataContext.Plants;
            var treesNodes = GetTreesNodes(plants.Where(plant => plant.ID == 1));
            Console.WriteLine(string.Join(",",
                                          treesNodes.Select(plant => plant.ID)));
        }
    }

    public static IEnumerable<Plant> GetTreesNodes(IEnumerable<Plant> roots)
    {
        if(!roots.Any())
        {
            return roots;
        }

        var children = roots.SelectMany(root => 
                                    plants.Where(plant => plant.Parent == root));
        return roots.Union(GetTreesNodes(children));
    }

}

Previous version that match LINQ to Objects:

This method can provide all the nodes in the tree:

public IEnumerable<Plant> GetTreesNodes(IEnumerable<Plant> roots)
{
    if(!roots.Any())
    {
        return Enumerable.Empty<Plant>();
    }

    var rootsIds = roots.Select(root => root.ID);
    var children = plants.Where(plant => rootsIds.Contains(plant.Parent));
    return roots.Union(GetTreesNodes(children));
}

It can be used as in the following example:

[Test]
public void ExampleTest()
{
    var One = new Plant() {ID = 1, Name = "Pine", Parent = 0}; 
    var Two = new Plant() {ID = 2, Name = "Evergreen", Parent = 1};
    var Three = new Plant() {ID = 3, Name = "Winter Evergreen", Parent = 2};
    var Four = new Plant() {ID = 4, Name = "Christmas Tree", Parent = 1};
    var Five = new Plant() {ID = 5, Name = "Maple", Parent = 0};

    plants = new []{One,Two,Three,Four,Five};

    var nestedElements = GetTreesNodes(new[]{One});
    var numOfNestedElementsWithoutRoot = nestedElements.Count()-1;

    Assert.That(numOfNestedElementsWithoutRoot, Is.EqualTo(3));
}

The code assumes there are no cyclic references.

Elisha
She mentions "data table" and "one to many relationship" so I think she needs something that will translate into SQL. And this won't do that. In other words, she needs something that will work with IQueryable.
Kirk Woll
@Kirk Woll, you have a point there. If it's a loaded data table then LinqToObjects will probably solve it (even though there may be more efficient solutions). If it's a "live" table where it'll need to translate to SQL then the code might not work.
Elisha
@Kirk Woll, on second thought, I can't find the part which will have difficulty translating to SQL. Can you point the problematic code? There's a good chance you're right, but I can't find the problem myself :)
Elisha
@Kirk Woll, you're right, the original code failed to translate to SQL. Updated the code to a version that translates to SQL.
Elisha
Thanks! I'm trying to get your edit to work, but I'm having trouble with the `plants` part in the var children assignment. Should it be `roots`? I see it references the static Table<Plant>, but I get an error declaring a Table<>.
Erica
@Elisha, your changes are functional, but it should be noted that if the number of elements returned is, say, 25, then **26 separate SELECT statements** will have to be run. Therefore, the CTE solution I mention should be used if performance becomes an issue.
Kirk Woll
@Kirk: I'll probably end up using your solution eventually for the very reason of performance. I'm more or less just trying to get my little project working and better understand how linq works. Then I can improve performance.
Erica
@Erica, plants is a field on the class (it's stored as field in order to avoid passing it as argument to *GetTreesNodes*)
Elisha
Thanks Elisha. I actually got your method to work. I ended up using the SQL query because its better performance wise.
Erica
A: 

If you were using SQL Server, the query you would want constructed would be:

DECLARE @TargetElementId int
SET @TargetElementId = 1;

WITH Plants AS
(
    SELECT ID, Name, ParentId
    FROM PlantsTable
    WHERE ParentId = @TargetElementId

    UNION ALL

    SELECT ID, Name, ParentId
    FROM PlantsTable
    INNER JOIN Plants ON PlantsTable.ParentId = Plants.ID
)
SELECT COUNT(ID) - 1 FROM Plants

I don't believe this can be done with LinqToSql, see http://stackoverflow.com/questions/584841/common-table-expression-cte-in-linq-to-sql, which is a question of a similar nature.

Kirk Woll
This is excellent and faster than the linq solution. I'll end up using this. I just need to brush off my SQLConnection stuff in .Net. ExecuteScalar() here I come!
Erica
This is what I ended up implementing. It works. I think I'll probably have to modify it so that the Parameter is actual read from another query, but that's for another day.
Erica