views:

111

answers:

3

Given a self referencing table

Item 
-------------
Id (pk)
ParentId (fk)

With a related table of associated values

ItemValue
-------------
ItemId (fk)
Amount

And some sample data

Item                       ItemValues 
Id      ParentId           ItemId      Amount
--------------------       ----------------------
1       null               1           10
2       1                  3           40
3       1                  3           20
4       2                  4           10
5       2                  5           30
6       null
7       6
8       7

I need a sproc to take Item.Id and return the direct children with sums of all ItemValue.Amounts for the them, their children and their children all the way down the tree.

For example, if 1 is passed in, the tree would be 2, 3, 4, 5 the direct children are 2, 3 the output would be

 ItemId    Amount
 ------------------
 2         40     (values from ItemIds 4 & 5)
 3         60     (values from ItemId 3)

What sort of approaches should be applied to make achieve this behavior?

I am considering using a CTE, but am wondering if there is a better/faster approach.

+2  A: 

A recursive CTE like this would work, assuming your hierarchy doesn't go too deep:

declare @ParentId int;
set @ParentId = 1;

;with 
  Recurse as (
    select 
      a.Id as DirectChildId
    , a.Id
    from Item a 
    where ParentId = @ParentId
    union all
    select
      b.DirectChildId
    , a.Id
    from Item a 
    join Recurse b on b.Id = a.ParentId
    )
select
  a.DirectChildId, sum(b.Amount) as Amount
from Recurse a
left join ItemValues b on a.Id = b.ItemId
group by
  DirectChildId;

A non-CTE method would require some form of iteration, cursor-based or otherwise. Since it's a stored proc, its a possibility, and if there's a lot data to recurse through, it would probably scale better, so long as you slice the data appropriately.

If the clustered index is on Id, add a non-clustered index on ParentId. As a covering index, it will satisfy the initial seek w/out a bookmark lookup. The clustered index will then help with the recursive join.

If the clustered index is already on ParentId instead, add a non-clustered index on Id. Together, they will be virtually equivalent to the above. For ItemValues, you may want a index on (ItemId) INCLUDE (Amount), if the actual table is wider than this.

Peter
"assuming your hierarchy doesn't go too deep" it doesn't, but what would be the problem if it did?
Bob
Recursion depth is limited to 100 by default on SQL Server. You can specify as high as 32767 with the MAXRECURSION query hint, but I assume it does not scale linearly with volume above a certain threshold. It doesn't seem an issue for your situation, though, since you will be specifying one parent at a time. So long as you have it indexed properly and recursion depth is reasonable (under 100?), performance shouldn't be an issue.
Peter
Awesome, thanks!
Bob
Thanks for the update, I'd give you a second +1 if I could
Bob
No worries, it's fun.
Peter
You can go to infinite recursion with a CTE. Set MAXRECURSION = 0. It says this in BOL.
gbn
@gbn: I stand corrected. Thanks.
Peter
A: 

Could you store your data as in the nested set model (here is a MySQL reference but the ideas are generic across databases)? If so then the operations to find the value you are looking for would be fairly simple.

Brian Fisher
Unfortunatly the structure is set right now.
Bob
A: 

Does this have to be handled in the database? I would suggest bringing the necessary data into your BLL and performing the recursion there.

Rip Rowan