views:

1517

answers:

2

Hi,

Suppose I have a recursive table (e.g. employees with managers) and a list of size 0..n of ids. How can I find the lowest common parent for these ids?

For example, if my table looks like this:

Id | ParentId
---|---------
 1 |     NULL
 2 |        1
 3 |        1
 4 |        2
 5 |        2
 6 |        3
 7 |        3
 8 |        7

Then the following sets of ids lead to the following results (the first one is a corner case):

[]      => 1 (or NULL, doesn't really matter)
[1]     => 1
[2]     => 2
[1,8]   => 1
[4,5]   => 2
[4,6]   => 1
[6,7,8] => 3

How to do this?

EDIT: Note that parent isn't the correct term in all cases. It's the lowest common node in all paths up the tree. The lowest common node can also be a node itself (for example in the case [1,8] => 1, node 1 is not a parent of node 1 but node 1 itself).

Kind regards, Ronald

+2  A: 

Here's one way of doing it; it uses a recursive CTE to find the ancestry of a node, and uses "CROSS APPLY" over the input values to get the common ancestry; you just change the values in @ids (table variable):

----------------------------------------- SETUP
CREATE TABLE MyData (
   Id int NOT NULL,
   ParentId int NULL)

INSERT MyData VALUES (1,NULL)
INSERT MyData VALUES (2,1)
INSERT MyData VALUES (3,1)
INSERT MyData VALUES (4,2)
INSERT MyData VALUES (5,2)
INSERT MyData VALUES (6,3)
INSERT MyData VALUES (7,3)
INSERT MyData VALUES (8,7)

GO
CREATE FUNCTION AncestorsUdf (@Id int)
RETURNS TABLE
AS
RETURN (
    WITH Ancestors (Id, ParentId)
    AS (
     SELECT Id, ParentId
     FROM MyData
     WHERE Id = @Id
     UNION ALL
     SELECT md.Id, md.ParentId
     FROM MyData md
     INNER JOIN Ancestors a
       ON md.Id = a.ParentId
    )
    SELECT Id FROM Ancestors
);
GO
----------------------------------------- ACTUAL QUERY
DECLARE @ids TABLE (Id int NOT NULL)
DECLARE @Count int
-- your data (perhaps via a "split" udf)
INSERT @ids VALUES (6)
INSERT @ids VALUES (7)
INSERT @ids VALUES (8)

SELECT @Count = COUNT(1) FROM @ids
;
SELECT TOP 1 a.Id
FROM @ids
CROSS APPLY AncestorsUdf(Id) AS a
GROUP BY a.Id
HAVING COUNT(1) = @Count
ORDER BY a.ID DESC


Update if the nodes aren't strictly ascending:

CREATE FUNCTION AncestorsUdf (@Id int)
RETURNS @result TABLE (Id int, [Level] int)
AS
BEGIN
    WITH Ancestors (Id, ParentId, RelLevel)
    AS (
     SELECT Id, ParentId, 0
     FROM MyData
     WHERE Id = @Id
     UNION ALL
     SELECT md.Id, md.ParentId, a.RelLevel - 1
     FROM MyData md
     INNER JOIN Ancestors a
       ON md.Id = a.ParentId
    )

    INSERT @result
    SELECT Id, RelLevel FROM Ancestors

    DECLARE @Min int
    SELECT @Min = MIN([Level]) FROM @result

    UPDATE @result SET [Level] = [Level] - @Min

    RETURN
END
GO

and

SELECT TOP 1 a.Id
FROM @ids
CROSS APPLY AncestorsUdf(Id) AS a
GROUP BY a.Id, a.[Level]
HAVING COUNT(1) = @Count
ORDER BY a.[Level] DESC
Marc Gravell
Thanks. I had a little trouble understanding how it works, but I get it now. Am I correct if I say that the 'ORDER BY a.ID DESC' at the end only works because in the current configuration lower nodes have higher id's?
Ronald Wildenberg
Yes; I'll see if I can do a version that doesn't rely on this...
Marc Gravell
I have come up myself with a version that doesn't require a UDF and that uses the correct ordering. I'll post it ASAP..
Ronald Wildenberg
Hi, I marked my own answer as the answer. It's somewhat more compact and doesn't require a UDF. Thanks for pushing me in the right direction.
Ronald Wildenberg
No problem; glad it helped.
Marc Gravell
+2  A: 

After doing some thinking and some hints in the right direction from Marc's answer (thanks), I came up with another solution myself:

DECLARE @parentChild TABLE (Id INT NOT NULL, ParentId INT NULL);
INSERT INTO @parentChild VALUES (1, NULL);
INSERT INTO @parentChild VALUES (2, 1);
INSERT INTO @parentChild VALUES (3, 1);
INSERT INTO @parentChild VALUES (4, 2);
INSERT INTO @parentChild VALUES (5, 2);
INSERT INTO @parentChild VALUES (6, 3);
INSERT INTO @parentChild VALUES (7, 3);
INSERT INTO @parentChild VALUES (8, 7);

DECLARE @ids TABLE (Id INT NOT NULL);
INSERT INTO @ids VALUES (6);
INSERT INTO @ids VALUES (7);
INSERT INTO @ids VALUES (8);

DECLARE @count INT;
SELECT @count = COUNT(1) FROM @ids;

WITH Nodes(Id, ParentId, Depth) AS
(
    -- Start from every node in the @ids collection.
    SELECT pc.Id , pc.ParentId , 0 AS DEPTH
    FROM @parentChild pc
    JOIN @ids i ON pc.Id = i.Id

    UNION ALL

    -- Recursively find parent nodes for each starting node.
    SELECT pc.Id , pc.ParentId , n.Depth - 1
    FROM @parentChild pc
    JOIN Nodes n ON pc.Id = n.ParentId
)
SELECT n.Id
FROM Nodes n
GROUP BY n.Id
HAVING COUNT(n.Id) = @count
ORDER BY MIN(n.Depth) DESC

It now returns the entire path from the lowest common parent to the root node but that is a matter of adding a TOP 1 to the select.

Ronald Wildenberg
You might have to watch that Depth is from the top down, not the bottom up - which might make a difference. I haven't bothered thinking about every edge case ;-p Looks good though, +1
Marc Gravell
Depth is a little susceptible to interpretation ;-p What orientation does your tree have, for example. I always picture it upside-down (root node on top). Therefore, I subtract 1 for every level 'up' in the tree. Anyway, I think I have the right ordering (at least I hope I do)...
Ronald Wildenberg