views:

202

answers:

5

I use SQL Server 2000.

Suppose I have two tables like the following:

Area
----------------------------------
ID| Name   | HierarchyLevel
----------------------------------
1 | World  |     1
2 | America|     2
3 | Europe |     2
4 | Africa |     2
5 | USA    |     3

and

AreaHierarchy
------------------------
ID | ParentID | ChildID
------------------------
 1 |   1      |    2
 2 |   1      |    3
 3 |   1      |    4
 4 |   2      |    5

where

AreaHierarchy.ParentID and AreaHierarchy.ChildID are FKs of Area.ID

How can I find the nth parent of USA?

Is it possible without looping?

Probably not.

+1  A: 

Should work

CREATE PROCEDURE find_nth_parent 
    @id INT,
    @level INT
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @counter INT
    SET @counter = 1

    DECLARE @currentItem INT
    DECLARE @currentItemNew INT

    SET @currentItem = @id

    WHILE @counter <= @level
    BEGIN
     SET @currentItemNew = NULL
     SELECT @currentItemNew = ParentID FROM AreaHierarchy WHERE ChildId = @currentItem
     IF @currentItemNew IS NULL
     BEGIN
      SELECT NULL
      RETURN 
     END
     SET @currentItem = @currentItemNew
     SET @counter = @counter + 1
    END
    SELECT @currentItem
END

Calling

EXEC find_nth_parent 5,2

returns 1 which means "World" (2nd parent), calling

EXEC find_nth_parent 5,1

return 2, which means "America" (1st parent).

Hope it helps

Lukasz Lysik
Is it possible without Looping?
JMSA
it's possible without looping...in that you can create a SQL string with as many levels of joins as you need and then execute that dynamically...pain though
David Andres
The is no single query that could do this for "n depth" without using Common Table Experssions. That still has to do recurssion, whcih is essnetially a novel kind of loop. But it avoids T-SQL loops, at which TSQL itself isn't actually that efficient. In CTEs the loop is inside the database engine, which is much much faster...
Dems
+1  A: 

You could use recursion. If you have SQL Server 2005 or newer you can use Common Table Expressions. If not you realistically need to use User Defined Functions.


An example of a UDF to do that could be...

CREATE FUNCTION get_nth_parent(area_id AS INT, n as INT)
RETURNS INT
AS

IF (n = 0) RETURN area_id

DECLARE @return INT
SELECT
   @return = dbo.get_nth_parent(AreaHierarchy.ParentID, n-1)
FROM
   AreaHierarchy
WHERE
   ChildID = area_id

RETURN @return


An example using Common Table Experessions could be...

DECLARE @hierarchy TABLE (
   parent_id  INT,
   child_id   INT
)
INSERT INTO @hierarchy SELECT 1,2
INSERT INTO @hierarchy SELECT 1,3
INSERT INTO @hierarchy SELECT 1,4
INSERT INTO @hierarchy SELECT 2,5


;WITH
   relative_distance (
      child_id,
      parent_id,
      distance
   )
AS
(
   SELECT
      child_id,
      parent_id,
      1
   FROM
      @hierarchy

   UNION ALL

   SELECT
      [relative_distance].child_id,
      [hierarchy].parent_id,
      [relative_distance].distance + 1
   FROM
      [relative_distance]
   INNER JOIN
      @hierarchy AS [hierarchy]
         ON [hierarchy].child_id = [relative_distance].parent_id
)

SELECT
   parent_id
FROM
   [relative_distance]
WHERE
   child_id = 5
   AND distance = 2
Dems
+1  A: 

No loops, no recursion

The best thing is to add additional field in your second table, that would be called ie. Parents and would simply store parent IDs in a string like:

AreaHierarchy
------------------------------------
ID | ParentID | ChildID | Parents
------------------------------------
 1 |    1     |    2    | 1/
 2 |    1     |    3    | 1/
 3 |    1     |    4    | 1/
 4 |    2     |    5    | 1/2/

This way you can easily get to any parent in the branch without recursion or any other complicated procedure. The cost in processing is very small you just copy parent's Parents value and add one more ID. And since you probably need to read more than write/update, this is the best solution to your problem.

And if I were you, I'd just keep one table for the data you have. Join both tables into one. Level could also be computed based on counting slashes in Parents varchar value but I wouldn't recommend doing that.

Additional 'catch' you should be aware of

If your data is mostly reads/writes and much less updates, this structure is really performant. But if your table does a lot more updates than read/writes, you should avoid this technique. Why? Imagine you have a very deep tree with lots of children. Changing a parent of some node high up in near the root would mean you should update Parents of the whole subtree nodes.

Robert Koritnik
Why use only one table? Querying becomes hard in that way.
JMSA
For large hierarchies, creating and maintain this "parents" field can become very inefficient. But for relatively static structures, this 'precalculation' is quite quick. Where the string becomes long, parsing it can get realtively slow, maybe even slower than a recursive CTE.
Dems
@Dems, this technique should work well in conjunction with App programming languages like C#.
JMSA
For large hierarchies this "parents" feature can become really deep recursive call as well, doesn't it? As I said it depends on what his data is about to accomplish.
Robert Koritnik
That's true, you could parse the string in the applicaiton. But then you tightly bind the database implementation to the application, which is bad practice in my world. It's all about scale and maintainability though.
Dems
Why in the world would one table make queries harder? His structure assumes one parent per node, so it is in no way more complicated.
Robert Koritnik
I agree, having a second table to hold this data is more normalised but superfluous (in my opinion) and certainly doesn't make anything any harder...
Dems
... or easier either.
Robert Koritnik
that's what I meant. 2 tables = superfluous. 1 table != harder. :)
Dems
@Dems, for this moment this technique works well for me. So I am accepting this.
JMSA
If you're going to maintain this field (probably using a trigger?) then consider going the whole hog. Create a second table which has the fields [child_id,parent_id,depth] and precalculate every tree traversal. That would slow any alterations to the tree, use more disk space, but also give the fastest route to answering the nth_parent question.
Dems
@JMSA: also be aware of the additional 'catch' I've put in my answer.
Robert Koritnik
A: 

In SQL Server 2005+, you'd use a CTE in a function:

create function get_parent(@child as int, @parent_level as int)
returns int
as
begin
    declare @parent int

    ;with parentage as (
         select 
             h.parent_id, 
             h.child_id,
             0 as level
         from 
             areahierarchy h
         where
             h.child_id = @child
         union all
         select
             h.parent_id,
             h.child_id,
             p.level + 1 as level
         from
             areahierarchy h
             inner join parentage p on
                 h.parent_id = p.child_id
         where
             p.level < @parent_level
    )

    select @parent = p.child_id from parentage p 
    where level = (select max(level) from parentage)

    return @parent
end
Eric
oh come on, use a better CTE name than "p" and I'll guive you +1! That's just sloppy and lazy :)
Dems
and while doing it you can omit the semicolon, can't you?
Robert Koritnik
No, CTE's need the preceding TSQL statement to terminate with a semi colon (afaik). I guess it's legacy due to other uses of the WITH key word. I'm convinced they could have found a better keyword, but I wasn't there to see why they chose WITH :)
Dems
don't really know. I've never used semicolons in the past and CTEs worked as well. But I don't know whether there were any sentences before them. Maybe they started my SPs. Yours starts with a declaration.
Robert Koritnik
Changed to be `parentage`. Also, you can omit the `;`, but it's just a force of habit.
Eric
@Dems: WITH is the ANSI standard keyword. I think it originated in DB2, but don't quote me. ;)
Shannon Severance
A: 

I understand that you want support back to SQL Server 2000, but I think it should be noted that the SQL Server 2008 Hierarchy ID function GetAncestor() does exactly what you're looking for.

Paul Keister