views:

87

answers:

5

I am having extreme difficulty constructing a query which returns an XML style hierarchy.

We have a database table which contains a hierarchy of URLs for our website. The table contains the columns: ID, URL, DisplayName, ParentID, ItemOrder

The parent ID forms a recursive relationship between the current item and it's parent. The item should site below it's parent in the hierarchy and it should also be ordered using the item order against items at the same level in the hierarchy.

I have managed to get a recursive query working so it drills down the hierarchy sequentially but I cannot order this by the item order as well.

My current query is below:

WITH Parents AS
(
SELECT MenuItemId, URL, ParentItemId, ItemOrder
FROM CambsMenu

UNION ALL

SELECT si.MenuItemId, si.URL, si.ParentItemId, si.ItemOrder
FROM CambsMenu si INNER JOIN Parents p
ON si.ParentItemId = p.MenuItemId
)

SELECT DISTINCT *
FROM Parents
+1  A: 
WITH Parents AS
(
SELECT MenuItemId, URL, ParentItemId, ItemOrder, 0 AS Level, Cast((ItemOrder+1000) as Varchar(MAX)) as MatPath
FROM CambsMenu
WHERE ParentItemId IS NULL

UNION ALL

SELECT si.MenuItemId, si.URL, si.ParentItemId, si.ItemOrder, Level + 1, MathPath + '.' CAST((si.ItemOrder+1000) as Varchar(MAX)
FROM CambsMenu si INNER JOIN Parents p
ON si.ParentItemId = p.MenuItemId
)

SELECT DISTINCT *
FROM Parents
ORDER BY MatPath

EDIT: Answer updated, originally was sorting by Level, which was not asked of. Also, the answer is not tested. Updated again, the seed query was not filtering on IS NULL

EDIT2: Here's an update that will use floats and subquery to get the maximum number of leafs/branches; an assumption is made that the ItemOrder is ascending, starting with 1, with no holes and that it is restarted for each parent. This could be converted back to using integers as then it will be more obvious how the sorting can overflow/loose precision with number of levels.

WITH Hierarchy AS
(
  SELECT MenuItemID, 
         URL, 
         ParentItemId, 
         ItemOrder,
         0 as level, 
         cast(1 as float) as hord
  FROM CambsMenu
  WHERE ParentItemId IS NULL 
UNION ALL
  SELECT r.MenuItemId, 
         r.URL, 
         r.PrentItemId,
         r.ItemOrder, 
         h.level + 1, 
         h.hord + r.ItemOrder/power(
           (SELECT MAX(n)+1 
            FROM (SELECT count(*) AS n 
                  FROM CambsMenu 
                  GROUP BY ParentItemId) t), h.level+1)
  FROM CambsMenu r INNER JOIN Hierarchy h
  ON r.ParentItemId = h.MenuItemId
)
SELECT *
FROM Hierarchy
ORDER BY hord;
Unreason
But in this case, you get all items on level 0, then all on level 1, then all on level 2 and so forth. You're not getting parent 0, then all its level-1 children, and their level-2 grandchildren included.....
marc_s
Thanks for the reply. Marc is right though, the ordering is incorrect and I also found that the level is not incremented correctly.
Ben
@marc_s, yes I realized I misread the question, updated to build the materialized path (with assumption of max 8999 items per branch)
Unreason
@Ben, answer is updated; if the syntax is wrong please try to correct yourself, I don't have MS SQL laying around to test it; syntax snipped from http://smehrozalam.wordpress.com/2009/06/05/tsql-challenge-8-using-recursive-cte-for-a-hierarchical-relationship/ and http://msdn.microsoft.com/en-us/library/ms186243.aspx
Unreason
+1  A: 

Is the number of siblings a known value? Is the number of levels known? If so, you can perform operations over the ItemOrder, to guarantee that every item has a unique ItemOrder, and then just sort by that value.

For example, suppose that any item can't have more than 10 childs (ItemOrder ranges from 0 to 9) and there are at most 5 levels. What I'm going to do now, is to make the first parent ItemOrder to be 10000 time it's current item order, ant it's childer ItemOrder would be 1000 times it's current ItemOrder plus it's parent ItemOrder, and so on, removing a 0 each time you go a level down.

WITH Parents AS
(
SELECT MenuItemId,
    URL,
    ParentItemId,
    (ItemOrder * 10000) AS ItemOrder,
    10000 AS Multiplier
FROM CambsMenu
WHERE ParentItemId IS NULL

UNION ALL

SELECT si.MenuItemId,
    si.URL,
    si.ParentItemId,
    (p.ItemOrder + si.ItemOrder * p.Multiplier/ 10) as ItemOrder,
    (p.Multiplier / 10) as Multiplier
FROM CambsMenu si INNER JOIN Parents p
ON si.ParentItemId = p.MenuItemId
)

SELECT * FROM Parents ORDER BY ItemOrder

If the number of levels or children is unknown, you can go with a similar approach but instead of building a numeric ItemOrder you can build a string ItemOrder, guaranteeing that the string '1.10.20' is lower than the string '2.1'

Fede
Thanks for your help guys. I tried doing it using strings as you suggested Fede but the trouble I found was that the 10.1 is ranked before something like than 2.1 because numbers are ordered before any other string values.I might have to pad it out with preceding zeros.
Ben
@Ben, yes that will always be a bit of a problem; however since you have the ItemOrder field you could use an approach similar to my last answer and precompute the sort order (if your data is rarely updated/inserted)
Unreason
A: 

SQL does not support a 'hierarchy' or 'tree' or 'graph' type, because SQL/the relational model were essentially invented with the purpose of rendering (the need for) those types obsolete.

You have written a query that computes what is known in mathematical terms as a "transitive closure". I doubt that this is really what you want. If a relation ("table") has the pairs (1 2) and (2 3), then your query would include the resulting pair (1 3). However, (in this example) I suspect that you wouldn't want your XML-style result to include a tag holding the number 3 as a direct child of number 1 ...

I suspect what you want is more likely to be achieved by using the GROUP operator of the relational algebra. Caveat : this is not really the same thing as "GROUP BY" (the GROUP operator of the relational algebra produces tables that contains columns whose value is itself a table - e.g. a table holding all the direct children of some parent), and it is quite likely that your particular DBMS doesn't support it, in which case you're left pretty much "abandoned by your DBMS" and "with no other option than to code all the freaking shit (by this I mean the recursion in particular) yourself".

Erwin Smout
SQL supports recursion in new standards, his query would not return "transitive closure" (look up recursion and work out the question; the recursive query joins on parentid=id) and he uses DISTINCT so he'll be fine. Also, there are other ways to deal with hierarchies in relational systems (http://onlamp.com/pub/a/onlamp/2004/08/05/hierarchical_sql.html). Also, a claim that relational model is invented to render those types obsolete is bogus; though it addressed problems that network and hierarchical model had. However, there is nothing wrong with hierarchical relationships by themselves.
Unreason
+1  A: 

Hi All

Thanks for all your responses!

Just for completeness, here is the final solution that I came up with. It creates a string which is separated into sub sections by dots. The solution below will only support upto 9999 items in the root node but you could easily extend this by increasing the number of leading zeros by simply changing the number in the STR(ItemOrder,4) command

WITH Parents AS
(
SELECT MenuItemId,
    URL,
    ParentItemId,
    DisplayName,
    OpenInNewWindow,
    ItemOrder,
    CAST((REPLACE(STR(ItemOrder,4),' ','0')) AS nvarchar(max)) AS OrderString
FROM CambsMenu
WHERE ParentItemId IS NULL

UNION ALL

SELECT si.MenuItemId,
    si.URL,
    si.ParentItemId,
    si.DisplayName,
    si.OpenInNewWindow,
    si.ItemOrder,
    (p.OrderString + '.' + CAST((REPLACE(STR(si.ItemOrder,4),' ','0')) AS nvarchar(max))) AS OrderString
FROM CambsMenu si INNER JOIN Parents p
ON si.ParentItemId = p.MenuItemId
)

SELECT * FROM Parents ORDER BY OrderString ASC
Ben
How to do, if you don't need to override ItemOrder all the time? Anyway.. if all your rows has ItemOrder, your solution will do.
Michael Buen
+2  A: 

Normal hierachical approach:

select *
into emp
from 
(values
(1, 'President', NULL),
(2, 'Vice President', 1),
(3, 'CEO', 2),
(4, 'CTO', 2),
(5, 'Group Project Manager', 4),
(6, 'Project Manager 1', 5),
(7, 'Project Manager 2', 5),
(8, 'Team Leader 1', 6),
(9, 'Software Engineer 1', 8),
(10, 'Software Engineer 2', 8),
(11, 'Test Lead 1', 6),
(12, 'Tester 1', 11),
(13, 'Tester 2', 11),
(14, 'Team Leader 2', 7),
(15, 'Software Engineer 3', 14),
(16, 'Software Engineer 4', 14),
(17, 'Test Lead 2', 7),
(18, 'Tester 3', 17),
(19, 'Tester 4', 17),
(20, 'Tester 5', 17)
) as x(emp_id, emp_name, mgr_id)

Query:

with recursive org(emp_id, emp_name, emp_level, mgr_id, sort) as
(
 select  
  a.emp_id, a.emp_name, 0, a.mgr_id,  
  a.emp_name
 from emp a
 where a.mgr_id is null

 union all

 select 
  b.emp_id, b.emp_name, emp_level + 1, b.mgr_id, 

  sort || ' : ' || b.emp_name 

 from emp b
 join org on org.emp_id = b.mgr_id
)
select 
 emp_id, repeat(' ', emp_level * 2) || emp_name as emp_name, sort 
from org
order by sort

Output:

 emp_id |            emp_name             |                                                        sort                                                        
--------+---------------------------------+--------------------------------------------------------------------------------------------------------------------
      1 | President                       | President
      2 |   Vice President                | President : Vice President
      3 |     CEO                         | President : Vice President : CEO
      4 |     CTO                         | President : Vice President : CTO
      5 |       Group Project Manager     | President : Vice President : CTO : Group Project Manager
      6 |         Project Manager 1       | President : Vice President : CTO : Group Project Manager : Project Manager 1
      8 |           Team Leader 1         | President : Vice President : CTO : Group Project Manager : Project Manager 1 : Team Leader 1
      9 |             Software Engineer 1 | President : Vice President : CTO : Group Project Manager : Project Manager 1 : Team Leader 1 : Software Engineer 1
     10 |             Software Engineer 2 | President : Vice President : CTO : Group Project Manager : Project Manager 1 : Team Leader 1 : Software Engineer 2
     11 |           Test Lead 1           | President : Vice President : CTO : Group Project Manager : Project Manager 1 : Test Lead 1
     12 |             Tester 1            | President : Vice President : CTO : Group Project Manager : Project Manager 1 : Test Lead 1 : Tester 1
     13 |             Tester 2            | President : Vice President : CTO : Group Project Manager : Project Manager 1 : Test Lead 1 : Tester 2
      7 |         Project Manager 2       | President : Vice President : CTO : Group Project Manager : Project Manager 2
     14 |           Team Leader 2         | President : Vice President : CTO : Group Project Manager : Project Manager 2 : Team Leader 2
     15 |             Software Engineer 3 | President : Vice President : CTO : Group Project Manager : Project Manager 2 : Team Leader 2 : Software Engineer 3
     16 |             Software Engineer 4 | President : Vice President : CTO : Group Project Manager : Project Manager 2 : Team Leader 2 : Software Engineer 4
     17 |           Test Lead 2           | President : Vice President : CTO : Group Project Manager : Project Manager 2 : Test Lead 2
     18 |             Tester 3            | President : Vice President : CTO : Group Project Manager : Project Manager 2 : Test Lead 2 : Tester 3
     19 |             Tester 4            | President : Vice President : CTO : Group Project Manager : Project Manager 2 : Test Lead 2 : Tester 4
     20 |             Tester 5            | President : Vice President : CTO : Group Project Manager : Project Manager 2 : Test Lead 2 : Tester 5
(20 rows)

Now let's override sorting on Group Project Managers, let's make Project Manager 2 come before 1, and Project Manager 1 come after Project Manager 2. Let's also make tester 4 comes before 3, and tester 3 comes after tester 4

alter table emp add column order_override int null;

update emp set order_override = 1 where emp_id = 7; -- PM 2
update emp set order_override = 2 where emp_id = 6; -- PM 1

update emp set order_override = 1 where emp_id = 19; -- Tester 4
update emp set order_override = 2 where emp_id = 18; -- Tester 3

Query:

with recursive org(emp_id, emp_name, emp_level, mgr_id, sort) as
(
 select  
  a.emp_id, a.emp_name, 0, a.mgr_id,  
  a.emp_name
 from emp a
 where a.mgr_id is null

 union all

 select 
  b.emp_id, b.emp_name, emp_level + 1, b.mgr_id, 

  sort || ' : ' || coalesce( lpad(order_override::text, 10, '0'), b.emp_name )
 from emp b
 join org on org.emp_id = b.mgr_id
)
select 
 emp_id, repeat(' ', emp_level * 2) || emp_name as emp_name, sort 
from org
order by sort

Output:

 emp_id |            emp_name             |                                                    sort                                                     
--------+---------------------------------+-------------------------------------------------------------------------------------------------------------
      1 | President                       | President
      2 |   Vice President                | President : Vice President
      3 |     CEO                         | President : Vice President : CEO
      4 |     CTO                         | President : Vice President : CTO
      5 |       Group Project Manager     | President : Vice President : CTO : Group Project Manager
      7 |         Project Manager 2       | President : Vice President : CTO : Group Project Manager : 0000000001
     14 |           Team Leader 2         | President : Vice President : CTO : Group Project Manager : 0000000001 : Team Leader 2
     15 |             Software Engineer 3 | President : Vice President : CTO : Group Project Manager : 0000000001 : Team Leader 2 : Software Engineer 3
     16 |             Software Engineer 4 | President : Vice President : CTO : Group Project Manager : 0000000001 : Team Leader 2 : Software Engineer 4
     17 |           Test Lead 2           | President : Vice President : CTO : Group Project Manager : 0000000001 : Test Lead 2
     19 |             Tester 4            | President : Vice President : CTO : Group Project Manager : 0000000001 : Test Lead 2 : 0000000001
     18 |             Tester 3            | President : Vice President : CTO : Group Project Manager : 0000000001 : Test Lead 2 : 0000000002
     20 |             Tester 5            | President : Vice President : CTO : Group Project Manager : 0000000001 : Test Lead 2 : Tester 5
      6 |         Project Manager 1       | President : Vice President : CTO : Group Project Manager : 0000000002
      8 |           Team Leader 1         | President : Vice President : CTO : Group Project Manager : 0000000002 : Team Leader 1
      9 |             Software Engineer 1 | President : Vice President : CTO : Group Project Manager : 0000000002 : Team Leader 1 : Software Engineer 1
     10 |             Software Engineer 2 | President : Vice President : CTO : Group Project Manager : 0000000002 : Team Leader 1 : Software Engineer 2
     11 |           Test Lead 1           | President : Vice President : CTO : Group Project Manager : 0000000002 : Test Lead 1
     12 |             Tester 1            | President : Vice President : CTO : Group Project Manager : 0000000002 : Test Lead 1 : Tester 1
     13 |             Tester 2            | President : Vice President : CTO : Group Project Manager : 0000000002 : Test Lead 1 : Tester 2
(20 rows)

Without the sort column in data projection:

with recursive org(emp_id, emp_name, emp_level, mgr_id, sort) as
(
 select  
  a.emp_id, a.emp_name, 0, a.mgr_id,  
  a.emp_name
 from emp a
 where a.mgr_id is null

 union all

 select 
  b.emp_id, b.emp_name, emp_level + 1, b.mgr_id, 

  sort || ' : ' || coalesce( lpad(order_override::text, 10, '0'), b.emp_name )
 from emp b
 join org on org.emp_id = b.mgr_id
)
select 
 emp_id, repeat(' ', emp_level * 2) || emp_name as emp_name
from org
order by sort

Output:

 emp_id |            emp_name             
--------+---------------------------------
      1 | President
      2 |   Vice President
      3 |     CEO
      4 |     CTO
      5 |       Group Project Manager
      7 |         Project Manager 2
     14 |           Team Leader 2
     15 |             Software Engineer 3
     16 |             Software Engineer 4
     17 |           Test Lead 2
     19 |             Tester 4
     18 |             Tester 3
     20 |             Tester 5
      6 |         Project Manager 1
      8 |           Team Leader 1
      9 |             Software Engineer 1
     10 |             Software Engineer 2
     11 |           Test Lead 1
     12 |             Tester 1
     13 |             Tester 2
(20 rows)

Project Manager 2 comes before Project Manager 1. Tester 4 comes before Tester 3

The technique lies in numeric text substitution for b.name if there's an order_override(non-null):

sort || ' : ' || coalesce( lpad(order_override::text, 10, '0'), b.emp_name )

Above code is Postgres, to convert to Sql Server, remove the word RECURSIVE, change REPEAT to REPLICATE, || to +.

Equivalent of...

lpad(order_override::text, 10, '0')

...is:

RIGHT( REPLICATE('0',10) + CONVERT(VARCHAR, order_override), 10)
Michael Buen