views:

280

answers:

4

Hi there,

Suposse you have to following tables where a sale consists of products and a product can be placed in multiple categories. Whereby categories have a hierachly structure like:

Man
 Shoes
  Sport
  Casual
 Watches
Women
 Shoes
  Sport
  Casual
 Watches

Tables:

Sale:
    id name 
    1  Sale1

Product:
    id saleidfk name 
    1  1        a
    2  1        b
    3  1        c
    4  1        d
    5  1        e

ProductCategory :
    productid categoryid 
    1         3
    2         3           
    3         4
    4         5
    5         10     

Category:
    id ParentCategoryIdFk name 
    1  null               Men
    2  1                  Shoes
    3  2                  Sport
    4  2                  Casual
    5  1                  Watches
    6  null               Women
    7  6                  Shoes
    8  7                  Sport
    9  7                  Casual
    10 6                 Watches

Question:

Now on my website I want to create a control where only the categories are shown of a certain sale and where the categories are filled with the products of the sale. I also want to include the hierachly structure of the categories. So if we have a leave node, recusivly go up to the top node.

So with sale1 I should have a query with the following result:

Men
  Shoes
    Sport
    Casual
  Watches
Women
  Watches

This thing is driving me crazy :-)

Thanks in advance!

Gr

Martijn

A: 

Try something like this - the basic CTE to get a hierarchical listing of your categories would be similar to this:

WITH Categories AS
(
    SELECT Cat.ID, Cat.NAME, Cat.ParentCategoryID, CAST('none' AS VARCHAR(50)) AS 'ParentCategory', 1 AS 'Level'
    FROM dbo.MBCategory Cat
    WHERE Cat.ParentCategoryID IS NULL

    UNION ALL

    SELECT Cat.ID, Cat.NAME, Cat.ParentCategoryID, c2.NAME AS 'ParentCategory', LEVEL + 1
    FROM dbo.MBCategory CAT
    INNER JOIN Categories c2 ON cat.ParentCategoryID = c2.ID
)
SELECT * FROM Categories

Now what you need to do is join your other tables to this CTE, to get the following query in the end:

WITH Categories AS
(
    SELECT Cat.ID, Cat.NAME, Cat.ParentCategoryID, CAST('none' AS VARCHAR(50)) AS 'ParentCategory', 1 AS 'Level'
    FROM dbo.MBCategory Cat
    WHERE Cat.ParentCategoryID IS NULL

    UNION ALL

    SELECT Cat.ID, Cat.NAME, Cat.ParentCategoryID, c2.NAME AS 'ParentCategory', LEVEL + 1
    FROM dbo.MBCategory CAT
    INNER JOIN Categories c2 ON cat.ParentCategoryID = c2.ID
)
SELECT DISTINCT s.*, c.*
FROM dbo.Sale s
INNER JOIN dbo.Product p ON p.SaleID = s.ID
INNER JOIN dbo.ProductCategory pc ON p.ID = pc.ProductID
INNER JOIN Categories c ON pc.CategoryID = c.ID
ORDER BY Level

This gives me a resulting output something like:

ID  Name   CatID  CatName  ParentCatID  ParentCatName Level
 1  Sale1    5    Watches      1            Men         2
 1  Sale1   10    Watches      6            Women       2 
 1  Sale1    3    Sport        2            Shoes       3
 1  Sale1    3    Sport        2            Shoes       3
 1  Sale1    4    Casual       2            Shoes       3
marc_s
It comes close but there are missing bits. Because I dont know for example if shoes has a parent as well.
Martijn B
I doubt you'll be able to get this whole hierarchy thing (up any number of levels) to work in a SQL query.......
marc_s
+1  A: 

I think you'll get the fastest performance and also much cleaner SQL queries if you create an additional table that lists all the ancestor categories (parent, grandparent, etc.) for every category, something like this:

CategoryAncestor
ID   categoryid   ancestorid    
1         1            1      -- Men, obligatory self reference (makes queries easier)
2         2            2      -- Shoes, self reference 
3         2            1      -- Shoes is a subcategory of Men
4         3            3      -- Sport, self reference 
5         3            2      -- Sport is a subcategory of Shoes
6         3            1      -- Sport is ALSO a subcategory of Men
-- etc.

This will create a little more SQL overhead when you insert new categories or delete them, but will allow you to run you query much faster.

The next thing you may want to consider doing is adding rank and level columns to categories (again, more work when creating and deleting categories):

id ParentCategoryIdFk name       level    rank
1  null               Men           0       1
2  1                  Shoes         1       2
3  2                  Sport         2       3
4  2                  Casual        2       4
5  1                  Watches       1       5
6  null               Women         0       6
7  6                  Shoes         1       7
8  7                  Sport         2       8
9  7                  Casual        2       9
10 6                 Watches        1      10

The rank column specifies the sort order.

Then, you can simply run the following query:

SELECT * FROM Category c
  WHERE c.id IN (
    SELECT ancestorid FROM CategoryAncestor ca, ProductCategory pc, Product p
      WHERE p.id = pc.productid 
        AND pc.categoryid = ca.categoryid
        AND p.saleidfk = 1
    )
  ORDER BY rank

Hope this helps.

Zach Smith
Thanks for your time, your true but I am not in the position to change this now. Thanks do!
Martijn B
A: 

Is a bit messy, but:

DROP TABLE #Sale
GO
DROP TABLE #PRoduct
GO
DROP TABLE #ProductCategory
GO
DROP TABLE #Category
GO
CREATE TABLE #Sale 
(
    ID INT,
    Name VARCHAR(20)        
    )
GO
INSERT INTO #Sale SELECT 1, 'Sale1'
GO
CREATE TABLE #Product 
(
ID INT,
saleidfk INT,
name VARCHAR(20)
)
GO
INSERT INTO #Product 
SELECT 1,1,'a'
UNION
SELECT 2,1,'b'
UNION
SELECT 3,1,'c'
UNION
SELECT 4,1,'d'
UNION
SELECT 5,1,'e'
UNION
SELECT 6,1,'f'
GO
CREATE TABLE #ProductCategory 
(
ProductID INT,
CategoryID INT
)
GO
INSERT INTO #ProductCategory
SELECT 1,3
UNION
SELECT 2,3
UNION
SELECT 3,4
UNION
SELECT 4,5
UNION
SELECT 5,10
UNION
SELECT 6,10
GO
CREATE TABLE #Category 
(
ID INT,
ParentCategoryFK INT,
Name varchar(20)
)
GO
INSERT INTO #Category
SELECT 1,NULL,'Men'
UNION
SELECT 2,1,'Shoes'
UNION
SELECT 3,2,'Sport'
UNION
SELECT 4,2,'Casual'
UNION
SELECT 5,1,'Watches'
UNION
SELECT 6,NULL,'Women'
UNION
SELECT 7,6,'Shoes'
UNION
SELECT 8,7,'Sport'
UNION
SELECT 9,7,'Casual'
UNION
SELECT 10,6,'Watches'


GO

WITH Categories (CategoryName,CategoryID, [Level], SortOrder)  AS 
( 
    SELECT  Cat.Name,cat.id, 1 AS [Level], CONVERT(VARCHAR(MAX), ROW_NUMBER() OVER (order by cat.Name) ) AS SortOrder
    FROM #Category Cat 
    WHERE Cat.ParentCategoryFK IS NULL 

    UNION ALL

    SELECT CAT.Name,cat.ID, [Level] + 1, c2.SortOrder + CONVERT(VARCHAR(MAX), ROW_NUMBER() OVER (order by cat.Name)) 
    FROM #Category CAT 
    INNER JOIN Categories c2 ON cat.ParentCategoryFK = c2.CategoryID
) 
SELECT #Sale.Name, Categories.CategoryName, #Product.name,Categories.Level,Categories.SortOrder FROM
Categories 
LEFT JOIN
#ProductCategory ON #ProductCategory.CategoryID = Categories.CategoryID
LEFT JOIN
#Product ON #Product.ID = #ProductCategory.ProductID
LEFT JOIN
#Sale ON #Product.saleidfk = #Sale.ID
ORDER BY Categories.SortOrder, #Product.name

The pertinent points to notice are that to get the full hierarchy to make sense, you need the categories whether they have products or not. Also the varchar for the SortOrder allows the hierarchy to display in the correct order.

Nat
A: 

It's not particularly efficient, but if what you want to do is effectively "explode" the entire hierarchy and get the results in sequence from parent to leaf, something like this would do it:

WITH CategoryHierarchy AS
(
    SELECT
        ID, ParentCategoryIdFk, 0 AS Level,
        ROW_NUMBER() OVER (ORDER BY ID) AS SubTreeID
    FROM Category
    WHERE CategoryID IN
    (
        SELECT pc.CategoryID
        FROM Sale s
        INNER JOIN Product p
            ON p.saleidfk = s.id
        INNER JOIN ProductCategory pc
            ON pc.productid = p.id
        WHERE s.id = @SaleID
    )

    UNION ALL

    SELECT c.ID, c.ParentCategoryIdFk, h.Level + 1, h.SubTreeID
    FROM CategoryHierarchy h
    INNER JOIN Category c
        ON c.ID = h.ParentID
)
SELECT c.ID, c.ParentCategoryIdFk AS ParentID, c.Name
FROM CategoryHierarchy h
INNER JOIN Category c
    ON c.ID = h.ID
ORDER BY h.SubTreeID ASC, h.Level DESC

This should get you results similar to the following:

ID | ParentID | Name
---+----------+----------
1  |     NULL | Men
2  |        1 | Shoes
3  |        2 | Sport
---+----------+----------
1  |     NULL | Men
2  |        1 | Shoes
4  |        2 | Casual
---+----------+----------
1  |     NULL | Men
5  |        1 | Watches
---+----------+----------
6  |     NULL | Women
10 |        6 | Watches

Of course the actual results won't have separators like that, I've added those to make the meaning of the results clearer.

If you don't want it completely exploded like this, you can use another rownum to only return the first instance of each parent:

WITH CategoryHierarchy AS
(
    SELECT
        ID, ParentCategoryIdFk, 0 AS Level,
        ROW_NUMBER() OVER (ORDER BY ID) AS SubTreeID
    FROM Category
    WHERE CategoryID IN
    (
        SELECT pc.CategoryID
        FROM Sale s
        INNER JOIN Product p
            ON p.saleidfk = s.id
        INNER JOIN ProductCategory pc
            ON pc.productid = p.id
        WHERE s.id = @SaleID
    )

    UNION ALL

    SELECT c.ID, c.ParentCategoryIdFk, h.Level + 1, h.SubTreeID
    FROM CategoryHierarchy h
    INNER JOIN Category c
        ON c.ID = h.ParentID
),
Filter_CTE AS
(
    SELECT
        ID, Level, SubTreeID
        ROW_NUMBER() OVER (PARTITION BY ID ORDER BY SubTreeID) AS RowNum
    FROM CategoryHierarchy
)
SELECT c.ID, c.ParentCategoryIdFk AS ParentID, c.Name
FROM Filter_CTE f
INNER JOIN Category c
    ON c.ID = f.ID
WHERE f.RowNum = 1
ORDER BY f.SubTreeID ASC, f.Level DESC

...will give you results similar to:

ID | ParentID | Name
---+----------+----------
1  |     NULL | Men
2  |        1 | Shoes
3  |        2 | Sport
4  |        2 | Casual
5  |        1 | Watches
6  |     NULL | Women
10 |        6 | Watches

Note: Be careful with the second version, as it is not necessarily guaranteed to return results in hierarchical order. It just so happens that this version does because the IDs themselves are in hierarchical order. You can get around this limitation, but it would add a lot more complexity to this already-somewhat-complex query.

The second version does guarantee that a master category will always appear before any of its subcategories, which is fine if you plan to build a recursive data structure using a dictionary. It just might not be suitable for faster stack-based tree building or direct-to-user reporting. For those purposes, you would want to use the first version instead.

Aaronaught
Thanks I will check it out and start from this.
Martijn B