tags:

views:

1369

answers:

2

I have a set of data that models a hierarchy of categories. A root category contains a set of top-level categories. Each top-level category contains a set of sub-categories.

Each sub category has a set of organizations. A given organization can appear in multiple sub categories.

The leaf nodes of this hierarchy are organizations. An organization can potentially appear in multiple sub-categories.

The data is stored in three SQL tables:

organizations
organization_id organization_name
1               Org A
2               Org B
3               Org C
4               Org D
5               Org E
6               Org F

categories
category_id parent_id category_name
0           NULL      Top Level Category
1           0         First Category
2           0         Second Category
3           1         Sub Category A
4           1         Sub Category B
5           1         Sub Category C
6           2         Sub Category D

organizations_categories -- Maps organizations to sub_categories
organization_id category_id
1               3
2               3
2               6
3               4
4               4
5               4
6               5
6               4
7               6
8               6

I would like to be able to select a list of all unique organizations under a given category or sub-category.

The way I'm doing it right now involves first figuring out which sub categories have been requested and then looping through each sub_category in code and performing a select to get all organizations mapped to that category. The results of each select are appended to an array. This array contains duplicates whenever an organization appears in multiple sub categories.

I would love to replace this kludge with a query that can efficiently select a list of distinct organizations given an id of one of the categories in the hierarchy.

I am devloping this solution using PHP and MySQL.

Thanks for your time and suggestions.

+4  A: 

Assuming that your hierarchy is always exactly 3 levels deep:

SELECT DISTINCT
     O.organization_id,
     O.organization_name
FROM
     Categories CAT
INNER JOIN Categories SUB ON
     SUB.parent_id = CAT.category_id
INNER JOIN Category_Organizations CO ON
     CO.category_id = SUB.category_id
INNER JOIN Organizations O ON
     O.organization_id = CO.organization_id
WHERE
     CAT.category_id = @category_id

You can modify that by one level to allow you to pass a sub category id. If you don't know at the time whether or not you have a category id or a sub category id then you can do the following:

SELECT DISTINCT
     O.organization_id,
     O.organization_name
FROM
     Categories CAT
LEFT OUTER JOIN Categories SUB ON
     SUB.parent_id = CAT.category_id
INNER JOIN Category_Organizations CO ON
     CO.category_id IN (CAT.category_id, SUB.category_id)
INNER JOIN Organizations O ON
     O.organization_id = CO.organization_id
WHERE
     CAT.category_id = @category_id

If your hierarchy may have an unknown number of levels (or you think it might in the future) then check out Joe Celko's Trees and Hierarchies in SQL for Smarties for alternative ways to model a hierarchy. It's probably a good idea to do that anyway.

Tom H.
That's it. Great solution. Thank you very kindly.
GloryFish
A: 

Not sure if your data model will allow it, but you can use a single index column and a Binary Tree to easily store this information in a single 'OrganizationTree' table. Also has the benefit you use a single query with no modifications to search at the category, subcategory, or organization levels (E.g. give me all results of X subcategory)

Hope this helps.

Adam.