Hello,

I'm working with MySQL, and I'm in a situation where I need to select data from one table that matches an ID at any level of a parent -> child data hierarchy in another table.

Furthermore, I would like to solve this with a well-written SQL query rather than a recursive function in my PHP code, as this feature will be used quite a bit.

I did try searching, and I stumbled upon numerous similar problems (most of them resolved); however, none of them helped me.

To help illustrate the situation, here's my current setup:

table "articles":

  • article_id
  • category_id
  • ...

table "categories":

  • category_id
  • parent_id
  • ...

I need to select all the articles from "articles" where "articles.category_id" is, let's say, 10, but also all the articles from every category in the tree that "categories.category_id" 10 belongs to.

That means category 10 and all of its children (where 10 is the parent), and, going upwards, all of its parents (where 10 is the child).

Is this possible without a recursive PHP function?

Thank you.

+1  A: 

The most common patterns for storing hierarchical data in a relational database are the adjacency list and modified preorder tree traversal (aka nested sets). An alternative is to use a materialised path, which is basically a caching mechanism sitting on top of an adjacency list. See also this table for a comparison of pros and cons.
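
As a rough illustration of the nested set pattern, here is a minimal sketch in Python, using SQLite as a stand-in for MySQL. The `lft`/`rgt` columns, the sample rows and the function name `articles_in_lineage` are assumptions made up for this example; they are not part of the schema in the question.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    name        TEXT,
    lft         INTEGER,   -- hypothetical nested-set bounds
    rgt         INTEGER
);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, category_id INTEGER);

-- Root(1) > News(2) > Tech(10) > Gadgets(11); Root(1) > Sports(3)
INSERT INTO categories VALUES
    (1,  'Root',    1, 10),
    (2,  'News',    2, 7),
    (10, 'Tech',    3, 6),
    (11, 'Gadgets', 4, 5),
    (3,  'Sports',  8, 9);
INSERT INTO articles VALUES (100, 1), (101, 2), (102, 10), (103, 11), (104, 3);
""")


def articles_in_lineage(conn, category_id):
    """Article ids for a category, all of its descendants and all of its ancestors."""
    rows = conn.execute(
        """
        SELECT a.article_id
        FROM categories AS target
        JOIN categories AS c
          ON (c.lft BETWEEN target.lft AND target.rgt)   -- target and its descendants
          OR (target.lft BETWEEN c.lft AND c.rgt)        -- target and its ancestors
        JOIN articles AS a ON a.category_id = c.category_id
        WHERE target.category_id = ?
        ORDER BY a.article_id
        """,
        (category_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(articles_in_lineage(conn, 10))
```

The trade-off is that the single read query is cheap, but inserting or moving a category means renumbering `lft`/`rgt` values for part of the tree.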

troelskn
+2  A: 

It is not possible to fetch an entire tree in one query using the Adjacency List design you're using, given that you're using MySQL.

Some other brands of database support SQL extensions to handle this kind of design: Oracle, Microsoft SQL Server, IBM DB2, and PostgreSQL 8.4 (currently in beta).
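
One extension of this kind is the recursive common table expression (`WITH RECURSIVE` in PostgreSQL 8.4). The sketch below only illustrates the idea: it runs against SQLite through Python's `sqlite3` module as a stand-in, the sample rows and the name `articles_for_tree` are made up, and the exact syntax differs between database brands.

```python
import sqlite3

# In-memory SQLite database; the adjacency-list columns follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (category_id INTEGER PRIMARY KEY, parent_id INTEGER);
CREATE TABLE articles (article_id INTEGER PRIMARY KEY, category_id INTEGER);

-- 1 > 2 > 10 > 11, and 1 > 3 (sample data, assumed for illustration)
INSERT INTO categories VALUES (1, NULL), (2, 1), (10, 2), (11, 10), (3, 1);
INSERT INTO articles VALUES (100, 1), (101, 2), (102, 10), (103, 11), (104, 3);
""")

LINEAGE_SQL = """
WITH RECURSIVE
  -- Walk down: the start category, then children of anything already found.
  descendants(category_id) AS (
    SELECT category_id FROM categories WHERE category_id = :start
    UNION ALL
    SELECT c.category_id
    FROM categories AS c
    JOIN descendants AS d ON c.parent_id = d.category_id
  ),
  -- Walk up: the start category, then the parent of anything already found.
  ancestors(category_id, parent_id) AS (
    SELECT category_id, parent_id FROM categories WHERE category_id = :start
    UNION ALL
    SELECT c.category_id, c.parent_id
    FROM categories AS c
    JOIN ancestors AS a ON c.category_id = a.parent_id
  )
SELECT article_id
FROM articles
WHERE category_id IN (SELECT category_id FROM descendants)
   OR category_id IN (SELECT category_id FROM ancestors)
ORDER BY article_id
"""


def articles_for_tree(start):
    """Article ids in the start category, its descendants and its ancestors."""
    return [row[0] for row in conn.execute(LINEAGE_SQL, {"start": start})]


print(articles_for_tree(10))
```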

Other database designs exist that allow you to query trees more efficiently. This question has been addressed many times on StackOverflow, on blogs, and in articles.

You can also read "Trees and Hierarchies in SQL for Smarties" by Joe Celko, which goes into several such designs in depth.

Bill Karwin
Bill, I'd be interested to know if you find any flaws in the function I was talking about in my answer. I know you have much more experience working with databases than I do.
Ionuț G. Stan
@Ionut: Okay, see my comment to your answer.
Bill Karwin
Bill, thanks for taking the time to give it a look. I really appreciate it.
Ionuț G. Stan
A: 

I don't know how much this would help you, but I wrote a little function that generates a hierarchical tree using a single MySQL query. Basically, all the important logic is moved into PHP. My solution uses the adjacency list model and then makes use of PHP references to build a tree data structure from a flat one. Take a look at the gist below and see if you get some inspiration. I'd help you more, but there are some problems that I have to deal with at my job.
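
The general idea of turning a flat adjacency-list result into a tree can be sketched as follows. This is not the code from the gist (which is PHP); it is a minimal Python illustration in which dictionary references play the role of PHP references, and the row shape `(category_id, parent_id)` and the name `build_tree` are assumptions.

```python
def build_tree(rows):
    """Build a nested tree from a list of (category_id, parent_id) tuples.

    Returns the list of root nodes; each node is a dict with "id" and "children".
    """
    # First pass: create one node per row, indexed by id.
    nodes = {cid: {"id": cid, "children": []} for cid, _ in rows}
    roots = []
    # Second pass: attach each node to its parent via the shared reference.
    for cid, parent_id in rows:
        node = nodes[cid]
        parent = nodes.get(parent_id)
        if parent is None:
            # No known parent (NULL or missing row): treat as a root.
            roots.append(node)
        else:
            parent["children"].append(node)
    return roots


# Rows as they might come back from "SELECT category_id, parent_id FROM categories".
rows = [(1, None), (2, 1), (10, 2), (11, 10), (3, 1)]
tree = build_tree(rows)
print(tree)
```

Because every row is visited a fixed number of times, the work grows linearly with the size of the result set, but the whole table still has to fit in memory.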

http://gist.github.com/104357

Ionuț G. Stan
Okay, I read your code. How do you write a query that includes only nodes from the desired hierarchy? For example, suppose you're running Slashdot. You have millions of comments arranged in thousands of separate hierarchies, but stored in a single database table. How do you query only the comments belonging to the thread the user is currently viewing?
Bill Karwin
I suppose this is the weak point in my implementation. For large data sets it doesn't make much sense to fetch the entire table into memory, and my solution kind of assumes that you do. When I wrote it, it was intended for tables with small amounts of data (not sure what small really means), where I believe it is better to fetch everything in one query than in little chunks over multiple queries. Picking out a certain "thread" would then be done in PHP code.
Ionuț G. Stan
Whether it's better to fetch all the data at once, or bits at a time, depends on how *much* data there is. :-) One thing you could do to mitigate this is to store the "root_id" with each tree node, in addition to the "parent_id". That way you could query for all rows with the same "root_id" and be sure the results contain only the tree you want. You still might get one huge tree that exceeds PHP memory, but at least you won't fetch data for unrelated trees as well.
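
A minimal sketch of the "root_id" idea, again in Python with SQLite standing in for MySQL. The `comments` table, its columns, the index and the name `fetch_thread` are hypothetical and only serve the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    comment_id INTEGER PRIMARY KEY,
    parent_id  INTEGER,
    root_id    INTEGER NOT NULL   -- id of the top-level comment of the thread
);
CREATE INDEX comments_root ON comments (root_id);

INSERT INTO comments VALUES
    (1, NULL, 1), (2, 1, 1), (3, 2, 1),   -- thread rooted at 1
    (4, NULL, 4), (5, 4, 4);              -- thread rooted at 4
""")


def fetch_thread(root_id):
    """Fetch only the rows of one tree; nesting is rebuilt from parent_id afterwards."""
    return conn.execute(
        "SELECT comment_id, parent_id FROM comments "
        "WHERE root_id = ? ORDER BY comment_id",
        (root_id,),
    ).fetchall()


print(fetch_thread(1))
```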
Bill Karwin
A: 

This is possible to do in MySQL, but it takes a little effort. You'll have to write a function like this:

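CREATE FUNCTION hierarchy_connect_by_parent_eq_prior_id(value INT) RETURNS INT
NOT DETERMINISTIC
READS SQL DATA
BEGIN
        -- State is kept between calls in the session variables
        -- @id (current node), @level (its depth) and @start_with (root).
        DECLARE _id INT;
        DECLARE _parent INT;
        DECLARE _next INT;
        -- If a lookup finds no row, clear @id instead of raising an error.
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET @id = NULL;

        -- Continue from the node returned by the previous call.
        SET _parent = @id;
        SET _id = -1;

        IF @id IS NULL THEN
                RETURN NULL;
        END IF;

        LOOP
                -- Next child of _parent: the smallest id greater than _id.
                SELECT  MIN(id)
                INTO    @id
                FROM    categories
                WHERE   parent = _parent
                        AND id > _id;
                -- Return the child found, or NULL once back at the start node.
                IF @id IS NOT NULL OR _parent = @start_with THEN
                        SET @level = @level + 1;
                        RETURN @id;
                END IF;
                -- No child left: step up one level and look for the next sibling.
                SET @level := @level - 1;
                SELECT  id, parent
                INTO    _id, _parent
                FROM    categories
                WHERE   id = _parent;
        END LOOP;
END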

and use it in a query:

SELECT  id, parent, level
FROM    (
        SELECT  hierarchy_connect_by_parent_eq_prior_id(id) AS id, @level AS level
        FROM    (
                SELECT  @start_with := 0,
                        @id := @start_with,
                        @level := 0
                ) vars, categories 
        WHERE   @id IS NOT NULL
        ) ho
JOIN    categories hi
ON      hi.id = ho.id

See this entry in my blog for more detail:

Quassnoi
This looks very interesting, thank you for the effort! I have another project with a similar situation coming up and I'll be sure to give this a try.
Dolester