views:

180

answers:

6

The idea is simple - I have two tables, categories and products.

Categories:

id | parent_id | name               | count
1    NULL        Literature           6020
2    1           Interesting books    1000
3    1           Horrible books       5000
4    1           Books to burn        20
5    NULL        Motorized vehicles   1000
6    5           Cars                 999
7    5           Motorbikes           1
...

Products:

id | category_id | name
1    1             Cooking for dummies
2    3             Twilight saga
3    5             My grandpa's car
...

Now while displayed, the parent category contains all the products of all the children categories. Any category may have children categories. The count field in the table structure contains (or at least I want it to contain) count of all products displayed in this particular category. On the front-end, I select all subcategories with a simple recursive function, however I'm not so sure how to do this in a SQL procedure (yes it has to be a SQL procedure).The tables contain about a hundread categories of any kind and there are over 100 000 products.
Any ideas?

+2  A: 

Take a look at this article on managing heirachical trees in MySQL.

It explains the disadvantages to your current method and some more optimal solutions.

See especially the section towards the ended headed 'Aggregate Functions in a Nested Set'.

David Caunt
This is really nice, but I can't change the metod of storing the data, and also, after a quick look through, it doesn't explain what to do with an unknown depth of a tree.
cypher
No need to worry about the depth with the ressource from above. This is handled within the way you describe the hierarchy.
DrColossos
+1 BTW to point to one of my favourite articles on hierarchy
DrColossos
It's a great article for sure!
David Caunt
Nested Sets _are_ nice for seldomly changing data, but as soon as you're often shifting nodes to other places they become a pain very quickly...
Wrikken
Not a problem in this case, but a very valid point. The article doesn't explain how to do it.
David Caunt
A: 

What you want is a common table expression. Unfortunately it looks like mysql doesn't support them.

Instead you will probably need to use a loop to keep selecting deeper trees.

I'll try whip up an example. To clarify, you're looking to be able to call the procedure with an input of say '1' and get back all the sub categories and subsub categories (and so on) with 1 as an eventual root? like

id parent
1  null
2  1
3  1
4  2

?

Edited:

This is what I came up with, it seems to work. Unfortunately I don't have mysql, so I had to use sql server. I tried to check everythign to make sure it will work with mysql but there may still be issues.

declare @input int
set @input = 1

--not needed, but informative
declare @depth int
set @depth = 0

--for breaking out of the loop
declare @break int
set @break = 0

--my table '[recursive]' is pretty simple, the results table matches it
declare @results table
(
    id int,
    parent int,
    depth int
)

--Seed the results table with the root node
insert into @results
select id, parent, @depth from [recursive]
where ID = @input

--Loop through, adding notes as we go
set @break = 1
while (@break > 0)
begin
    set @depth=@depth+1 --Increase the depth counter each loop

    --This checks to see how many rows we are about to add to the table.
    --If we don't add any rows, we can stop looping
    select @break = count(id) from [recursive]
    where parent in 
    (
        select id from @results
    )
    and id not in --Don't add rows that are already in the results
    (
        select id from @results
    )

    --Here we add the rows to the results table
    insert into @results
    select id, parent, @depth from [recursive]
    where parent in
    (
        select id from @results
    )
    and id not in --Don't add rows that are already in the results
    (
        select id from @results
    )
end

--Select the results and return
select * from @results
david
Yes, that seems about right, unfortunately I don't know what tree depth can I expect. It can be three, it can be three hundread...
cypher
Edited with a loop that might work
david
+4  A: 

Bill Karwin made some nice slides about hierachical data, and the current Adjacency Model certainly as pros, but it's not very suited for this (getting a whole subtree).

For my Adjacency tables, I solve it by storing / caching the path (possibly in a script, or in a 'before update trigger'), on change of parent_id id, a new path-string is created. Your current table would look like this:

id | parent_id | path    | name               | count
1    NULL        1         Literature           6020
2    1           1:2       Interesting books    1000
3    1           1:3       Horrible books       5000
4    1           1:4       Books to burn        20
5    NULL        5         Motorized vehicles   1000
6    5           5:6       Cars                 999
7    5           5:7       Motorbikes           1

(choose any delimiter not found in the id you like)

So, now to get all products from a category + subcategories:

SELECT p.*
FROM categories c_main
JOIN categories c_subs
ON c_subs.id = c_main.id 
   OR c_subs.path LIKE CONCAT(c_main,':%')
JOIN products p
ON p.category_id = c_subs.id
WHERE c_main.id = <id>
Wrikken
This seems quite reasonable, unfortunately none of these answers answered my original question, so it's been awarded +50. Thank you all.
cypher
Wrikken
+2  A: 

There's a whole chapter in "SQL Antipatterns Avoiding the Pitfalls of Database Programming" by Bill Karwin about managing hierachical data in SQL.

alt text

chris
It's a shame that MySQL doesn't support recursive queries
Night Shade
A: 

Try to get rid of the hierarchy that is implemented that way. Recursion in stored procedures aren't nice, and for example, on MS SQL they fail after 64th level.

Also, to get for example everything from some category and it's subcategories, you will have to recursively go all the way down, which is impractical for SQL - nevertheless to say slow.

Instead, use this; create category_path field, and make it look like:

category_path name
1/            literature
1/2/          Interesting books
1/3/          Horrible books
1/4/          Books to burn
5/            Motorized vehicles
5/6/          Cars
5/7/          Motorbikes

By using that method, you will be able to SELECT categories and subcategories very fast. Updates will be slow, but I guess that they CAN be slow. Also, you can keep your old child-parent relationship fields, to help you maintain your tree structure.

For example, getting all cars, without any recursion, will be:

SELECT * FROM ttt WHERE category_path LIKE '5/%'
Daniel Mošmondor
A: 

As you havent accepted an answer yet i thought i'd post my method for handling trees in mysql and php. (single db call to non recursive sproc)

Full script here : http://pastie.org/1252426 or see below...

Hope this helps :)

PHP

<?php
$conn = new mysqli("localhost", "foo_dbo", "pass", "foo_db", 3306);

$result = $conn->query(sprintf("call product_hier(%d)", 3));

echo "<table border='1'>
        <tr><th>prod_id</th><th>prod_name</th><th>parent_prod_id</th>
        <th>parent_prod_name</th><th>depth</th></tr>";

while($row = $result->fetch_assoc()){
    echo sprintf("<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>", 
        $row["prod_id"],$row["prod_name"],$row["parent_prod_id"],
        $row["parent_prod_name"],$row["depth"]);
}
echo "</table>";
$result->close();
$conn->close();
?>

SQL

drop table if exists product;

create table product
(
prod_id smallint unsigned not null auto_increment primary key,
name varchar(255) not null,
parent_id smallint unsigned null,
key (parent_id)
)engine = innodb;


insert into product (name, parent_id) values
('Products',null), 
   ('Systems & Bundles',1), 
   ('Components',1), 
      ('Processors',3), 
      ('Motherboards',3), 
        ('AMD',5), 
        ('Intel',5), 
           ('Intel LGA1366',7);


delimiter ;

drop procedure if exists product_hier;

delimiter #

create procedure product_hier
(
in p_prod_id smallint unsigned
)
begin

declare v_done tinyint unsigned default 0;
declare v_depth smallint unsigned default 0;

create temporary table hier(
 parent_id smallint unsigned, 
 prod_id smallint unsigned, 
 depth smallint unsigned default 0
)engine = memory;

insert into hier select parent_id, prod_id, v_depth from product where prod_id = p_prod_id;

/* http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */

create temporary table tmp engine=memory select * from hier;

while not v_done do

    if exists( select 1 from product p inner join hier on p.parent_id = hier.prod_id and hier.depth = v_depth) then

        insert into hier 
            select p.parent_id, p.prod_id,  v_depth + 1 from product p 
            inner join tmp on p.parent_id = tmp.prod_id and tmp.depth = v_depth;

        set v_depth = v_depth + 1;          

        truncate table tmp;
        insert into tmp select * from hier where depth = v_depth;

    else
        set v_done = 1;
    end if;

end while;

select 
 p.prod_id,
 p.name as prod_name,
 b.prod_id as parent_prod_id,
 b.name as parent_prod_name,
 hier.depth
from 
 hier
inner join product p on hier.prod_id = p.prod_id
inner join product b on hier.parent_id = b.prod_id
order by
 hier.depth, hier.prod_id;

drop temporary table if exists hier;
drop temporary table if exists tmp;

end #

delimiter ;

call product_hier(3);

call product_hier(5);
f00