views:

181

answers:

3

I have a tree (nested categories) stored as follows:

CREATE TABLE `category` (
  `category_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `category_name` varchar(100) NOT NULL,
  `parent_id` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`category_id`),
  UNIQUE KEY `category_name_UNIQUE` (`category_name`,`parent_id`),
  KEY `fk_category_category1` (`parent_id`,`category_id`),
  CONSTRAINT `fk_category_category1` FOREIGN KEY (`parent_id`) REFERENCES `category` (`category_id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_spanish_ci

I need to feed my client-side language (PHP) with node information (child+parent) so it can build the tree in memory. I can tweak my PHP code but I think the operation would be way simpler if I could just retrieve the rows in such an order that all parents come before their children. I could do that if I knew the level for each node:

SELECT category_id, category_name, parent_id
FROM category
ORDER BY level -- No `level` column so far :(

Can you think of a way (view, stored routine or whatever...) to calculate the node level? I guess it's okay if it's not real-time and I need to recalculate it on node modification.

First update: progress so far

I've written these triggers based on feedback by Amarghosh:

DROP TRIGGER IF EXISTS `category_before_insert`;

DELIMITER //

CREATE TRIGGER `category_before_insert` BEFORE INSERT ON `category` FOR EACH ROW BEGIN
    IF NEW.parent_id IS NULL THEN
        SET @parent_level = 0;
    ELSE
        SELECT level INTO @parent_level
        FROM category
        WHERE category_id = NEW.parent_id;
    END IF;

    SET NEW.level = @parent_level+1;
END//

DELIMITER ;


DROP TRIGGER IF EXISTS `category_before_update`;

DELIMITER //

CREATE TRIGGER `category_before_update` BEFORE UPDATE ON `category` FOR EACH ROW BEGIN
    IF NEW.parent_id IS NULL THEN
        SET @parent_level = 0;
    ELSE
        SELECT level INTO @parent_level
        FROM category
        WHERE category_id = NEW.parent_id;
    END IF;

    SET NEW.level = @parent_level+1;
END//

DELIMITER ;

It seems to work fine for insertions and modifications. But it doesn't work for deletions: MySQL Server does not launch triggers when the rows are updated from ON UPDATE CASCADE foreign keys.

The first obvious idea is to write a new trigger for deletion; however, a trigger on table categories is not allowed to modify other rows on this same table:

DROP TRIGGER IF EXISTS `category_after_delete`;

DELIMITER //

CREATE TRIGGER `category_after_delete` AFTER DELETE ON `category` FOR EACH ROW BEGIN
    /*
     * Raises an error, see below
     */
    UPDATE category SET parent_id=NULL
    WHERE parent_id = OLD.category_id;
END//

DELIMITER ;

Error:

Grid editing error: SQL Error (1442): Can't update table 'category' in stored function/trigger because it is already used by statement which invoked this stored function/trigger.

Second update: working solution (unless proved wrong)

My first attempt was pretty sensible but I found a problem I could not manage to solve: when you launch a series of operations from a trigger, MySQL will not allow to alter other lines from the same table. Since node deletions require to adjust the level of all descendants, I had hit a wall.

In the end, I changed the approach using code from here: rather than correcting individual levels when a node change, I have code to calculate all levels and I trigger it on every edit. Since it's a slow calculation and fetching data requires a very complex query, I cache it into a table. In my case, it's an acceptable solution since editions should be rare.

1. New table for cached levels:

CREATE TABLE `category_level` (
  `category_id` int(10) NOT NULL,
  `parent_id` int(10) DEFAULT NULL, -- Not really necesary
  `level` int(10) NOT NULL,
  PRIMARY KEY (`category_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_spanish_ci

2. Helper function to calculate levels

If I really got a grasp on how it works, it doesn't really return anything useful by itself. Instead, it stores stuff in session variables.

CREATE FUNCTION `category_connect_by_parent_eq_prior_id`(`value` INT) RETURNS int(10)
    READS SQL DATA
BEGIN
    DECLARE _id INT;
    DECLARE _parent INT;
    DECLARE _next INT;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET @category_id = NULL;

    SET _parent = @category_id;
    SET _id = -1;

    IF @category_id IS NULL THEN
        RETURN NULL;
    END IF;

    LOOP
        SELECT  MIN(category_id)
        INTO    @category_id
        FROM    category
        WHERE   COALESCE(parent_id, 0) = _parent
            AND category_id > _id;
        IF @category_id IS NOT NULL OR _parent = @start_with THEN
            SET @level = @level + 1;
            RETURN @category_id;
        END IF;
        SET @level := @level - 1;
        SELECT  category_id, COALESCE(parent_id, 0)
        INTO    _id, _parent
        FROM    category
        WHERE   category_id = _parent;
    END LOOP;
END

3. Procedure to launch the recalculation process

It basically encapsulates the complex query that retrieves the levels aided by the helper function.

CREATE PROCEDURE `update_category_level`()
    SQL SECURITY INVOKER
BEGIN
    DELETE FROM category_level;

    INSERT INTO category_level (category_id, parent_id, level)
    SELECT hi.category_id, parent_id, level
    FROM (
        SELECT category_connect_by_parent_eq_prior_id(category_id) AS category_id, @level AS level
        FROM (
            SELECT  @start_with := 0,
                @category_id := @start_with,
                @level := 0
            ) vars, category
        WHERE @category_id IS NOT NULL
        ) ho
    JOIN category hi ON hi.category_id = ho.category_id;
END

4. Triggers to keep cache table up-to-date

CREATE TRIGGER `category_after_insert` AFTER INSERT ON `category` FOR EACH ROW BEGIN
    call update_category_level();
END

CREATE TRIGGER `category_after_update` AFTER UPDATE ON `category` FOR EACH ROW BEGIN
    call update_category_level();
END

CREATE TRIGGER `category_after_delete` AFTER DELETE ON `category` FOR EACH ROW BEGIN
    call update_category_level();
END

5. Known issues

  • It's pretty suboptimal if nodes are altered frequently.
  • MySQL does not allow transactions or table locking in triggers and procedures. You must take care of these details where you edit nodes.
+2  A: 

If there won't be any cycles (if it'd always be a tree and not a graph), you can have a level field that is set to zero (top most) by default and a stored procedure that updates the level to (parent's level + 1) whenever you update the parent_id.

CREATE TRIGGER setLevelBeforeInsert BEFORE INSERT ON category
FOR EACH ROW
BEGIN
IF NEW.parent_id IS NOT NULL THEN
SELECT level INTO @pLevel FROM category WHERE id = NEW.parent_id;
SET NEW.level = @pLevel + 1;
ELSE 
SET NEW.level = 0;
END IF;
END;
Amarghosh
Good strategy, although I'm curious to see the code of said stored procedure, and how to make it fire when the parent_id updates.
gnarf
@gnarf You can do it with triggers. Something like `CREATE TRIGGER AFTER INSERT/UPDATE ON category` and so on - I'm not sure about the syntax - I'll try and update in the evening once I get time to explore it (if no one comes with the code by then).
Amarghosh
I've been working on this line and I even tried to write a set of two triggers (ON INSERT and ON UPDATE) but the manual gives almost no hints about the syntax: my attempts so far won't even compile :(
Álvaro G. Vicario
@Álvaro I've updated with some code - see if it works.
Amarghosh
If it works, you can adopt this to `BEFORE UPDATE`. You should also take care of `AFTER DELETE` based on your business constraints - does children of deleted parent become orphaned or become children of its parent if any?
Amarghosh
Thanks for your ideas. Orphaned children become first level nodes by now. I've updated the question.
Álvaro G. Vicario
It seems you cannot update a table from a trigger originated while updating the same table (understandably, mysql puts a lock on the table). But \`ON UPDATE CASCADE\` will anyway set the \`parent_id\` to \`NULL\`. So if an entry has a valid level (say 3) but its \`parent_id\` is null, it means that it is in fact a top level entry. Some tweaking might be required in the code - but I think you can rely on this behavior. If parent is null, it is the root no matter what the value of level is. – [Amarghosh](http://stackoverflow.com/users/165297/amarghosh)
ccomet
Thanks @ccomet :)
Amarghosh
When a node is removed and its children become top level nodes, it's clear that their new level is always 1. But you still need to fix the level of all grandchildren. I'm beginning to question the whole approach; perhaps that's why most tutorials that use this data model use stored procedures to insert, move and remove nodes. I wouldn't mind having to run a stored procedure to update level info for all nodes but I can't figure out an algorithm I can write in MySQL's dialect.
Álvaro G. Vicario
+1  A: 

There's an excellent series of articles here on Hierarchical Queries in MySQL that includes how to identify level, leaf nodes, loops in the hierarchy, etc.

Mark Baker
I managed to get what so far looks like a working solution using code from the page you link. (It was all in the first page.) I'll try to update the question tomorrow to reflect my changes. It's not as elegant as other approaches but... :)
Álvaro G. Vicario
Glad to know you're finding it useful. I'm primarily an Oracle user and make extensive use of hierarchical queries there, and I've been looking at the possibilities of porting some of my Oracle work to MySQL... which is how I ran across those pages in the first place. I'll definitely be interested in knowing how it all works out.
Mark Baker
A: 

No level column so far :(

Hmm * shrug *
I'd just made this level field manually.
Say, like Materialized path, with just one update after insert and without all these fancy triggers.
A field which is going to be like 000000100000210000022 for the 3-rd level for example

so it can build the tree in memory.

if you going to get whole table into PHP, I see no problem here. A little recursive function can give you your nested arrays tree.

I can tweak my PHP code but I think the operation would be way simpler

well, well.
The code you've got so far doesn't seem to me "way simple" :)

Col. Shrapnel
«The code you've got so far doesn't seem to me "way simple"» — It is simpler, in the sense that I didn't have to write it myself :)
Álvaro G. Vicario
@Álvaro thus you understand nothing of it and can't debug or repair.
Col. Shrapnel
I regret to inform that already did that. The original code was not using NULL for root nodes.
Álvaro G. Vicario