views:

52

answers:

2

I'm using the ancestry gem to help organise my app's tree structure in the database. It basically writes a childs ancestor information to a special column called 'ancestry'. The ancestry column for a particular child might look like '1/34/87' where the parent of this child is 87, and then 87's parent is 34 and 34's is 1.

It seems possible that we could select rows from this table each with a subquery that checks all the ancestors to see if a certain attribute it set. E.g. in my app you can hide an item and its children just by setting the parent element's visibility column to 0.

I want to be able to find all the items where none of their ancestors are hidden. I tried converting the slashes to comma's with the REPLACE command but IN required a set of comma separated integers rather than one string with comma separated string numbers.

It's funny, because I can do this query in two steps, e.g. retrieve the row, then take its ancestry column, split out the id's and make another query that checks that the id is IN that set of id's and that visibility isn't ever 0 and whala! But joining these into one query seems to be quite a task. Much searching has shown a few answers but none really do what I want.

SELECT * FROM t1 WHERE id = 99;

99's ancestry column reads '1/34/87'

SELECT * FROM t1 WHERE visibility = 0 AND id IN (1,34,87);

kind of backwards, but if this returns no rows then the item is visible.

Has anyone come across this before and come up with a solution. I don't really want to go the stored procedure route. It's for a rails app.

+1  A: 

What you might want to do is create a Split function, if you do not already have one (Split a Delimited String in SQL ) and then use that as the IN selection.

There is also another way, but it might degrade performance on large tables. Something like

SELECT  *
FROM    Table t INNER JOIN
        Table tParents      
        ON  (   t.Path LIKE CAST(tParents.ID AS VARCHAR(20)) + '/%'
                OR  t.Path LIKE +'%/' + CAST(tParents.ID AS VARCHAR(20)) + '/%'
                OR  t.Path LIKE +'%/' + CAST(tParents.ID AS VARCHAR(20)))

WHERE   t.ID = 99
AND     tParents.Visible = 0
astander
Thanks for that. I'm trying to keep away from stored functions, and as that page says "This example is for illustrative purposes only, as this kind of thing is best left to procedural languages." I found that one before :D
Brendon Muir
@Brendon - if you accept the advice "...this kind of thing is best left to procedural languages" then do it on the application side. Second best is to do it in the stored function. In the given example it is not such a sin - you are not looping through all the records of a large result set updating them one by one unnecessarily.
Unreason
Thanks Unreason, I'll look into this. I'll accept this answer as it seems to be the best option out there at this point :)
Brendon Muir
A: 

If you insist on getting away from stored/procedural why not switch to nested sets from your materialized paths approach?

Otherwise, do the two queries from the application side (or use stored procedure as suggested by astander).

Good links for hierarchies in SQL here

EDIT: It seems that you are storing information about the state of the tree control in the database.

Assuming that this is really justified and that you need to store the visibility in the database you might investigate the following scenarios (these are ideas, not immediate solutions):

  1. open a cursor/recordset/whatever-row-based-approach-is-called-in-your-framework and pass that to the tree control so that the number of updates and fetches from the database are related to the actions on the tree (making branches visible, updating visibility, hiding branches, etc). In this case (depending on the framework) you don't have to preselect the proper elements (nor issue select statement every time user closes or opens a branch).

  2. update the visibility in the database for all children. it seems that you are updating visibility only on a node that is expanded/collapsed (if you need to preserve visibility of collapsed branches and leafs then you might have two fields; this is not elegant, but I would test this option, too)

  3. investigate nested sets again; with nested sets required queries might become faster. Also it becomes a bit easier to write SQL (here's SQL that returns nodes that have all parents visible, assuming visibility is tinyint(1); BIT_AND will do aggregate AND on all the parents in a single query)

    SELECT node.name AS name
    FROM t1 AS node,
    t1 AS parent
    WHERE node.visibility = 1 AND node.lft BETWEEN parent.lft AND parent.rgt
    GROUP BY node.name
    HAVING BIT_AND(parent.visibility) = 1
    ORDER BY node.lft

(this is tested, I took example from here, and added visibility)

Also, when you test and benchmark each solution do not forget to benchmark all the operations (selecting visible branches, opening hidden branches, marking node invisible, etc..).

Unreason
Lol, just tried to comment and lost the whole thing. I migrated from nested sets to materialised paths :D The reason this needs to be in pure SQL is that it's a Sphinx query used to populate the index. I'm wanting to eliminate non visible descendants. Thanks for the link, I'll take a good look at it :)
Brendon Muir
What's the problem in getting only visible elements in the first place? Is it that there could be a node that is marked invisible in the chain and you want to drop all the nodes which are below it?
Unreason
Hi Unreason, you're exactly right. I need to eliminate invisible items and their descendants even though the descendants aren't explicitly marked as invisible :)
Brendon Muir
I updated the answer, check out the nested sets SQL (same approach can be used with astander/materialized path structure with main difference that in that case you have to use 3 ORed LIKE expressions to select node-parent pairs, as opposed to two integer comparisons here; on the other hand nested sets require maintenance).
Unreason
Thanks Unreason, I'll take a good look at that. Our nested set implementation (using awesome_nested_set for rails) was getting corrupted all the time when two updates happened too close to each other. Despite all sorts of locking etc... I couldn't track down the problem so I decided to go to the materialised path approach which has worked very well so far.
Brendon Muir