ansaurus

Question

Retrieving parent value from mysql self join

Answer 1

+1 A:

This is the query that you asked for:

# By using LEFT JOINs you will be able to read any record,
# even one with missing parent/grand-parent...
SELECT
    child.id,
    child.name,
    parent.id,
    parent.name,
    gparent.id,
    gparent.name
FROM
    some_table child
    LEFT JOIN some_table parent ON
        parent.id = child.parent_id
    LEFT JOIN some_table gparent ON
        gparent.id = child.grandparent_id
WHERE
    child.id = 3

BUT I would also add that the redundancy of having a field grandparent_id does NOT sound right to me...

Your table should be just:

id     name       parent_id
 1     Milton     NULL
 2     Year 3     1
 3     Class A    2

Notice that, if I know that 1 is the parent of 2, I don't need repeat that same information again on record 3...

In this last case, your select could be like this:

SELECT
    child.id,
    child.name,
    parent.id,
    parent.name,
    gparent.id,
    gparent.name
FROM
    some_table child
    LEFT JOIN some_table parent ON
        parent.id = child.parent_id
    LEFT JOIN some_table gparent ON
        gparent.id = parent.parent_id  -- See the difference?
WHERE
    child.id = 3

The query would work the same, and you would also have more "normalized" database.

Edit: This is pretty basic stuff, but I guess it is relevant to this answer...

This kind of denormalization (i.e. to have both parent_id and grandparent_id on the same record) should not be used because it allows the database to be inconsistent.

For instance, let's suppose that a new record is inserted:

id     name            parent_id   grandparent_id
 1     Milton          NULL        NULL
 2     Year 3          1           NULL
 3     Class A         2           1
 4     Invalid Rec     2           3

It doesn't make any sense, right? Record 4 is stating that 3 is its grandparent. So, 3 should be the parent of record 2. But that's not what is stated on record 3 itself. Which record is right?

You may think this is an odd error, and that your database will never become like this. But my experience says otherwise - if an error may happen, it will eventually. Denormalization should be avoided, not just because some database guru says so, but because it really increases inconsistencies, and makes maintenance harder.

Of course, denormalized databases may be faster. But, as a rule of thumb, you should think about performance after your system is ready for production, and after you have perceived, by the means of some automated or empirical test, that a bottleneck exists. Believe me, I have seen much worse design choices being justified by wrong performance expectations before...

rsenna 2010-10-01 22:56:16

I tend to agree. The denormalization can save a join though as in my answer.

Martin Smith 2010-10-01 23:03:46

The reason for the parent and grandparent field is because reading on the net, people have said that recursion (just parent_id) is not very efficient, and the most efficient way (nested sets) is complex. So this "lineage" type structure is a happy medium. In my case I know that there will only be a max of 2 levels of hierarchy.

Dan Spencer 2010-10-02 07:57:58

Also, with just parent_id, is it easy to select downwards on the hierarchy? e.g. if I want to get all of the child/grandchild records of record 1, how could this be achieved in the "normalized" model?

Dan Spencer 2010-10-02 08:15:38

@Dan Spencer: uh... in exactly the same way? OK, there is a gotcha in your question - record 1 does NOT have a parent. In that case, you should change the INNER JOIN for a LEFT JOIN (or a LEFT OUTER JOIN, like Martin did).Again, please understand that denormalized databases are common, but IMO only are justified when performance is an issue. And many times this is a high price to pay for - because denormalization allows errors, and when a error may happen, it will happen. I will edit my answer and explain it better.

rsenna 2010-10-03 14:14:18

@rsenna Thanks for your comments on this... I have now changed db structure as you suggested, only problem now is that mysql returns an empty result if there is no grandchild group - not all of the time I will have parent, child and grandchild. sometimes it will just be parent, child groups or even just parent. Is there a way i can get mysql to fail gracefully and just return to whatever level it can find? Thanks once again for all of your help.

Dan Spencer 2010-10-04 13:52:21

@dan If I understood it right, you're just missing that `LEFT JOIN` I spoke before. I'm gonna update the example.

rsenna 2010-10-05 16:53:21

Answer 2

+1 A:

SELECT t1.name, 
    MAX(CASE WHEN t2.id = t1.parent_id then t2.name end) as Parent,
    MAX(CASE WHEN t2.id = t1.grandparent_id then t2.name end) as GrandParent
FROM your_table t1
LEFT OUTER JOIN your_table t2 ON t2.id IN (t1.parent_id, t1.grandparent_id)
WHERE t1.id = 3
group by t1.id,  t1.name

Martin Smith 2010-10-01 22:56:45

ansaurus

tags:

views:

answers:

Retrieving parent value from mysql self join

related questions