This is the query that you asked for:
    -- By using LEFT JOINs you will be able to read any record,
    -- even one with a missing parent or grandparent.
    SELECT
        child.id,
        child.name,
        parent.id,
        parent.name,
        gparent.id,
        gparent.name
    FROM
        some_table child
        LEFT JOIN some_table parent ON
            parent.id = child.parent_id
        LEFT JOIN some_table gparent ON
            gparent.id = child.grandparent_id
    WHERE
        child.id = 3
However, I would add that the redundancy of having a grandparent_id field does not sound right to me.
Your table should be just:
    id  name     parent_id
    1   Milton   NULL
    2   Year 3   1
    3   Class A  2
Notice that if I know that 1 is the parent of 2, I don't need to repeat that same information on record 3.
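As a rough sketch, the three-row table above could be created like this. The column types and the self-referencing foreign key are my own assumptions; the question only gives the table name some_table and the sample rows.

```sql
-- Assumed schema: types and constraints are illustrative, not from the question.
CREATE TABLE some_table (
    id        INTEGER PRIMARY KEY,
    name      VARCHAR(50) NOT NULL,
    parent_id INTEGER NULL,
    -- A row's parent must itself be a row in this same table.
    FOREIGN KEY (parent_id) REFERENCES some_table (id)
);

-- The sample rows, inserted parent-first so the foreign key is satisfied.
INSERT INTO some_table (id, name, parent_id) VALUES (1, 'Milton', NULL);
INSERT INTO some_table (id, name, parent_id) VALUES (2, 'Year 3', 1);
INSERT INTO some_table (id, name, parent_id) VALUES (3, 'Class A', 2);
```

With the foreign key in place, the database itself rejects a parent_id that points at a row that does not exist.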
In this last case, your select could be like this:
    SELECT
        child.id,
        child.name,
        parent.id,
        parent.name,
        gparent.id,
        gparent.name
    FROM
        some_table child
        LEFT JOIN some_table parent ON
            parent.id = child.parent_id
        LEFT JOIN some_table gparent ON
            gparent.id = parent.parent_id -- See the difference?
    WHERE
        child.id = 3
The query would work the same, and you would also have a more "normalized" database.
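If you ever need more than two levels, the same parent_id column is still enough. Here is a minimal sketch using a recursive common table expression, assuming your database supports WITH RECURSIVE (for example PostgreSQL, SQLite, or MySQL 8.0 and later). The names ancestors and depth are made up for this example.

```sql
-- Walk from record 3 up to the root, one parent_id hop at a time.
WITH RECURSIVE ancestors (id, name, parent_id, depth) AS (
    -- Anchor: the record we start from.
    SELECT id, name, parent_id, 0
    FROM some_table
    WHERE id = 3

    UNION ALL

    -- Recursive step: fetch the parent of the row found in the previous step.
    SELECT t.id, t.name, t.parent_id, a.depth + 1
    FROM some_table t
    JOIN ancestors a ON t.id = a.parent_id
)
SELECT id, name, depth
FROM ancestors
ORDER BY depth;
```

The recursion stops by itself when it reaches a row whose parent_id is NULL, because no row can match the join at that point.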
Edit: This is pretty basic stuff, but I guess it is relevant to this answer...
This kind of denormalization (i.e. having both parent_id and grandparent_id on the same record) should not be used because it allows the database to become inconsistent.
For instance, let's suppose that a new record is inserted:
    id  name         parent_id  grandparent_id
    1   Milton       NULL       NULL
    2   Year 3       1          NULL
    3   Class A      2          1
    4   Invalid Rec  2          3
It doesn't make any sense, right? Record 4 states that 3 is its grandparent, so 3 should be the parent of record 2. But that's not what the other records say: record 2 lists 1 as its parent, and record 3 lists 2 as its own parent. Which record is right?
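If you are stuck with the denormalized layout, a query along these lines could at least find such rows. It is only a sketch, assuming the table has both parent_id and grandparent_id columns as in the example above. The aliases stored_grandparent_id and derived_grandparent_id are names I made up.

```sql
-- List rows whose stored grandparent disagrees with their parent's own parent.
SELECT
    child.id,
    child.grandparent_id AS stored_grandparent_id,
    parent.parent_id     AS derived_grandparent_id
FROM
    some_table child
    JOIN some_table parent ON
        parent.id = child.parent_id
WHERE
    -- NULL-safe "differs" test, written with portable operators only.
    child.grandparent_id <> parent.parent_id
    OR (child.grandparent_id IS NULL AND parent.parent_id IS NOT NULL)
    OR (child.grandparent_id IS NOT NULL AND parent.parent_id IS NULL);
```

Against the four sample rows this flags only record 4, whose stored grandparent is 3 while its parent's parent is 1. Note that rows with no parent at all are skipped by the inner join, so a grandparent_id set on a parentless row would need a separate check.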
You may think this is an odd error, and that your database will never end up like this. But my experience says otherwise: if an error can happen, it eventually will. Denormalization should be avoided, not just because some database guru says so, but because it really does invite inconsistencies and makes maintenance harder.
Of course, denormalized databases may be faster. But, as a rule of thumb, you should think about performance only after your system is ready for production, and after you have confirmed, by means of some automated or empirical test, that a bottleneck exists. Believe me, I have seen much worse design choices justified by wrong performance expectations...