I'm considering using PostgreSQL's Ltree module in my application to help with threaded comments. I've been eying it for a while to use for threaded comments. I figure it would help with cases where you need to update a node and its children, like when you want to hide a comment and its replies.
I'm thinking ltree (or something like it) it would be useful if it was coupled with a traditional adjacency list ("comment_id"/"parent_comment_id").
Before taking the plunge into using ltree, I'm wondering a few things:
- Are you, or have you, used ltree? Is it what one might call "production ready"?
- If so, what problems did you use it to solve? Did it do a good job?
- Do you think it is a good fit for a
threaded comment system?
- If you used it, what did you use for the "text" part of the path? Did you set up something like the DMOZ example they use "Top.Astronomy.Cosmology" or base it on something like the primary key "1.403.29.5"?
- Is there a better way to do this? I'm a bit nervous using a nested list approach--everything I've read suggests that it isn't all to hot with UPDATES or INSERTS (don't you have to reorder the whole thing?). I'm also not a CS major and that kind of data structure is something I might forget in the future. Is anybody using nested lists for comments or something like it?
If it is of any help, here is the schema I'm considering:
CREATE TABLE comments (
comment_id SERIAL PRIMARY KEY,
parent_comment_id int REFERENCES comments(comment_id) ON UPDATE CASCADE ON DELETE CASCADE,
thread_id int NOT NULL REFERENCES threads(thread_id) ON UPDATE CASCADE ON DELETE CASCADE,
path ltree NOT NULL,
comment_body text NOT NULL,
hide boolean not null default false
);
The "path" column, used by ltree, would look something like:
<thread_id>.<parent_comment_id_#1>.<parent_comment_id_#2>.<my_comment_id>
Is there anything wrong with using the primary keys in the path? Should I be including the node's own primary key in the path? If I did, would it make sense to put a unique index on it to serve as a constraint?