What is the best way to build the table that will represent the tree? I want to implement a select ,insert ,update and delete that will work well with big data. The select for example will have to support "Expand ALL" - getting all the children (and there children) for a given node.
Use CTE
's.
Given the tree-like table structure:
id parent name
1 0 Electronics
2 1 TV
3 1 Hi-Fi
4 2 LCD
5 2 Plasma
6 3 Amplifiers
7 3 Speakers
, this query will return id
, parent
and depth level, ordered as a tree:
WITH v (id, parent, level) AS
(
SELECT id, parent, 1
FROM table
WHERE parent = 0
UNION ALL
SELECT id, parent, v.level + 1
FROM v
JOIN table t
ON t.parent = v.id
)
SELECT *
FROM v
id parent name
1 0 Electronics
2 1 TV
4 2 LCD
5 2 Plasma
3 1 Hi-Fi
6 3 Amplifiers
7 3 Speakers
Replace parent = 0
with parent = @parent
to get only a branch of a tree.
Provided there's an index on table (parent)
, this query will efficiently work on a very large table, since it will recursively use INDEX LOOKUP
to find all chilrden for each parent.
To update a certain branch, issue:
WITH v (id, parent, level) AS
(
SELECT id, parent, 1
FROM table
WHERE parent = 0
UNION ALL
SELECT id, parent, v.level + 1
FROM v
JOIN table t
ON t.parent = v.id
)
UPDATE table t
SET column = newvalue
WHERE t.id IN
(
SELECT id
FROM v
)
where @parent
is the root of the branch.
Check out Joe Celko's book on trees and hierarchies for multiple ways to tackle the hierarchy problem. The model that you choose will depend on how you weight lookups vs. updates vs. complexity. You can make the lookups pretty fast (especially for getting all children in a node) using the adjacency list model, but updates to the tree are slower.
You have to ask yourself these questions first : 1) What is ratio of modifications vs reads ? (= mostly static tree or changing constantly?) 2) How deep and how large do you expect the tree to grow ?
Nested sets are great for mostly-static trees where you need operations on whole branches. It handles deep trees without problems.
Materialized path works well for dynamic (changing) trees with constrained/predictable depth.
Recursive CTEs are ideal for very small trees, but the branch operations ("get all children in this branch..") get very costly with deep / large tree.
If you have many updates and selects, the best option seems to be the Path Enumeration Model, which is briefly described here:
http://www.sqlteam.com/article/more-trees-hierarchies-in-sql