views:

61

answers:

2

I have a set of data that's organized hierarchically that should be able to grow to an arbitrary size. I need to retrieve the entire tree, but I can't figure out how to do it with just SQL. My current solution is to create a temporary table and use a recursive function to successively query branches of the tree and then store the result in the temporary table which I subsequently query again to produce my desired result.

My question is, what I'm doing is essentially what a join does correct? Constructing an intermediate table and then querying over the results. It seems like there should be a way to do it with joins, but the MySQL documentation only covers retrieving parts of a tree up to finite depth. Is there a way to do this? I'm doing this in PHP.

+5  A: 

MySQL doesn't support recursive queries.

I would suggest that you look at Bill Karwin's presentation where he compares four different models for storing heirarchical data and looks at their pros and cons:

  • Adjacency list
  • Path enumeration
  • Nested sets
  • Closure table

Slide 48 shows the relative difficulty of certain types of queries with each of the models. From your question it sounds like you are interested most in "Query subtree", for which the adjacency list (the model you are currently using) performs the most poorly of the four.

Alternatively if you just want to select the entire tree, as in all data in the table, then you could use the simple query SELECT * FROM yourtable and reconstruct the tree structure in the client.

Mark Byers
Karwin's *SQL Antipatterns* book is a good read about these sort of things.
Daniel Vandersluis
Thanks, that was a great read. Nested sets ends up being something of an ideal solution, although it complicates the SQL somewhat.
Josh
Also, I had thought about grabbing the whole table, but I need to be able to grab specific branches as well.
Josh
+1  A: 

need more data.. does the table only represent a single tree, or multiple trees? If it's a single tree, you can just select everything from the table, then build the tree structure in memory. If it is multiple trees, you could consider adding a treeID to each tree element to represetn the tree the element belongs to.

If you are looking to select a branch of a tree, you can consider storing elements with sequential integer "sort ranks" and links to the left and right nodes, then selecting all nodes within the integer range of the left most node and right most node.

Look up adjacency list for more information on this storage model. You can also create a hybrid adjacency list/ parent node link, since data storage is so cheap, but you may have more overhead keeping sort links updated...

Zak