views:

1131

answers:

3

I am currently in the process of rewriting an application whereby teachers can plan curriculum online.

The application guides teachers through a process of creating a unit of work for their students. The tool is currently used in three states but we have plans to get much bigger than that.

One of the major draw cards of the application is that all of the student outcomes are preloaded into the system. This allows teachers to search or browse through and select which outcomes are going to be met in each unit of work.

When I originally designed the system I made the assumption that all student outcomes followed a similar Hierarchy. That is, there are named nested containers and then outcomes.

The original set of outcomes that I entered was three tiered. As such my database has the following structure:

=========================

Tables in bold

h1

id, Name

h2

id, parent___id (h1_id), Name

h3

id, parent___id (h2_id), Name

outcome

id, parent___id (h3_id), Name

=========================

Other than the obvious inability to add n/ levels of hierarchy this method also made it difficult to display a list of all of the standards without recursively querying the database.

Once the student outcomes (and their parent categories) have been added there is very little reason for them to be modified in any way. The primary requirement is that they are easy and efficient to read.

So far all of the student outcomes from different schools / states / countries have roughly followed my assumption. This may not always be the case.

All existing data must of course be transferred across from the current database.

Given the above, what is the best way for me to store all the different sets of student outcomes? Some of the ideas I have had are listed below.

  • Continue using 4 tables in the database, when selecting either use recusion or lots of joins

  • Use nested sets

  • XML (Either a global XML file for all of the different sets or an XML file for each)

+6  A: 

I don't know that you actually need 4 tables for this.

If you have a single table that tracks the parent_id and a level you can have infinite levels.

outcome

id, parent_id, level, name

You can use recursion to track through the tree for any particular element (you don't actually need level, but it can be easier to query with it).

The alternative is nested sets. In this case you would still merge to a single table, but use the set stuff to track levels.

Which one to use depends on your application.

Read-intensive: nested sets

Write-intensive: parent tree thingy

This is because with nested sets you can retrieve the entire tree with a single query but at the cost of reordering the entire tree every time you insert a new node.

When you just track the parent_id, you can move or delete nodes individually.

PS: I vote no to XML. You have the same recursive issues, plus the overhead of parsing the data as well as either storing it in the db or on the filesystem (which will cause concurrency issues).

Toby Hede
Thanks for the advice, sometimes you just need a fresh opinion to point out the blatantly obvious.Cheers :)
Danny
+1  A: 

I agree with the other poster - nested sets is the way to go I think.

See here:

http://dev.mysql.com/tech-resources/articles/hierarchical-data.html

It explains the theory and compares it to what you are already using - which is a twist on adjacency really. It shows +/- of them all, and should help you reach a decision based on all of the subtleties of your project.

Another thing I've seen (in CakePHP's tree behaviour) is actually to use both at once. Sure its not great performance wise, but under this model, you insert/remove things just as you would with adjacency, and then there is a method to run to rebuild the left/right edge values to allow you to do the selects in a nested sets fashion. Result is you can insert/delete much more easily.

http://book.cakephp.org/view/91/Tree

benlumley
I have come across a CI class that allows for a combination Nested set and adjacency model similar to the cakephp version you have linked to which I may give a try.I will post back with the results.
Danny
A: 

hi,

there is another way to handle trees in a database that is maybe not as "smart" than nested sets and other patterns described here, but that is really efficient and easy :

instead of storing the level (or depth) of an item, you can store the full path in the tree, like this :

A
  B
  C
    D
  E

would be stored like this:

item  |  parent  |  path
----------------------------
A     |  NULL    |  A
B     |  A       |  A--B
C     |  A       |  A--C
D     |  C       |  A--C--D
E     |  A       |  A--E

then you can easyly get:

  • (pure SQL) all direct children of an item with a where parent = '' clause
  • (pure SQL) all direct and indirect children with a where path LIKE 'PARENT--%' clause
  • (PHP) the depth of the node (count(explode('--',$path))

those features are good enough in most situations, and quite performant, even with several sublevels, as long as you create the good indices (PK, index on parent, index on path). For sure, this solution is demanding when deleting/moving nodes to update pathes...

I hope this helps!

Gauthier