views:

200

answers:

2

I am looking at quite a monstrosity of an application that uses several tables to represent a reporting hierarchy, but eacvh of these tables is identical. The table at the bottom of the hierarchy has the most records, each of which has a ParentID to the rollup table above it, eventually all adding up to only one total in the top rollup table.

I am plagued to insanity by gargantuan 'if' blocks of code with hard-coded joins and table names, and I am trying hard to recognise some sane reason for not using a single table, with a levelID in each row, instead of one table for each level, for all these levels, or at least several views on the same table. The latter because the database was designed to be used in MSAccess, which doesn't allow aliased sub-queries AFAIK.

+3  A: 

Are the tables all the same in their semantics or just in their form? If the tables are identical in both form and semantics, then a single table solution is probably far superior, but I don't know enough about your case to say for sure.

Representing a hierarchy in an SQL table can be quite a challenge. Fortunately, reporting hierarchies are generally small enough and stable enough so that a variety of techniques will work.

The simplest technique goes by the name "adjacency list" model. In this model, there are two columns one of which refers to the other. I'll call them MyTable(ID, ParentID). In a real case, Mytable will have other columns. ParentID references ID in a different row of the same table. This is easy to implement, and easy to update. It can be a pain to do rollups.

Another technique goes by the name "nested sets". Call it MyTable (ID, lft, rgt, Level). (Level is redundant, and often omitted). Here we have two columns (lft and rgt) that show where the row fits into the hierarchy, because lgt and rgt are nested inside the lft and rgt of all the ancestors of the node in question. This technique is hard to update. It's easy to do rollups, and to find subtrees, and ancestor paths, and lots of other types of queries.

A third way to "flatten the hierarchy". In this technique, each level of the hierarchy has a named column of its own, and each row displays its entire ancestry all the way back to the apex of the hierarchy. Here we have MyTable (ID, Division, Department, Group, Team). Division, Department, Group and Team are all levels of the hierarchy. This is ultimately easy for users who access the data via a point and click drill down interface, because there's nothing for them to learn, if the column names are chosen well. It requires a name for each level. It does not adapt well to indefinite levels of hierarchy. It's got a lot of redundancy. In general flattened hiearachies are generated automatically from a table that stores hierarchies in adajency list form or nested set form.

Any one of these are a good alternative to separate tables for every level in the hierarchy.

Walter Mitty
+1 good coverage of the hierarchy problem in relational databases. I also usually point people to Joe Celko's book on the subject for much more in depth coverage of the topic. Just do an Amazon search on "Joe Celko hierarchy".
Tom H.
A: 

excellent solutions are suggested here already. I would add one to use a hierarchical database such as hypergraph.

shajoshi