views:

910

answers:

5

I'm using Django to write a blog app, and I'm trying to implement a hierarchical category structure. Each category has a "parent" ForeignKey pointing back to the same Category model. I want to allow admins to add categories, and I want the interface to allow them to select a category's parent category. However, I want to avoid I'm-my-own-grandpa situations, so I want to limit the available choices of categories to those which do not have category in question as an ancestor.

Right now, I'm controlling this from the view:

parent_candidates = list(Category.objects.all())
pruned_parent_list = [cat for cat in parent_candidates if instance.id not in cat.getHierarchy()]

where instance is the category being edited and getHierarchy() is a method to get a list of ancestor ids.

There are a number of problems with this approach. In particular, it uses an extra database hit to get the list of all categories and it makes me write the selection mechanism into my template by looping through pruned_parent_list to get the options, when I'd really rather just specify a widget.

Is there any better way to do this? I know I can add custom validation on the back-end to prevent this, but why give users the option?

A: 

"Is there any better way to do this?" Not really. Hierarchies are hard in the relational model. Nothing makes the easier except giving up on SQL entirely.

"write the selection mechanism into my template by looping through pruned_parent_list to get the options" -- probably not optimal. This should happen in your view.

S.Lott
A: 

If I understand your predicament correctly, the problem itself lies with the way you're dealing with which categories can be parents and which ones can't. One option to avoid these problems is to actually limit the level of categories which can become parents. For example, let's say you have the following categories:

  • Internet
    • Google
    • Yahoo
  • Offline
    • MS Office
    • OpenOffice

The way I usually handle this is I obviously have a parent_id FK on the categories table. For the root elements (Internet, Offline), the parent_id would be 0. So, when in your view you're trying to retrieve the "parent categories" for the dropdown, you need to decide how far down can they keep nesting. I mostly limit this to the first level, so to choose which categories to show in your dropdown, you'd do something like:

parents = Category.objects.filter(parent_id=0)

Now obviously, this limits the approach somewhat, but you can increase the level you'd like to include and work out some kind of visual identification system in your template for the dropdown (including extra spaces or dashes for each level in the hierarchy or something).

Anyway, sorry about the long response, and hopefully this addressed your issue somewhat.

Dave
Exactly. Treat categories as "Branch Categories" and "Leaf Categories".
Soviut
A: 

I am not sure if this is better (interaction-wise or otherwise) but...

You can check hierarchical integrity on save and raise an error if necessary.

Ideally for such a data type I would like to see a tree of instances on the side. Or at least the full ancestory on the object detail view. In both cases you'd already have done the extra trip mentioned to the database.

muhuk
"I know I can add custom validation on the back-end to prevent this, but why give users the option?" I think the point was to avoid checking on save and raising an exception.
S.Lott
+1  A: 

I have had to deal with arbitrary-depth categories on SQL and it seems not well suited for storing data of this type in a normal form, as nested queries and/or multiple JOINs tend to get ugly extremely quickly.

This is almost the only case where I would go with a sort of improper solution, namely to store categories in string form, subcategories separated by a delimiter. It makes both database queries and other operations much more trivial.

Categories table would look something like this:

id    name
1     Internet
2     Internet/Google
3     Internet/Yahoo
4     Offline
5     Offline/MS Office/MS Excel
6     Offline/Openoffice

Another solution is, that depending on your expected usage, you can maybe implement binary tree in the category list. That allows to select category trees and parent/child relationships elegantly. It has the limitation, however, that upon inserting new categories the whole tree may have to be recalculated, and that knowing the approximate size of tree in advance is useful.

At any rate, hierarchical data in SQL is not trivial by itself, so, whatever you do, you will probably have to do quite some custom coding.

Gnudiff
+1  A: 

Take a look at the django-treebeard app.

akaihola