views:

2716

answers:

3

I need to store a tree data structure in my database, for which I plan on using django-treebeard or possibly django-mptt. My source of confusion is that each node could be one of three different possible types: root nodes will always be a type A entity, leaf nodes a type C entity, and anything in between will be a type B entity. I would like to know the best way to model this situation.

update: I first tried model inheritance, and I think that this could be the best way to go. Unfortunately django-treebeard's public API isn't really designed to handle this. I ended up getting it to work with GenericForeignKey. Thank you very much for the answers.

+2  A: 

Your three types are probably easiest handled as FK associations with the fundamental tree.

The tree can be homogenous -- class MyNode is a direct subclass of treebeard.Node. Your node can have a flag (Root, Middle, Leaf) and FK's for A or B or C. This allows you some SQL-like flexibility in querying MyNode instance.

This allows your tree to grow. A node can start as a type C (leaf) and then morph to a type B (intermediate). You change the status, and change the FK's.

The alternative is a bit more complex.

class MyA( treebeard.Node ):
    pass

class MyB( treebeard.Node ):
    pass

class MyC( treebeard.Node ):
    pass

In this case, you can't "morph" a node. When a node starts as a MyC, and gets children, you have to remove the original MyC instance, and replace it with a MyB version that has a new node as a child. This isn't impossible, but it can be painful.

S.Lott
+2  A: 

How about using a generic relation from the model which will hold the tree structure to the content object for the node it represents?

from django.db import models
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes import generic

class Node(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    object = generic.GenericForeignKey('content_type', 'object_id')

This could potentially result in a lot of queries when retrieving content objects for the full tree, but there are ways and means of reducing the number of queries required.

# Assuming mptt, as I'm not familiar with treebeard's API

# 1 query to retrieve the tree
tree = list(Node.tree.all())

# 4 queries to retrieve and cache all ContentType, A, B and C instances, respectively
populate_content_object_caches(tree)
insin
A: 

Well, a lot is already done for you, in a way, because roots, leaves and others are already inherently identified by the tree API. You can call is_root() and is_leaf() on individual nodes to distinguish them.

Leaves and in-betweens can be the same type of entity and hold the same type of data, with the way the data is interpreted and used by the application depending on testing is_leaf().

Roots are somewhat special... they might want to hold information pertinent to the whole tree, and you might like a simple way to look up particular roots and hold extra data. You might do this with a model that has a one-to-one relationship with a root node (perhaps with the save method overloaded and checking to confirm that the node it points to is_root() before allowing the save).

My point overall is that you may not need to get very fancy to do what you want. The distinction you're making is already encapsulated in the concept of the tree and its API and you could probably implement different behavior with the same basic data by checking the context of the node.

Stephen DeGrace