views:

178

answers:

3

Lets say I have model inheritance set up in the way defined below.

class ArticleBase(models.Model):
    title = models.CharField()
    author = models.CharField()

class Review(ArticleBase):
    rating = models.IntegerField()

class News(ArticleBase):
    source = models.CharField()

If I need a list of all articles regardless of type (in this case both Reviews and News) ordered by title, I can run a query on ArticleBase. Is there an easy way once I have an ArticleBase record to determine if it relates to a Review or a News record without querying both models to see which has the foreign key of the record I am on?

A: 

You don't ever need to know the subclass of an ArticleBase. The two subclasses -- if you design this properly -- have the same methods and can be used interchangeably.

Currently, your two subclasses aren't trivially polymorphic. They each have unique attributes.

For the 80% case, they have nothing to do with each other, and this works out fine.

The phrase "all articles regardless of type" is a misleading way to describe what you're doing. It makes it sound simpler than it really is.

What you're asking for a is a "union of News and Reviews". I'm not sure what you'd do with this union -- index them perhaps. For the index page to work simply, you have to delegate the formatting details to each subclass. They have to emit an HTML-friendly summary from a method that both subclasses implement. This summary is what your template would rely on.

class Review(ArticleBase):
    rating = models.IntegerField()
    def summary( self ):
        return '<span class="title">%s</span><span class="author">%s</span><span class="rating">%s</span>' %  ( self.title, self.author, self.rating )

A similar summary method is required for News.

Then you can union these two disjoint types together to produce a single index.

S.Lott
There is added complication when you want to order by date, which now has to be performed in memory instead of the database. But my example is simplified actually I am dealing with blogs, news, reviews, buyer's guides, and galleries. I need to just have a list of the latest objects, with their titles, urls, and authors.
Jason Christa
@Jason Christa: Correct. However, this use case (union everything) is often rare. If the union is not a rare use case, you've got the wrong model: you need to have a single table with lots of nullable fields if the union is a common need.
S.Lott
A: 

A less sophisticated yet practical approach might be to have a single model with an article_type field. It's a lot easier to work with all articles and you're just a simple filter clause away from your news and review query sets.

I'd be interested to hear the arguments against this or the scenarios in which it might prove to be problematic. I've used this and inheritance at various times and haven't reached a clear conclusion on the best technique.

andybak
Cons against a single table are wasted space, harder to add new article types later, harder to enforce required fields that don't apply to all types, less efficient when only querying against a single type and changing your model to accommodate the ORM. To solve this problem in SQL you just do UNION (SELECT...), (SELECT...), (SELECT...) ORDER BY date;
Jason Christa
But without UNION support in the Django ORM you must either use custom SQL or use Python to join (and maybe filter) querysets. Depending on your application this could be more costly than the disadvantages you list above. I think it is a tricky case and may depend on the specific requirements - do you need to treat the two models the same more often than you need to treat them differently? And do you need to optimize for performance, RAM, disk space or readability?
andybak
+1  A: 

I am assuming that all ArticleBase instances are instances of ArticleBase subclasses.

One solution is to store the subclass name in ArticleBase and some methods that return the subclass or subclass object based on that information. As multi-table inheritance defines a property on the parent instance to access a child instance, this is all pretty straight forward.

from django.db import models

class ArticleBase(models.Model):
    title = models.CharField()
    author = models.CharField()
    # Store the actual class name.
    class_name = models.CharField()

    # Define save to make sure class_name is set.
    def save(self, *args, **kwargs):
        self.class_name = self.__class__.__name__
        super(ArticleBase, self).save(*args, **kwargs)

    # Multi-table inheritance defines an attribute to fetch the child
    # from a parent instance given the lower case subclass name.
    def get_child(self):
        return getattr(self, self.class_name.lower())

    # If indeed you really need the class.
    def get_child_class(self):
        return self.get_child().__class__

    # Check the type against a subclass name or a subclass.
    # For instance, 'if article.child_is(News):'
    # or 'if article.child_is("News"):'.
    def child_is(self, cls):
        if isinstance(cls, basestring):
            return cls.lower() == self.class_name.lower()
        else:
            return self.get_child_class()  == cls

class Review(ArticleBase):
    rating = models.IntegerField()

class News(ArticleBase):
    source = models.CharField()

This is by no means the only way to go about this. It is, however, a pretty simple and straight forward solution. The excellent contrib contenttypes app and the generic module which leverages it offer a wealth of ways to do this, well, generically.

It could be useful to have the following in ArticleBase:

def __unicode__(self)
    return self.get_child().__unicode__()

In that case, be aware that failure to define __unicode__ in the subclasses, or calling __unicode__ on an instance of ArticleBase (one that has not been subclassed) would lead to an infinite recursion. Thus the admonition below re sanity checking (for instance, preventing just such an instantiation of ArticleBase directly).

Disclaimer:

This code is untested, I'm sure I've got a typo or two in there, but the basic concept should be sound. Production level code should probably have some sanity checking to intercept usage errors.

AmanKow