views:

145

answers:

1

In a Django App I'm working on I've got this going on:

class Parent(models.Model):
    name = models.CharField(...)

    def num_children(self):
        return Children.objects.filter(parent=self).count()

    def avg_child_rating(self):
        return Child.objects.filter(parent=self).aggregate(Avg('rating'))

class Child(models.Model):
    name = models.CharField(...)
    parent = models.ForeignKey(Parent)
    rating = models.IntegerField(default=0)

I plan on accessing avg_child_rating often. Would it be optimizing if I did the following:

class Parent(models.Model):
    ...
    num_children = models.IntegerField(default=0)
    avg_child_rating = models.FloatField(default=0.0)

def update_parent_child_stats(sender, instance, **kwargs):
    num_children = Child.objects.filter(parent=instance.parent)
    if instance.parent.num_children != num_children:
        instance.parent.num_children = num_children
        instance.parent.avg_child_rating = Child.objects.filter(instance.parent=self).aggregate(Avg('rating'))

post_save.connect(update_parent_child_stats, sender=Child)
post_delete.connect(update_parent_child_stats, sender=Child)

The difference now is that every time a child is created/rated/deleted, the Parent object is updated. I know that the created/rating will be done often.

What's more expensive?

+3  A: 

Depends on the scale of the problem. If you anticipate a lot of write traffic, this might be an issue. It's much harder to scale writes than reads (replicate, caching etc.) That said, you can probably going a long way without this extra query causing you any problems.

Depending on how up-to-date your stats must be you could have some other process (non-web session) come through and update these stats nightly.

rhettg