Hi, I'm building a food logging database in Django and I've got a query-related problem.

I've set up my models to include (among other things) a Food model connected to the User model through an M2M-field "consumer" via the Consumption model. The Food model describes food dishes and the Consumption model describes a user's consumption of Food (date, amount, etc).

from django.db import models

class Food(models.Model):
    food_name = models.CharField(max_length=30)
    # Quote the through model: Consumption is defined further down.
    consumer = models.ManyToManyField("User", through="Consumption")

class Consumption(models.Model):
    food = models.ForeignKey("Food")
    user = models.ForeignKey("User")

I want to create a query that returns all Food objects ordered by the number of times that Food object appears in the Consumption table for that user (the number of times the user has consumed the food).

I'm trying something along the lines of:

Food.objects.all().annotate(consumption_times=Count('consumer')).order_by('consumption_times')

But this will of course count all Consumption objects related to the Food object, not just the ones associated with the user. Do I need to change my models or am I just missing something obvious in the queries?

This is a pretty time-critical operation (among other things, it's used to fill an autocomplete field in the frontend) and the Food table has a couple of thousand entries, so I'd rather do the sorting on the database side than take the brute-force approach of iterating over the results and running this for each Food:

Consumption.objects.filter(food=food, user=user).count()

and then sorting them in Python. I don't think that method would scale very well as the user base grows, and I want to make the database design as future-proof as I can from the start.

Any ideas?

+4  A: 

Perhaps something like this?

Food.objects.filter(consumer=user)\
            .annotate(consumption_times=Count('consumer'))\
            .order_by('consumption_times')
SmileyChris
But this would only return the Food objects that have been consumed at some time, wouldn't it? I want to return all Food objects, ordered with the most often consumed first. If I filter by user, I won't get the Food that hasn't been consumed yet. One idea would be to do two queries: first one like you suggested, to get all the Food items consumed at least once, and then something along the lines of Food.objects.exclude(consumer=user) to fill out the list with the rest. Would that work?
Jens Alm
Yeah, 2 queries would be how I'd do it.
SmileyChris
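The two-query idea from the comments above can be sketched as follows. This is only an illustration under assumptions: it uses plain SQL through Python's built-in sqlite3 module instead of the Django ORM so that it runs on its own, and the table layout, the function name foods_in_two_queries and the sample rows are all made up for the example.

```python
import sqlite3

def foods_in_two_queries(conn, user_id):
    """Return (food_name, times) for all foods: consumed ones first, then the rest."""
    # Query 1: foods this user has consumed, most consumed first.
    consumed = conn.execute(
        """
        SELECT food.food_name, COUNT(*) AS consumption_times
        FROM food
        JOIN consumption ON food.id = consumption.food_id
        WHERE consumption.user_id = ?
        GROUP BY food.id
        ORDER BY consumption_times DESC, food.food_name
        """,
        (user_id,),
    ).fetchall()
    # Query 2: foods this user has never consumed, reported with a count of 0.
    never_consumed = conn.execute(
        """
        SELECT food.food_name, 0
        FROM food
        WHERE food.id NOT IN (SELECT food_id FROM consumption WHERE user_id = ?)
        ORDER BY food.food_name
        """,
        (user_id,),
    ).fetchall()
    return consumed + never_consumed

# Hypothetical sample data, just enough to exercise both queries.
two_query_conn = sqlite3.connect(":memory:")
two_query_conn.executescript("""
    CREATE TABLE food (id INTEGER PRIMARY KEY, food_name TEXT);
    CREATE TABLE consumption (id INTEGER PRIMARY KEY, food_id INTEGER, user_id INTEGER);
    INSERT INTO food (id, food_name) VALUES (1, 'apple'), (2, 'bread'), (3, 'curry');
    INSERT INTO consumption (food_id, user_id)
        VALUES (1, 123), (1, 123), (2, 123), (2, 456), (3, 456);
""")
two_query_result = foods_in_two_queries(two_query_conn, 123)
print(two_query_result)
```

The cost of this approach is a second round trip to the database and a merge in Python, which is usually acceptable for a table of a few thousand rows.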
+3  A: 

I am having a very similar issue. Basically, I know that the SQL query you want is:

SELECT food.*, COUNT(IF(consumption.user_id=123,TRUE,NULL)) AS consumption_times
       FROM food LEFT JOIN consumption ON (food.id=consumption.food_id)
       GROUP BY food.id
       ORDER BY consumption_times;
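
To see what a conditional count over a LEFT JOIN returns, here is a small sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not the exact statement above: the table layout, the function name foods_by_consumption and the sample rows are made up, CASE WHEN stands in for MySQL's IF(), rows are grouped by food.id so that each food gets its own row, and the sort is descending so the most consumed food comes first.

```python
import sqlite3

def foods_by_consumption(conn, user_id):
    """Return (food_name, consumption_times) for every food, most consumed first."""
    query = """
        SELECT food.food_name,
               -- CASE yields NULL for other users' rows, and COUNT skips NULLs.
               COUNT(CASE WHEN consumption.user_id = ? THEN 1 END) AS consumption_times
        FROM food
        LEFT JOIN consumption ON food.id = consumption.food_id
        GROUP BY food.id
        ORDER BY consumption_times DESC, food.food_name
    """
    return conn.execute(query, (user_id,)).fetchall()

# Hypothetical sample data: user 123 ate apple twice and bread once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE food (id INTEGER PRIMARY KEY, food_name TEXT);
    CREATE TABLE consumption (id INTEGER PRIMARY KEY, food_id INTEGER, user_id INTEGER);
    INSERT INTO food (id, food_name) VALUES (1, 'apple'), (2, 'bread'), (3, 'curry');
    INSERT INTO consumption (food_id, user_id)
        VALUES (1, 123), (1, 123), (2, 123), (2, 456), (3, 456);
""")
rows = foods_by_consumption(conn, 123)
print(rows)
```

Foods the user has never eaten still appear, with a count of 0, because the LEFT JOIN keeps every food row and the condition lives inside the COUNT instead of in a WHERE clause.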

What I wish is that you could mix aggregate functions and F expression, annotate F expressions without an aggregate function, have a richer set of operations/functions for F expressions, and have virtual fields that are basically an automatic F expression annotation. So that you could do:

Food.objects.annotate(consumption_times=Count(If(F('consumer')==user,True,None)))\
            .order_by('consumption_times')

Being able to add your own complex aggregate functions more easily would also be nice. In the meantime, here's a hack that adds an aggregate function to do this.

from django.db.models import aggregates,sql
class CountIf(sql.aggregates.Count):
    sql_template = '%(function)s(IF(%(field)s=%(equals)s,TRUE,NULL))'
sql.aggregates.CountIf = CountIf

consumption_times = aggregates.Count('consumer',equals=user.id)
consumption_times.name = 'CountIf'
rows = Food.objects.annotate(consumption_times=consumption_times)\
                   .order_by('consumption_times')
sparkyb
This is awesome!!! Thanks, man, you saved my day! I'll try to make it look a bit nicer, but you should definitely file this in Django's Trac.
Grégoire Cachet