I'm trying to add an extra select to a queryset, and I want to add the needed table to the pool of tables in the query using the select_related method, in order to benefit from the '__' syntax.

Here is an example with simple models :

from django.db import models

# Create your models here.

class testA(models.Model):
    code = models.TextField(unique = True)
    date = models.DateTimeField(auto_now_add = True)

class testB(models.Model):
    text = models.TextField()
    a = models.ForeignKey(testA)

And here is the query I want to build :

SELECT (extract(hour from testa.date)) AS hour, testb.text FROM testb INNER JOIN testa ON (testb.a_id = testa.id)

So here is how I build it in Python :

testB.objects.all().select_related('a').extra(select = {'hour' : 'extract(hour from testa.date)'}).values('hour','text')

but Django drops the select_related join when it sees that I'm not using the "testa" table (because of the 'values' call). So the resulting SQL query fails, since "testa" is missing from the FROM clause :

SELECT (extract(hour from testa.date)) AS "hour", "testb"."text" FROM "testb"

If I remove the "values" statement it works fine :

SELECT (extract(hour from testa.date)) AS "hour", "testb"."id", "testb"."text", "testb"."a_id", "testa"."id", "testa"."code", "testa"."date" FROM "testb" INNER JOIN "testa" ON ("testb"."a_id" = "testa"."id")

but I must put the values statement as I want to make aggregates as in "count the b objects grouped by the hour of the date in the a object" :

testB.objects.all().select_related('a').extra(select = {'hour' : 'extract(hour from testa.date)'}).values('hour').annotate(count = Count('pk'))
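To make the target concrete, here is a minimal sketch of the grouped query in plain SQL, run through Python's standard sqlite3 module rather than through Django. It is only an illustration under a few assumptions: it uses SQLite, so strftime('%H', ...) stands in for extract(hour from ...); the table and column names mirror the models above; the sample rows and the helper name count_b_by_hour are made up for the example.

```python
import sqlite3

# In-memory database with two tables mirroring the testA / testB models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE testa (id INTEGER PRIMARY KEY, code TEXT UNIQUE, date TEXT);
    CREATE TABLE testb (id INTEGER PRIMARY KEY, text TEXT,
                        a_id INTEGER REFERENCES testa(id));
""")

# Made-up sample data: two "a" rows at different hours, three "b" rows.
conn.executemany(
    "INSERT INTO testa (id, code, date) VALUES (?, ?, ?)",
    [(1, "a1", "2010-01-01 12:15:00"), (2, "a2", "2010-01-01 13:40:00")],
)
conn.executemany(
    "INSERT INTO testb (text, a_id) VALUES (?, ?)",
    [("x", 1), ("y", 1), ("z", 2)],
)


def count_b_by_hour(connection):
    """Count testb rows grouped by the hour of the related testa.date."""
    return connection.execute("""
        SELECT CAST(strftime('%H', testa.date) AS INTEGER) AS hour,
               COUNT(testb.id) AS n
        FROM testb
        INNER JOIN testa ON (testb.a_id = testa.id)
        GROUP BY hour
        ORDER BY hour
    """).fetchall()


counts = count_b_by_hour(conn)
```

The important part is that the INNER JOIN on testa stays in the FROM clause while only the hour and the count are selected, which is exactly what the queryset with 'values' fails to produce.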

So what is the right way to achieve this, i.e. "count objects grouped by something in another object" ? Or is there a way to "force" Django to keep the "select_related" tables even if it thinks they are useless ?

PS : I know I could use the "tables" argument of the extra statement but in that case I would have to rewrite the join by myself and I want to benefit from the django ORM

+1  A: 

I have developed a Django app to solve this kind of problem : django-cube. The basic idea is to emulate a multi-dimensional DB, in order to easily calculate aggregations.

The functionality you require ('__hour' : a field lookup for 'hour') is not implemented, but implementing it would probably take 15 minutes. So read on to see how it works, and if it fits your needs, write to me and I'll implement it.

The examples on the main page and the api doc are not up-to-date (they will be in a few days), but these snippets are. If you want to give it a try, here is how you would solve this problem with django-cube :

#install the app first ...
from cube.models import Dimension, Cube

class MyCube(Cube):
    #declare a dimension called 'hour_a',
    #that is related to the field 'a__date__hour'.
    hour_a = Dimension(field='a__date__hour')

    #declare how to calculate the aggregation on a queryset
    @staticmethod
    def aggregation(queryset):
        return queryset.count()

And then, there are various methods to calculate the results, check in the snippets... You can for example use :

MyCube(testB.objects.all()).measure_dict('hour_a', full=False)

Which would return something like :

{   
    12: {measure: 889},
    13: {measure: 6654},
    14: {measure: 77},
    #<hour>: <count>
}

Also, don't take the featured download; rather, check out the source (branch 0.3).

I don't know what your real needs are; it might be a little heavy for the use you will have (it was initially made for data visualization purposes).

sebpiq
If it was initially made for data visualization, it's exactly what I'm looking for. The tool I'm trying to build lets users query the database by "writing" sentences like this : "I want to count ObjectA that have ObjectB.date > 20100101 and that have ObjectC.code = 'foo' grouped by ObjectA.code and ObjectC.category". The UI is ajax, and when the user selects the first "model" to query, the filter and group-by choices are filled with possible values. For now I'm trying with SQLAlchemy, but I'll have a look at django-cube. Thanks
Ghislain Leveque
A: 

I can't answer your main query, but it's worth noting that select_related has nothing to do with the __ syntax. select_related is simply an optimisation that will return extra related objects, adding joins to the query if necessary. But the double-underscore syntax for querying related tables works with or without select_related.

Daniel Roseman
A: 

I get the same error locally: everything works until "values" is appended, and then Django fails to generate the appropriate FROM clause. I would post this example to the django-users group and see if someone in the know can verify whether this is a bug or whether there's a quick way to fix it.

ars