tags:

views:

30

answers:

1

When I use extra in a certain way on a Django queryset (call it qs), the result of qs.count() is different than len(qs.all()). To reproduce:

Make an empty Django project and app, then add a trivial model:

class Baz(models.Model):
    pass

Now make a few objects:

>>> Baz(id=1).save()
>>> Baz(id=2).save()
>>> Baz(id=3).save()
>>> Baz(id=4).save()

Using the extra method to select only some of them produces the expected count:

>>> Baz.objects.extra(where=['id > 2']).count()
2
>>> Baz.objects.extra(where=['-id < -2']).count()
2

But add a select clause to the extra and refer to it in the where clause, and the count is suddenly wrong, even though the result of all() is correct:

>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).all()
[<Baz: Baz object>, <Baz: Baz object>]   # As expected
>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).count()
0   # Should be 2

I think the problem has to do with django.db.models.sql.query.BaseQuery.get_count(). It checks whether the BaseQuery's select or aggregate_select attributes have been set; if so, it uses a subquery. But django.db.models.sql.query.BaseQuery.add_extra adds only to the BaseQuery's extra attribute, not select or aggregate_select.

How can I fix the problem? I know I could just use len(qs.all()), but it would be nice to be able to pass the extra'ed queryset to other parts of the code, and those parts may call count() without knowing that it's broken.

A: 

Redefining get_count() and monkeypatching appears to fix the problem:

def get_count(self):
    """
    Performs a COUNT() query using the current filter constraints.
    """
    obj = self.clone()
    if len(self.select) > 1 or self.aggregate_select or self.extra:
        # If a select clause exists, then the query has already started to
        # specify the columns that are to be returned.
        # In this case, we need to use a subquery to evaluate the count.
        from django.db.models.sql.subqueries import AggregateQuery
        subquery = obj
        subquery.clear_ordering(True)
        subquery.clear_limits()

        obj = AggregateQuery(obj.model, obj.connection)
        obj.add_subquery(subquery)

    obj.add_count_column()
    number = obj.get_aggregation()[None]

    # Apply offset and limit constraints manually, since using LIMIT/OFFSET
    # in SQL (in variants that provide them) doesn't change the COUNT
    # output.
    number = max(0, number - self.low_mark)
    if self.high_mark is not None:
        number = min(number, self.high_mark - self.low_mark)

    return number

django.db.models.sql.query.BaseQuery.get_count = quuux.get_count

Testing:

>>> Baz.objects.extra(select={'negid': '0 - id'}, where=['"negid" < -2']).count()
2

Updated to work with Django 1.2.1:

def basequery_get_count(self, using):
    """
    Performs a COUNT() query using the current filter constraints.
    """
    obj = self.clone()
    if len(self.select) > 1 or self.aggregate_select or self.extra:
        # If a select clause exists, then the query has already started to
        # specify the columns that are to be returned.
        # In this case, we need to use a subquery to evaluate the count.
        from django.db.models.sql.subqueries import AggregateQuery
        subquery = obj
        subquery.clear_ordering(True)
        subquery.clear_limits()

        obj = AggregateQuery(obj.model)
        obj.add_subquery(subquery, using=using)

    obj.add_count_column()
    number = obj.get_aggregation(using=using)[None]

    # Apply offset and limit constraints manually, since using LIMIT/OFFSET
    # in SQL (in variants that provide them) doesn't change the COUNT
    # output.
    number = max(0, number - self.low_mark)
    if self.high_mark is not None:
        number = min(number, self.high_mark - self.low_mark)

    return number
models.sql.query.Query.get_count = basequery_get_count

I'm not sure if this fix will have other unintended consequences, however.

Vebjorn Ljosa