ansaurus

Question

Finding multiple instances of a tag with Django "through" field

Answer 1

+1 A:

I think I understood what you need. This one made me think, thanks! :-)

I believe the equivalent SQL query would be something like:

select t.name, s.name, count(user_id) count_of_users
  from yourapp_annot a, yourapp_tag t, yourapp_sample s 
 where a.tag_id = t.id 
   and s.id = a.sample_id
group by t.name, s.name
having count_of_users > 1

While I try hard not to think in SQL when I'm coming up with django model navigation (it tends to get in the way); when it comes to aggregation queries it always helps me to visualize what the SQL would be.

In django we now have aggregations.

Here is what I came up with:

models.Annot.objects.select_related().values(
  'tag__name','sample__name').annotate(
  count_of_users=Count('user__id')).filter(count_of_users__gt=1)

The result set will contain the tag, the sample, and the count of users that tagged said sample with said tag.

Breaking it apart for the folks that are not used to django aggregation:

models.Annot.objects.select_related()

select_related() is forcing all tables related to Annot to be retrieved in the same query
This is what will allow me to specify tag__name and sample__name in the values() call

values('tag__name','sample__name')

values() is limiting the fields to retrieve to tag.name and sample.name
This makes sure that my aggregation on count of clients will group by just these fields

annotate(count_of_users=Count('user__id'))

annotate() adds an aggregation as an extra field to a query

filter(count_of_users__gt=1)

And finally I filter on the aggregate count.

If you want to add an additional filter on what users should be taken into account, you need to do this:

models.Annot.objects.filter(user=[... list of users...]).select_related().values(
  'tag__name','sample__name').annotate(
  count_of_users=Count('user__id')).filter(count_of_users__gt=1)

I think that is it.

One thing... Notice that I used tag__name and sample__name in the query above. But your models do not specify that tag names and sample names are unique.

Should they be unique? Add a unique=True to the field definitions in the models.

Shouldn't they be unique? You need to replace tag__name and sample__name with tag__id and sample__id in the query above.

celopes 2009-12-17 17:58:38

looks great ... with a great explanation, checking it now

JudoWill 2009-12-17 18:10:51

works perfectly, couldn't have asked for anything better!

JudoWill 2009-12-17 18:17:05

Happy it helped. Check my edit about uniqueness, please.

celopes 2009-12-17 18:24:21

actually I changed it to:.values('tag','sample')which is equivalent to 'tag__id' and 'sample__id'

JudoWill 2009-12-17 18:35:57

ansaurus

tags:

views:

answers:

Finding multiple instances of a tag with Django "through" field

related questions