I think I understood what you need. This one made me think, thanks! :-)
I believe the equivalent SQL query would be something like:
select t.name, s.name, count(user_id) count_of_users
from yourapp_annot a, yourapp_tag t, yourapp_sample s
where a.tag_id = t.id
and s.id = a.sample_id
group by t.name, s.name
having count_of_users > 1
While I try hard not to think in SQL when I'm coming up with django model navigation (it tends to get in the way); when it comes to aggregation queries it always helps me to visualize what the SQL would be.
In django we now have aggregations.
Here is what I came up with:
models.Annot.objects.select_related().values(
'tag__name','sample__name').annotate(
count_of_users=Count('user__id')).filter(count_of_users__gt=1)
The result set will contain the tag, the sample, and the count of users that tagged said sample with said tag.
Breaking it apart for the folks that are not used to django aggregation:
models.Annot.objects.select_related()
- select_related() is forcing all tables related to Annot to be retrieved in the same query
- This is what will allow me to specify
tag__name
and sample__name
in the values() call
values('tag__name','sample__name')
- values() is limiting the fields to retrieve to tag.name and sample.name
- This makes sure that my aggregation on count of clients will group by just these fields
annotate(count_of_users=Count('user__id'))
- annotate() adds an aggregation as an extra field to a query
filter(count_of_users__gt=1)
- And finally I filter on the aggregate count.
If you want to add an additional filter on what users should be taken into account, you need to do this:
models.Annot.objects.filter(user=[... list of users...]).select_related().values(
'tag__name','sample__name').annotate(
count_of_users=Count('user__id')).filter(count_of_users__gt=1)
I think that is it.
One thing... Notice that I used tag__name and sample__name in the query above. But your models do not specify that tag names and sample names are unique.
Should they be unique? Add a unique=True
to the field definitions in the models.
Shouldn't they be unique? You need to replace tag__name and sample__name with tag__id and sample__id in the query above.