Lastly, would the option of just adding a new field to each one to show the date of the last book and just updating that the whole time be better?
Actually, it would! This is a common denormalization practice and can be done like this:
from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=200, unique=True)
    latest_pub_date = models.DateTimeField(null=True, blank=True)

    def update_pub_date(self):
        try:
            self.latest_pub_date = self.book_set.order_by('-pub_date')[0].pub_date
            self.save()
        except IndexError:
            pass  # no books yet!


class Book(models.Model):
    pub_date = models.DateTimeField()
    author = models.ForeignKey(Author)

    def save(self, *args, **kwargs):
        super(Book, self).save(*args, **kwargs)
        self.author.update_pub_date()

    def delete(self, *args, **kwargs):
        super(Book, self).delete(*args, **kwargs)
        self.author.update_pub_date()
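The field then keeps itself up to date whenever a book is saved. A minimal usage sketch against the models above (the author name and objects here are made up for illustration):

from django.utils import timezone

# Book.save() runs as part of create(), so the author's latest_pub_date
# is refreshed right away.
author = Author.objects.create(name='Jane Doe')
Book.objects.create(author=author, pub_date=timezone.now())

author = Author.objects.get(pk=author.pk)  # re-fetch to see the stored value
print(author.latest_pub_date)              # the pub_date of the newest book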
This is the third common option you have besides the two already suggested (both sketched below):
- doing it in SQL with a join and grouping
- getting all the books to the Python side and removing duplicates
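Roughly, those two alternatives look like this with the Django ORM (a sketch against the models above; the annotation name `latest` and the `latest_by_author` dict are just illustrative):

from django.db.models import Max

# Option 1: let the database do the join and grouping.
# annotate() produces one query with a LEFT JOIN on book and a GROUP BY author.
for author in Author.objects.annotate(latest=Max('book__pub_date')):
    print(author.name, author.latest)   # latest is None for authors without books

# Option 2: pull the books into Python and keep only the newest one per author.
latest_by_author = {}
for book in Book.objects.order_by('pub_date'):
    # later (newer) books overwrite earlier ones, so each author ends up
    # mapped to the pub_date of their most recent book
    latest_by_author[book.author_id] = book.pub_date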
Both of those options compute the pub_dates from normalized data at the time you read them. Denormalization does that computation for each author at the time you write new data. The idea is that most web apps read far more often than they write, so this approach is usually preferable.
One of the perceived downsides is that you now have the same data in two places and have to keep it in sync. It usually horrifies database people to death :-). But this is rarely a problem as long as you work with the data through your ORM model (which you probably do anyway). In Django it's the app that controls the database, not the other way around.
Another (more realistic) downside is that with the naive code I've shown, a mass update of books can be much slower, since every single save pings the author to recompute its date, no matter what. This is usually solved by having a flag to temporarily disable calling update_pub_date and calling it manually afterwards. Basically, denormalized data requires more maintenance than normalized data.
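A minimal sketch of that flag, assuming it lives in the same models.py as the code above; the DENORMALIZE switch and the bulk_import helper are made up for this example, not a Django feature, and the Book model is the one from above with a single extra check in save():

DENORMALIZE = True  # module-level switch checked by Book.save()

class Book(models.Model):
    pub_date = models.DateTimeField()
    author = models.ForeignKey(Author)

    def save(self, *args, **kwargs):
        super(Book, self).save(*args, **kwargs)
        if DENORMALIZE:                     # the only change from the version above
            self.author.update_pub_date()

def bulk_import(rows):
    """Create many books without pinging the author on every save."""
    global DENORMALIZE
    DENORMALIZE = False
    try:
        for row in rows:
            Book.objects.create(**row)      # no per-book author updates here
    finally:
        DENORMALIZE = True
    # recompute the denormalized field once per author afterwards
    for author in Author.objects.all():
        author.update_pub_date()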