tags:

views:

1213

answers:

4

What's the best and/or fastest method of doing multijoin queries in Django using the ORM and QuerySet API?

+2  A: 

If you are trying to join across tables linked by ForeignKeys or ManyToManyField relationships then you can use the double underscore syntax. For example if you have the following models:

class Foo(models.Model):
    name = models.CharField(max_length=255)

class FizzBuzz(models.Model):
    bleh = models.CharField(max_length=255)

class Bar(models.Model):
    foo = models.ForeignKey(Foo)
    fizzbuzz = models.ForeignKey(FizzBuzz)

You can do something like:

Fizzbuzz.objects.filter(bar_set__foo__name = "Adrian")
ordord00
that doesn't work:FieldError: Cannot resolve keyword 'bar_set' into field. Choices are: bar, bleh, id
naw
use: Fizzbuzz.objects.filter(bar__foo__name = "Adrian")
naw
This may have have changed in the 1.5 years since I posted it originally. Thanks for the fix/update naw!
ordord00
+1  A: 

Don't use the API ;-) Seriously, if your JOIN are complex, you should see significant performance increases by dropping down in to SQL rather than by using the API. And this doesn't mean you need to get dirty dirty SQL all over your beautiful Python code; just make a custom manager to handle the JOINs and then have the rest of your code use it rather than direct SQL.

Also, I was just at DjangoCon where they had a seminar on high-performance Django, and one of the key things I took away from it was that if performance is a real concern (and you plan to have significant traffic someday), you really shouldn't be doing JOINs in the first place, because they make scaling your app while maintaining decent performance virtually impossible.

Here's a video Google made of the talk: http://www.youtube.com/watch?v=D-4UN4MkSyI&feature=PlayList&p=D415FAF806EC47A1&index=20

Of course, if you know that your application is never going to have to deal with that kind of scaling concern, JOIN away :-) And if you're also not worried about the performance hit of using the API, then you really don't need to worry about the (AFAIK) miniscule, if any, performance difference between using one API method over another.

Just use: http://docs.djangoproject.com/en/dev/topics/db/queries/#lookups-that-span-relationships

Hope that helps (and if it doesn't, hopefully some true Django hacker can jump in and explain why method X actually does have some noticeable performance difference).

machineghost
"if your JOIN are complex, you should see significant performance increases by dropping down in to SQL" - this makes no sense as a blanket statement. It may be true that the ORM creates poor SQL in certain edge cases, but for most JOINs it writes the same SQL you would write.
Carl Meyer
+1  A: 

Use the queryset.query.join method, but only if the other method described here (using double underscores) isn't adequate.

bnjmnhggns
A: 

Caktus blog has an answer to this: http://www.caktusgroup.com/blog/2009/09/28/custom-joins-with-djangos-queryjoin/

Basically there is a hidden QuerySet.query.join method that allows adding custom joins.

extropy