views:

3974

answers:

2

This is more of a "why do things work this way" question rather than a "I don't know how to do this" question...

So the gospel on pulling associated records that you know you're going to use is to use :include because you'll get a join and avoid a whole bunch of extra queries:

Post.all(:include => :comments)

However when you look at the logs, there's no join happening:

Post Load (3.7ms)   SELECT * FROM "posts"
Comment Load (0.2ms)   SELECT "comments.*" FROM "comments" WHERE ("comments".post_id IN (1,2,3,4)) ORDER BY created_at asc)

It is taking a shortcut because it pulls all of the comments at once, but it's still not a join (which is what all the documentation seems to say). The only way I can get a join is to use :joins instead of :include:

Post.all(:joins => :comments)

And the logs show:

Post Load (6.0ms)  SELECT "posts".* FROM "posts" INNER JOIN "comments" ON "posts".id = "comments".post_id

Am I missing something? I have an app with half a dozen associations and on one screen I display data from all of them. Seems like it would be better to have one join-ed query instead of 6 individuals. I know that performance-wise it's not always better to do a join rather than individual queries (in fact if you're going by time spent, it looks like the two individual queries above are faster than the join), but after all the docs I've been reading I'm surprised to see :include not working as advertised.

Maybe Rails is cognizant of the performance issue and doesn't join except in certain cases?

+12  A: 

It appears that the :include functionality was changed with Rails 2.1. Rails used to do the join in all cases, but for performance reasons it was changed to use multiple queries in some circumstances. This blog post by Fabio Akita has some good information on the change (see the section entitled "Optimized Eager Loading").

Greg Campbell
That's exactly what I was looking for! So Rails _is_ being pretty smart! Thanks!
Rob Cameron
See: http://samsaffron.com/archive/2008/03/15/You+should+be+very+careful+when+using+ActiveRecord+eager+loading
Sam Saffron
+6  A: 

The difference between joins and include is that using the include statement generates a much larger SQL query loading into memory all the attributes from the other table(s).

For example, if you have a table full of comments and you use a :joins => users to pull in all the user information for sorting purposes, etc it will work fine and take less time than :include, but say you want to display the comment along with the users name, email, etc. To get the information using :joins, it will have to make separate SQL queries for each user it fetches, whereas if you used :include this information is ready for use.

Great example:

http://railscasts.com/episodes/181-include-vs-joins

holden