views:

78

answers:

3

I have three models that look something like this:

class Bucket < ActiveRecord::Base
  has_many :entries
end

class Entry < ActiveRecord::Base
  belongs_to :submission
  belongs_to :bucket
end

class Submission < ActiveRecord::Base
  has_many :entries
  belongs_to :user
end

class User < ActiveRecord::Base
   has_many :submissions
end

When I retrieve a collection of entries doing something like:

@entries = Entry.find(:all,
                      :conditions => ['entries.bucket_id = ?', @bucket],
                      :include    => :submission)

The performance is pretty quick although I get a large number of extra queries because the view uses the Submission.user object. However, if I add the user to the :include statement, the performance becomes terrible and it takes over a minute to return a total of 50 entries and submissions spread across 5 users. When I run the associated SQL commands, they complete in well under a second - the SQL query performance is the same from each set of queries.

@entries = Entry.find(:all,
                      :conditions => ['entries.bucket_id = ?', @bucket],
                      :include    => {:submission => :user})

Why would this second command have such terrible performance compared to the first?

A: 

it's because you have a double join in second statement. So the number of result is bigger.

More bigger the result is much slow it's.

shingara
Yes, but Ruby on Rails doesn't actually use INNER JOINs when it uses the :include statement. Instead it issues multiple select statements for each level of the query. That behavior changed in 2.0
Jeremiah Peschka
A: 

I'm not sure if the performance will be much better, but try:

@bucket.entries.find(:all, :include => {:submission => :user})

I'm really only familiar with MySQL, but if your database supports them, indexes will help performance significantly. I usually do this in my migrations like:

def self.up
  create_table :entries do |t|
    t.references :submission 
    t.references :bucket
    # other fields...
  end

  add_index :entries, :submission_id
  add_index :entries, :bucket_id
end
Beerlington
A: 

This ended up being a problem with the serialization/deserialization of the user model in the entire object graph. By caching relevant data on the Entry and Submission models we were able to avoid the lookup to User and saved a considerable amount of time.

Jeremiah Peschka