views:

19

answers:

0

I have a ShowRating model which keeps a log of every show rated on by a user and have been keeping an average rating and ratings count cached inside my Show model, so I can access it in the show controller without rebuilding this data on the fly. I've been keeping an eye on the SQL queries executed and I'm curious about why Rails chose to execute a few of them.

First, my models:

# show.rb
class Show < ActiveRecord::Base
  has_many :show_ratings
  ...
end

# user.rb
class User < ActiveRecord::Base
  has_many :show_ratings
  ...
end

# show_rating.rb
class ShowRating < ActiveRecord::Base
  belongs_to :show
  belongs_to :user

  validates_associated :show, :on => :create, :message => "That is an invalid show"
  validates_associated :user, :on => :create, :message => "That is an invalid user"
  validates_uniqueness_of :show_id, :scope => :user_id, :on => :create, :message => "That show has already been rated by this user"

  after_destroy :calculate_average
  after_create :calculate_average
  after_update :calculate_average

  private

  def calculate_average
    show = Show.find(show_id)
    ratings = show.show_ratings

    show.avg_rating = ratings.average(:rating).to_f
    show.avg_rating_count = ratings.count
    show.save
  end
end

Now running ShowRating.create!(:show_id => 1, :user_id => 10, :rating => 7) from the rails console, produces the following in the log file:

SQL (0.1ms)  BEGIN
Show Load (0.1ms)  SELECT `shows`.* FROM `shows` WHERE (`shows`.`id` = 1) LIMIT 1
User Load (0.3ms)  SELECT `users`.* FROM `users` WHERE (`users`.`id` = 10) LIMIT 1
ShowRating Load (0.2ms)  SELECT `show_ratings`.`id` FROM `show_ratings` WHERE (`show_ratings`.`show_id` = 1) AND (`show_ratings`.`user_id` = 10) LIMIT 1
SQL (0.2ms)  INSERT INTO `show_ratings` (`comment`, `created_at`, `rating`, `show_id`, `updated_at`, `user_id`) VALUES (NULL, '2010-10-13 22:46:37', 7, 1, '2010-10-13 22:46:37', 10)
Show Load (0.1ms)  SELECT `shows`.* FROM `shows` WHERE (`shows`.`id` = 1) LIMIT 1
SQL (0.2ms)  SELECT AVG(`show_ratings`.`rating`) AS avg_id FROM `show_ratings` WHERE (`show_ratings`.show_id = 1)
SQL (0.2ms)  SELECT COUNT(*) AS count_id FROM (SELECT 1 FROM `show_ratings` WHERE (`show_ratings`.show_id = 1)) AS subquery
SQL (0.4ms)  UPDATE `shows` SET `avg_rating` = 15.4545, `avg_rating_count` = 11, `updated_at` = '2010-10-13 22:46:37' WHERE (`shows`.`id` = 1)
SQL (0.5ms)  COMMIT

That's 8 queries. Right off the bat, I'm wondering why it's selecting the values for shows.* twice and why it's even bothering with the user, who never gets touched. Are the 'Load' queries done because of the ShowRating associations defined? Rails doesn't seem to be smart enough not to query all the show data again when I perform the find in calculate_average.

In fact, I already know the show_id that I need to perform my search through ShowRatings, so it shouldn't need do that lookup either. I thought I'd be able to do something like show_id.show_ratings instead, and it's smart enough to interpret that integer value as the key to look up all of the show rating, but no dice.

It's also unnecessarily loading data from the ShowRating model in the 3rd query, which I never access.

I did try and run .increment on the user's avg_rating_count, thinking it would be smart enough to not have to run that COUNT(*), but it still turned up in the logs. Perhaps decrement calls it behind the scenes?

So really, my main question is, is this just a side effect of behind the scenes work that Rails is doing, or is there something I could be doing to optimize?

Side question: My validates_associated doesn't work for my user data. I can specify a non existent user_id and don't get back an error, even though doing so for a non existing show_id returns one. Oddly enough though, the show validation doesn't use the error message I provided, instead it still uses the generic Couldn't find Show with ID=1000

I understand that validates_associated runs validations on other associated models, but that doesn't even sound like what I want. I just want to make sure that the show and user references in new ratings exist based on the IDs provided.

Finally, is there a more succinct way to define after_create/update/destroy callbacks that all go to the same method, instead of doing it in 3 separate lines like I did?