views:

150

answers:

3

I'm trying to avoid the N+1 queries problem with eager loading, but it's not working. The associated models are still being loaded individually.

Here are the relevant ActiveRecords and their relationships:

class Player < ActiveRecord::Base
  has_one :tableau
end

Class Tableau < ActiveRecord::Base
  belongs_to :player
  has_many :tableau_cards
  has_many :deck_cards, :through => :tableau_cards
end

Class TableauCard < ActiveRecord::Base
  belongs_to :tableau
  belongs_to :deck_card, :include => :card
end

class DeckCard < ActiveRecord::Base
  belongs_to :card
  has_many :tableaus, :through => :tableau_cards
end

class Card < ActiveRecord::Base
  has_many :deck_cards
end

class Turn < ActiveRecord::Base
  belongs_to :game
end

and the query I'm using is inside this method of Player:

def tableau_contains(card_id)
  self.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', self.tableau.id]
  contains = false
  for tableau_card in self.tableau.tableau_cards
    # my logic here, looking at attributes of the Card model, with        
    # tableau_card.deck_card.card;
    # individual loads of related Card models related to tableau_card are done here
  end
  return contains
end

Does it have to do with scope? This tableau_contains method is down a few method calls in a larger loop, where I originally tried doing the eager loading because there are several places where these same objects are looped through and examined. Then I eventually tried the code as it is above, with the load just before the loop, and I'm still seeing the individual SELECT queries for Card inside the tableau_cards loop in the log. I can see the eager-loading query with the IN clause just before the tableau_cards loop as well.

EDIT: additional info below with the larger, outer loop

EDIT2 : corrected loop below with tips from answers

EDIT3 : added more details in loop with goals

Here's the larger loop. It is inside an observer on after_save

def after_save(pa)
  turn = Turn.find(pa.turn_id, :include => :player_actions)
  game = Game.find(turn.game_id, :include => :goals)
  game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ])
  if turn.phase_complete(pa, players)  # calls player.tableau_contains(card)
    for goal in game.goals
      if goal.checks_on_this_phase(pa)
        if goal.is_available(players, pa, turn)
          for player in game.players
            goal.check_if_player_takes(player, turn, pa)
              ... # loop through player.tableau_cards
            end
          end
        end
      end
    end
  end

Here's the relevant code in the turn class:

def phase_complete(phase, players)
  all_players_complete = true
  for player in players
    if(!player_completed_phase(player, phase))
      all_players_complete = false
    end
  end
  return all_players_complete
end

the for player in game.players is doing another query to load the players. It is cached, I mean it has the CACHE label in the log, but I would've thought there would be no query at all because the game.players should already be loaded in memory.

Another snippet from the Goal model:

class Goal < ActiveRecord::Base
  has_many :game_goals
  has_many :games, :through => :game_goals
  has_many :player_goals
  has_many :players, :through => :player_goals

  def check_if_player_takes(player, turn, phase)
    ...
    for tab_card in player.tableau_cards
    ...
  end
end
+1  A: 

Problem number one is: You are resetting the player.tableau.tableau_cards everytime

player.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 

If that is supposed to be a temporary array then you are doing more work than necessary. The following would be better:

temp_tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 

I would also separate the two operations if you are actually trying to set the tableau_cards and do something to them.

player.tableau.tableau_cards = TableauCard.find :all, :include => [ {:deck_card => (:card)}], :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id] 
card.whatever_logic if player.tableau.tableau_cards.include? card

Again, it looks like you are doubling up on the query when you don't need to.

Sixty4Bit
Re: problem # 1 : I don't see how I'm 'resetting' the player.tableau.tableau_cards. They are not loaded before this, so I'm trying to load the tableau_cards along with eager loading their associated deck_cards and cards models one time for each player. Later on, yes I'm loading the player's tableau_cards again in the tableau_contains method. That was when I thought the original player's tableau_cards association had somehow got unloaded or gone out of scope, because I saw the individual cards being loaded in the tableau_contains method.
That's why I added the 2nd Tableau_card.find... in the player.tableau_contains method, to see if that would eliminate the individual Card models from being loaded individually, which it didn't. So the original question remains : in the player.tableau_contains method, why are the individual Card models being loaded, after being included in the TableauCard.find, :include => [ {:deck_card => (:card)}] query?
You can eager load the already DB-stored `tableau_cards` in `player.tableau` with `player.tableau.tableau_cards(:include => {:deck_card => :card})`. What @Sixty is saying is that you're trying to re-set them in the db by assigning to the association in `player.tableau.tableau_cards=` which usually fires off a SQL INSERT for the records which don't match `player.tableau.tableau_cards()`. That in turn could be reloading the association's records. "Loaded" here is a poor choice of words, as you could mean 'stored in the db' or 'need not be lazy-loaded from the db'.
Tim Snowhite
I think you, @Codeman are using 'loaded' and 'set' to mean 'loaded in memory' and @Sixty is using it to mean 'stored in the db'.
Tim Snowhite
Typo: `player.tableau.tableau_cards(:include => {:deck_card => :card})` should have `.all` like: `player.tableau.tableau_cards.all(:include => {:deck_card => :card})`
Tim Snowhite
+1  A: 

What happens if you seperate out the cards = TableauCard.find... call from the player.tableau.tableau_cards = cards call? Perhaps rails is resetting the association's cached records at that point in the code, and reloading the associations after that.

This would also allow you to make sure the same array is being passed into tableau_contains by passing the variable in explicitly.

It appears that you're trying to retain eager-loaded associations across multiple calls to the player.cards.tableau_cards association. I'm not sure if this functionality is possible with the way rails works. I believe it caches the raw data returned from a sql statement, but not the actual array that is returned. So:

  def test_association_identity
   a = player.tableau.tableau_cards.all(
          :include => {:deck_card => :card}) 
          #=> Array with object_id 12345
          # and all the eager loaded deck and card associations set up
   b = player.tableau.tableau_cards 
          #=> Array 320984230 with no eager loaded associations set up. 
          #But no extra sql query since it should be cached.
   assert_equal a.object_id, b.object_id #probably fails 
   a.each{|card| card.deck_card.card}
   puts("shouldn't have fired any sql queries, 
         unless the b call reloaded the association magically.")
   b.each{|card| card.deck_card.card; puts("should fire a query 
                                        for each deck_card and card")}
  end

The only other thing that I can think of to help is to scatter some output throughout the code and see exactly where the lazy loading is happening.

Here's what I mean:

#Observer

def after_save(pa)
  @game = Game.find(turn.game_id, :include => :goals)
  @game.players = Player.find( :all, 
                :include => [ {:tableau => (:tableau_cards)},:player_goals ], 
                :conditions => ['players.game_id =?', @game.id]
  for player in @game.players
    cards = TableauCard.find( :all, 
          :include =>{:deck_card => :card}, 
          :conditions => ['tableau_cards.tableau_id = ?', player.tableau.id])
    logger.error("First load")
    player.tableau.tableau_cards =  cards #See above comments as well.
    # Both sides of this ^ line should always be == since: 
    # Given player.tableau => Tableau(n) then Tableau(n).tableau_cards 
    # will all have tableau_id == n. In other words, if there are 
    # `tableau_cards.`tableau_id = n in the db (as in the find call),
    # then they'll already be found in the tableau.tableau_cards call.
    logger.error("Any second loads?")
    if(tableau_contains(cards,card))
       logger.error("There certainly shouldn't be any loads here.") 
       #so that we're not relying on any additional association calls, 
       #this should at least remove one point of confusion.
    ...
    end
  end
end

#Also in the Observer, for just these purposes (it can be moved back out 
#to Player after the subject problem here is understood better)

def tableau_contains(cards,card_id)
  contains = false
          logger.error("Is this for loop loading the cards?")
  for card in cards
           logger.error("Are they being loaded after `card` is set?")
    # my logic here, looking at attributes of the Card model, with        
    # card.deck_card.card;
    logger.error("What about prior to this call?")
  end
  return contains
end
Tim Snowhite
You're correct in interpreting my intent: I'm trying to read into memory the existing player's tableau cards from the db into memory. Thanks for figuring that out, and explaining how @Sixty and I were interpreting things differently in the comments below his answer. It never crossed my mind that I was updating the existing association. Looks like I'll have to use a temporary array like @Sixty suggests. I think I understand now, but I still find it odd that you can't eager load the existing association and use the player.tableau.tableau_cards reference.
I tried eager-loading the already stored tableau_cards with player.tableau.tableau_cards(:include => {:deck_card => :card}), from your comment below @Sixty's answer; it isn't loading the deck_cards or cards
Pardon, it's `player.tableau.tableau_cards.all(:include => {:deck_card => :card})`.
Tim Snowhite
Well, that worked to eager load the deck_cards and cards in the outer loop, but then in the player.tableau_contains method, when I do for tab_card in self.tableau.tableau_cards, it is re-loading the individual deck_cards and cards again; I can get rid of the eager loading by using a temp array in tableau_contains and looping through that everywhere it is used because it is caching the query and results; however I still don't understand why the player's tableau.tableau_cards isn't being reference in player.tableau_contains
+2  A: 

Try this:

class Game
  has_many :players
end

Change the logic of tableau_contains as follows:

class Player < ActiveRecord::Base
  has_one :tableau
  belongs_to :game

  def tableau_contains(card_id)
    tableau.tableau_cards.any?{|tc| tc.deck_card.card.id == card_id}
  end

end

Change the logic of after_save as follows:

def after_save(turn)
  game = Game.find(turn.game_id, :include => :goals))
  Rails.logger.info("Begin  eager loading..")                
  players = game.players.all(:include => [:player_goals,
            {:tableau => [:tableau_cards=> [:deck_card => [:card]]]} ])
  Rails.logger.info("End  eager loading..")                
  Rails.logger.info("Begin  tableau_contains check..")                
  if players.any?{|player| player.tableau_contains(turn.card_id)}
    # do something..                
  end
  Rails.logger.info("End  tableau_contains check..")                
end

Second line in the after_save method eager loads the data needed to perform the tableau_contains check. The calls such as tableau.tableau_cards and tc.deck_card.card should/will not hit the DB.

Issues in your code:

1) Assigning array to an has_many association

@game.players = Player.find :all, :include => ...

Statement above is not a simple assignment statement. It changes the palyers table rows with the game_id of the given game. I am assuming that is not what you want. If you check the DB table you will notice that the updated_time of the players table rows have changed after assignment.

You have to assign the value to a separate variable as shown in the code sample in after_save method.

2) Hand coded association SQL

Many places in the code you are hand coding the SQL for association data. Rails provides associations for this.

E.g:

tcards= TableauCard.find :all, :include => [ {:deck_card => (:card)}], 
         :conditions => ['tableau_cards.tableau_id = ?', self.tableau.id]

Can be rewritten as:

tcards = tableau.tableau_cards.all(:include => [ {:deck_card => (:card)}])

The tableau_cards cards association on Tableau model constructs the same SQL you have hand coded.

You can further improve the statement above by adding a has_many :through association to Player class.

class Player
  has_one :tableau
  has_many :tableau_cards, :through => :tableau
end

tcards = tableau_cards.all(:include => [ {:deck_card => (:card)}])

Edit 1

I created an application to test this code. It works as expected. Rails runs several SQL to eager load the data, i.e.:

Begin  eager loading..
SELECT * FROM `players` WHERE (`players`.game_id = 1) 
SELECT `tableau`.* FROM `tableau` WHERE (`tableau`.player_id IN (1,2))
SELECT `tableau_cards`.* FROM `tableau_cards` 
          WHERE (`tableau_cards`.tableau_id IN (1,2))
SELECT * FROM `deck_cards` WHERE (`deck_cards`.`id` IN (6,7,8,1,2,3,4,5))
SELECT * FROM `cards` WHERE (`cards`.`id` IN (6,7,8,1,2,3,4,5))
End  eager loading..
Begin  tableau_contains check..
End  tableau_contains check..

I don't see any SQL executed after eager loading the data.

Edit 2

Make the following change to your code.

def after_save(pa)
  turn = Turn.find(pa.turn_id, :include => :player_actions)
  game = Game.find(turn.game_id, :include => :goals)
  players = game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ])
  if turn.phase_complete(pa, game, players)
    for player in game.players
      if(player.tableau_contains(card))
      ...
      end
    end
  end
end
def phase_complete(phase, game, players)
  all_players_complete = true
  for player in players
    if(!player_completed_phase(player, phase))
      all_players_complete = false
    end
  end
  return all_players_complete
end

The caching works as follows:

game.players # cached in the game object
game.players.all # not cached in the game object

players = game.players.all(:include => [:player_goals])
players.first.player_goals # cached

The second statement above results in a custom association query. Hence AR doesn't cache the results. Where as player_goals are cached for every player object in the 3rd statement as they are fetched using standard association SQL.

KandadaBoggu
Do I even need to do the query for players? I've changed the for loop to be simply for player in game.players and it seems to work fine.
The query for `players` eager loads the data objects you need. Subsequent calls to `game.players` returns the cached list. You can combine two statements in to one, but for readability, I wrote two.
KandadaBoggu
I've used `game.players.all(:include => [ :player_goals, {:tableau => [:tableau_cards => [:deck_card => [:card]]]} ])` in the `after_save` and then in the `tableau_contains` I used your `tableau.tableau_cards.any?{|tc| tc.deck_card.card.id == card_id}` and it is still individually loading each deck_card and card in that loop in tableau_contains; I've double-checked to see if the player's tableau or tableau_cards association is being reset anywhere and I couldn't find any, so I still don't know why the deck_cards and cards are being queried individually after the eager load just above it
Updated my answer, take a look
KandadaBoggu
I have been seeing the same queries in the eager-loading operation, that is not the problem. The problem is that the queries are repeated later on, on objects that I thought were in memory. I've added more info in the original post showing one of these places where a query of the same info is being done.
Updated my answer. Take a look.
KandadaBoggu
I'm still unclear as to the caching difference between `game.players` and `game.players.all`, partially due to you saying to use `for player in game.players` in the after_save loop. But I made your change in the `phase_complete` method and that cleared those up. Now the repeated loads in the goals code, which I've added in the `after_save` loop. I have repeatedly looked for places in the `goal` model where the player.tableau_cards could be reset or unloaded in memory and can't find any. BTW thanks very much for your work and responses on this, very much appreciated.
Post your entire model code at http://pastie.org/, I can try to figure out the issue.
KandadaBoggu
It's at http://pastie.org/946688