I have a collection of geography related classes: State, Msa, County, City, etc. The classes all descend from a base Location class. The classes are related, for the most part, through a denormalized join table called geography. So I have...
class Geography < ActiveRecord::Base
belongs_to :state
belongs_to :county
belongs_to :city
# ... etc.
end
class Location < ActiveRecord::Base
end
class State < Location
has_many :geographies
has_many :msas, :through => :geographies, :uniq => true
# ... etc.
end
class Msa < Location
has_many :geographies
has_many :states, :through => :geographies, :uniq => true
# ... etc.
end
Now, when I run the following from the console:
>> msas = Msa.find(:all, :include=>"states", :conditions=>{"states_locations"=>{"id"=>"1"}})
I get back the correct number of results (13 in this case). However, running the SQL that this find call produces I get back 1,000s of results (again the geography table is a datamart of sorts which is why I'm using the :uniq option on the association).
SELECT `locations`.`id` AS t0_r0,
`locations`.`parent_id` AS t0_r1,
`locations`.`type` AS t0_r2,
`locations`.`name` AS t0_r3,
`states_locations`.`id` AS t1_r0,
`states_locations`.`parent_id` AS t1_r1,
`states_locations`.`type` AS t1_r2,
`states_locations`.`name` AS t1_r3
FROM `locations`
LEFT OUTER JOIN `geography`
ON `locations`.`id` = `geography`.`msa_id`
LEFT OUTER JOIN `locations` states_locations
ON `states_locations`.`id` = `geography`.`state_id`
AND `states_locations`.`type` = 'State'
WHERE `states_locations`.`id` = '1'
AND `locations`.`type` = 'Msa'
I assume this means that Rails is loading 1,000s of records into memory AND THEN, in Ruby, cutting down the results to the distinct set of Msas (in this case); seems a bit inefficient. Furthermore, the following subsequent calls return varying results:
>> msas.first.states.size # incorrect count
=> 192
>> msas.first.states.count # correct count
=> 1
>> msas.first.states # incorrect number of State objects
=> [#<State id: 1, ... >, ..., #<State id: 1, ... >]
>> msas.first.reload.states
=> [#<State id: 1, ... >] # correct number of State objects
My questions are:
- Why isn't Rails using DISTINCT in the query it produces from the find call? I'm guessing it's because I've asked it to :include => :states. Should I use :joins instead?
- Why is Rails returning non-unique results when calling msas.first.states? Shouldn't the association having a :uniq => true enforce uniqueness on the results?
- Why do I need to use the table alias that Rails uses for the states "version" of the locations table, i.e., :conditions => { :states_locations => { :id => 1 }}? Rails doesn't seem to understand :include => :states, :conditions => { :states => { :id => 1 }}. Is there a way to deterministically predict the table alias?
Any insights would be much appreciated.
Thanks in advance, Jason