I have a User model and a Cd model connected through a join table, 'cds_users'. I'm trying to return a hash of users, each mapped to the cds they have in common with the original user.

@user.users_with_similar_cds(1,4,5)
# => {:bob => [4], :tim => [1,5]}

Is there a better/faster way of doing this without looping so much? Maybe a more direct way?

def users_with_similar_cds(*args)
  similar_users = {}
  Cd.find(:all, :conditions => ["cds.id IN (?)", args]).each do |cd|
    cd.users.find(:all, :conditions => ["users.id != ?", self.id]).each do |user|
      if similar_users[user.name]
        similar_users[user.name] << cd.id
      else
        similar_users[user.name] = [cd.id]
      end
    end
  end
  similar_users
end

[addition]

Taking the join model idea, I could do something like this. I'll call the model 'joined'.

def users_with_similar_cds(*args)
  similar_users = {}
  Joined.find(:all, :conditions => ["user_id != ? AND cd_id IN (?)", self.id, args]).each do |joined|
    if similar_users[joined.user_id]
      similar_users[joined.user_id] << joined.cd_id
    else
      similar_users[joined.user_id] = [joined.cd_id]
    end
  end
  similar_users
end

Would this be the fastest way on large data sets?
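
As a side note on the accumulation loop itself: the if/else that creates or appends to each array can probably be folded into a Hash default block. The sketch below is plain Ruby with no database access; the `rows` array is a made-up stand-in for the (user_id, cd_id) pairs the join-model query would return.

```ruby
# Hypothetical stand-in for the join rows: each pair is [user_id, cd_id].
rows = [[2, 4], [3, 1], [3, 5]]

# The default block creates the empty array the first time a key is read,
# so no if/else is needed while accumulating.
similar_users = Hash.new { |hash, key| hash[key] = [] }
rows.each { |user_id, cd_id| similar_users[user_id] << cd_id }

p similar_users
```

This only tidies the Ruby side; the number of queries is unchanged.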

A: 

Yes, you can, with only two selects:

First, create a join model named CdUser (use has_many :through).

# first select: the join rows for the requested cds
cd_users = CdUser.find(:all, :conditions => ["cd_id IN (?)", args])
cd_users_by_cd_id = cd_users.group_by { |cd_user| cd_user.cd_id }

user_ids = cd_users.collect { |cd_user| cd_user.user_id }.uniq
# second select: every user referenced by those join rows, keyed by id
users_by_id = User.find_all_by_id(user_ids).index_by { |user| user.id }

# build a hash of cd_id => users who have that cd
result_hash = {}
cd_users_by_cd_id.each do |cd_id, cd_users_for_cd|
  result_hash[cd_id] = cd_users_for_cd.collect { |cd_user| users_by_id[cd_user.user_id] }
end

This is just an idea; I haven't tested it :)
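
To get the shape the question asks for (user name => cd ids, without the original user), the two result sets could be combined roughly like this. It is a plain-Ruby sketch: `CdUserRow`, `UserRow`, `current_user_id` and the sample data are all hypothetical stand-ins for the records the two selects would return, not real ActiveRecord objects.

```ruby
# Hypothetical stand-ins for the rows returned by the two selects.
CdUserRow = Struct.new(:cd_id, :user_id)
UserRow   = Struct.new(:id, :name)

current_user_id = 9 # assumed id of the original user
cd_users = [CdUserRow.new(4, 2), CdUserRow.new(1, 3),
            CdUserRow.new(5, 3), CdUserRow.new(1, 9)]
users = [UserRow.new(2, "bob"), UserRow.new(3, "tim"), UserRow.new(9, "me")]

# Index users by id once so each join row is resolved with a hash lookup.
users_by_id = users.each_with_object({}) { |user, index| index[user.id] = user }

similar_users = Hash.new { |hash, key| hash[key] = [] }
cd_users.each do |cd_user|
  next if cd_user.user_id == current_user_id # skip the original user
  name = users_by_id[cd_user.user_id].name.to_sym
  similar_users[name] << cd_user.cd_id
end

p similar_users
```

The original user could also be excluded in the first select's conditions rather than in Ruby.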

FYI: http://railscasts.com/episodes/47-two-many-to-many

Vlad Zloteanu
+1  A: 

You could use find_by_sql on the User model, and Active Record will dynamically add methods for any extra fields returned by the query. For example:

similar_cds = {}
# join each user to their cds_users rows; cd_ids comes back as a comma-separated string
peeps = User.find_by_sql("SELECT users.*, GROUP_CONCAT(cds_users.cd_id) AS cd_ids FROM users INNER JOIN cds_users ON cds_users.user_id = users.id GROUP BY users.id")
peeps.each { |peep| similar_cds[peep.name] = peep.cd_ids.split(',') }

I haven't tested this code, and this particular query will only work if your database supports group_concat (e.g., MySQL or SQLite), but you should be able to do something similar with whatever database you use; PostgreSQL has string_agg and Oracle has LISTAGG, for example.
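
One detail worth checking: group_concat hands back a single comma-separated string, so splitting it yields strings rather than integer ids. A small plain-Ruby sketch of the conversion, where the `rows` pairs are made-up stand-ins for each record's name and cd_ids value:

```ruby
# Hypothetical rows: a user name plus the comma-separated string
# that a GROUP_CONCAT column would contain.
rows = [["bob", "4"], ["tim", "1,5"]]

similar_cds = rows.each_with_object({}) do |(name, cd_ids), result|
  # split gives ["1", "5"]; map(&:to_i) turns that into [1, 5]
  result[name.to_sym] = cd_ids.split(",").map(&:to_i)
end

p similar_cds
```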

John Hyland
In my tests, this hasn't been faster. Maybe on larger data sets it would be.
MediaJunkie
Hrm. It seems likely that it would be: in the original code, you do one query to get the cds and then another query for each cd found. The find_by_sql variant does the whole shebang in one query, though it is a join with a group_concat. If your dataset is small, the fact that single-table queries are faster than joins might even things out, but once you are checking a lot of cds, the overhead of all those queries will probably bog you down. You might also check to make sure you have indexes on all the foo_id columns.
John Hyland