I have two models associated via HABTM (actually using has_many :through on both ends, along with a join table). I need to retrieve all ModelAs that are associated with BOTH of two ModelBs. I do NOT want all ModelAs for ModelB_1 concatenated with all ModelAs for ModelB_2. I literally want all ModelAs that are associated with BOTH ModelB_1 and ModelB_2. It is not limited to only 2 ModelBs; it may be up to 50 ModelBs, so this must scale.

I can describe the problem using a variety of analogies, which I think describe it better than the previous paragraph:

* Find all books that were written by all 3 authors together.
* Find all movies that had the following 4 actors in them.
* Find all blog posts that belong to BOTH the Rails and Ruby categories.
* Find all users that have all 5 of the following tags: funny, thirsty, smart, thoughtful, and quick. (silly example!)
* Find all people that have worked in both San Francisco AND San Jose AND New York AND Paris in their lifetimes.

I've thought of a variety of ways to accomplish this, but they're grossly inefficient and very frowned upon.

Taking an analogy above, say the last one, you could query for all the people in each city, then find the items that exist in every resulting array. That's a minimum of 4 queries (one per city), all the data from those queries transferred back to the app, and then the app has to intensively compare all 4 arrays to each other (loops galore!). That's nasty, right?
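
For reference, the in-app intersection described above might look roughly like this in plain Ruby. It is only a sketch: the `people_ids_by_city` hash is a hypothetical stand-in for the results of one query per city, and its contents are made up for illustration. Ruby's `Array#&` hides the loops, but every id list still has to be fetched from the database first, which is the real cost.

```ruby
# Hypothetical data: each value stands in for the ids returned by one
# "people who worked in this city" query.
people_ids_by_city = {
  "San Francisco" => [1, 2, 3, 5],
  "San Jose"      => [2, 3, 5, 8],
  "New York"      => [3, 5, 13],
  "Paris"         => [5, 3, 21]
}

# Intersect all id lists; Array#& keeps only elements present in both operands.
people_in_all_cities = people_ids_by_city.values.reduce(:&)

puts people_in_all_cities.inspect
```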

Another possible solution would be to chain the finds on top of each other, which would essentially do the same as above, but it won't eliminate the multiple queries and processing. Also, how would you build the chain dynamically if you had user-submitted checkboxes or values that could run as high as 50 options? Seems dirty. You'd need a loop. And again, that would lengthen the search.

Obviously, if possible, we'd like to have the database perform this for us, so people have suggested that I simply put multiple conditions in. Unfortunately, with HABTM you can typically only express an OR that way.

Another solution I've run across is to use a search engine, like Sphinx or UltraSphinx. For my particular situation, I feel this is overkill, and I'd rather avoid it. I still feel there should be a solution that lets a user craft a query for an arbitrary number of ModelBs and find all ModelAs associated with every one of them.

How would you solve this problem?

+1  A: 

You may do this:

  1. Build a query from ModelA, joining ModelB (through the join model), and filter on the ModelBs you are looking for, combined with OR (i.e. where ModelB = 'ModelB_1' or ModelB = 'ModelB_2'). The result set will contain multiple ModelA rows: exactly one row for each ModelB condition satisfied.

  2. Add a GROUP BY clause on the ModelA columns you need (even all of them if you wish). The count(*) for each group then equals the number of ModelB conditions satisfied*.

  3. Add a HAVING condition that selects only the groups whose count(*) equals the number of ModelB conditions you need satisfied.

example:

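# ids of the ModelBs that every returned ModelA must be associated with
model_bs_to_find = [100, 200]
ModelA.all( :joins=>{:model_a_to_b=>:model_bs},   # join through the join model
            :group=>"model_as.id",                # step 2: one group per ModelA
            :select=>"model_as.*",
            :conditions=>["model_bs.id in (?)", model_bs_to_find],  # step 1: the OR / IN filter
            :having=>"count(*)=#{model_bs_to_find.size}")           # step 3: every ModelB matched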

N.B. The group and select parameters specified in that way will work in MySQL. The standard SQL way to do this would be to put the whole list of model_as columns in both the group and select parameters.
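
To see why comparing the count picks out the right rows, the three steps can be simulated in plain Ruby on a handful of join rows. This is a sketch of the logic, not what ActiveRecord or the database actually runs: the `join_rows` data is invented for illustration, and it assumes the join table holds at most one row per (ModelA, ModelB) pair. With duplicate rows the count would overshoot.

```ruby
# Hypothetical join-table rows as [model_a_id, model_b_id] pairs.
join_rows = [
  [1, 100], [1, 200],
  [2, 100],
  [3, 100], [3, 200], [3, 300],
  [4, 300]
]

model_bs_to_find = [100, 200]

# Step 1: keep only rows whose ModelB is one of the wanted ones (the IN / OR filter).
matching = join_rows.select { |_a_id, b_id| model_bs_to_find.include?(b_id) }

# Step 2: group the surviving rows by ModelA id and count each group.
counts = matching.group_by { |a_id, _b_id| a_id }
                 .map { |a_id, rows| [a_id, rows.size] }
                 .to_h

# Step 3: keep only groups whose count equals the number of wanted ModelBs (the HAVING).
model_a_ids = counts.select { |_a_id, count| count == model_bs_to_find.size }.keys

puts model_a_ids.inspect
```

ModelA 2 passes the filter but matches only one of the two wanted ModelBs, so the final count comparison drops it.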

LucaM
Amazing! It works great. Adding the GROUP BY and HAVING in this fashion to the SQL query worked perfectly. I'll be testing the Ruby way illustrated above in a short while.
Kevin Elliott
Is there any way to reference the join table in the :joins statement without using a symbol for a model reference? My join table exists in the database, but it is not a model (since I'm using HABTM rather than has_many :through).
Kevin Elliott
:joins => :model_bs seemed to work well!
Kevin Elliott