The idea is to run analytics on anywhere from 30 to 2,000 products out of a collection of 80,000 products.
Say I want the products with the highest number of pageviews within one category, and that category might contain only 30 or as many as 2,000 of the 80,000 products. I can either filter down to those products first and then use map/reduce to total the pageviews, or use map/reduce to total the pageviews for all 80,000 products and then filter out the 30 to 2,000 I care about.
Initially, it looks like it may be better to do the filtering first, because the map/reduce will then take a lot less time.
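To make the comparison concrete, here is a rough sketch of the filter-first variant. The model names (`Product`, `Pageview`), the fields (`category_id`, `product_id`, `views`), and the use of Mongoid's criteria `map_reduce` are assumptions for illustration, not working code from my app:

```ruby
# Filter-first sketch (assumed Mongoid 3-style API, made-up models/fields):
# restrict to the category's products, then map/reduce only their pageviews.
def top_products_in_category(category_id, limit = 10)
  map = <<-JS
    function() { emit(this.product_id, this.views); }
  JS
  reduce = <<-JS
    function(key, values) { return Array.sum(values); }
  JS

  product_ids = Product.where(category_id: category_id).pluck(:_id)

  totals = Pageview.where(:product_id.in => product_ids)
                   .map_reduce(map, reduce)
                   .out(inline: 1)

  # Each result looks like { "_id" => product_id, "value" => total_views }.
  totals.sort_by { |doc| -doc["value"] }.first(limit)
end
```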
But how can MongoDB's Ruby driver or Mongoid filter down to those 30 or 2,000 products? They all have different IDs, so do I build a condition such as `cond = {:id => 123456}`, keep calling `cond.merge!({:id => 876543})` for all 2,000 products, and then pass it to `where()`?
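In other words, something along these lines, which is just a sketch of what I'm picturing rather than code I expect to work (and merging the same `:id` key repeatedly would simply overwrite the previous value anyway):

```ruby
# What I'm picturing (probably not the right idiom): build one conditions
# hash per category, one product at a time, and hand it to where().
product_ids = [123456, 876543]   # ...up to 2,000 ids for the category

cond = {}
product_ids.each do |pid|
  cond.merge!(:id => pid)   # same key, so each merge! overwrites the last
end
Product.where(cond)          # would end up matching only the final id

# Presumably there is an "in"-style condition for this instead,
# something like Product.where(:_id.in => product_ids)?
```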
Another way is to do a map/reduce over all 80,000 products, get back an array, and then use Ruby's `select` or `reject` to keep only the items whose IDs are in the array of product IDs for that category.
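A sketch of that second route, again with hypothetical model and field names:

```ruby
require 'set'

# Map/reduce the pageviews for all 80,000 products, then keep only the
# results whose _id appears in the category's list of product ids.
def top_products_via_full_map_reduce(category_id, limit = 10)
  map    = "function() { emit(this.product_id, this.views); }"
  reduce = "function(key, values) { return Array.sum(values); }"

  all_totals = Pageview.all.map_reduce(map, reduce).out(inline: 1).to_a

  category_ids = Product.where(category_id: category_id).pluck(:_id).to_set

  in_category = all_totals.select { |doc| category_ids.include?(doc["_id"]) }
  in_category.sort_by { |doc| -doc["value"] }.first(limit)
end
```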
If this were SQL, it could be done with something like a sub-query.