views:

55

answers:

4

I have a couple of models in a Rails app. Let's say it's Products and Sales.

A product has many sales, and a sale belongs to a product.

What I am looking to do is get a list of all products (possibly with conditions) and next to each product provide a count of the number of sales, similar to:

  • Product 1 (10)
  • Product 2 (22)

Whilst the output is fairly straightforward, I cannot figure out a nice way to gather the data without using SQL.

I should also add that I may want to apply a condition to the sales too.

+1  A: 

If you are properly declaring the associations, this is as simple as the following:

Product.find(:all, :include => :sales).each do |product|
  puts product.name + " (#{product.sales.size})"
end

EDIT:
Associations are rich collections that can be searched on if you need to apply a condition:

@products = Product.find(:all, :include => :sales)
@products.each do |product|
  # Find product sales only from yesterday
  yesterdays_sales = product.sales.find(:all, :conditions => { :created_at => Date.yesterday }
end
Josh Lindsey
And here you have `N+1` issue.
Dmytrii Nagirniak
There's no reason to optimize prematurely. Associations are The Rails Way and will be familiar to any Rails dev who looks at his code. He can always optimize later if this becomes a bottleneck.
Josh Lindsey
It is not a premature optimization. Just use `:include` or `:join`. You don't spend any time for that.Why should we write bad code in advance knowing it is bad if we can write better code with exactly the same effort now.
Dmytrii Nagirniak
You're right, I should have used `:include` in these. I've edited the post to account for this.
Josh Lindsey
Dmytrii Nagirniak
A: 

If you do not mind to bring the related Sales objects from the database into the application then something like this should do the job:

But keep an eye on it.

# Preload all the Sales for our Products, sow we won't end up with `N+1` issue.
selected_products = Product.all :joins => :sales, :conditions => {'orders.successful' => true}

# Then use it like a normal association:
selected_products.each { |p| puts p.sales.size }

Regards,
Dmitriy.

Dmytrii Nagirniak
Your solution has the N+1 issue. The `Product.all` call uses the `join` table in the `WHERE` condition. In your subsequent calls to p.sales a new query will be submitted to the database to get the Sales associated with the product. I have given detailed reply here: (http://stackoverflow.com/questions/2338605/getting-a-rails-model-with-statistics/2339708#2339708).
KandadaBoggu
@KandadaBoggu, Thanks a lot for that. I believed the `sales.size` would not execute a query as it is supposed to be loaded with `:join`.Need to go and check it out.
Dmytrii Nagirniak
No worries. Because of you I was able to confirm the suspicion I had all along about `joins`.
KandadaBoggu
+1  A: 

If you are mostly interested in counting sales per product you should go with counter_cache feature. This will ensure that you are always getting the count from the products table instead of JIT count calculations.

# products table should have a column called sales_count
class Product < ActiveRecord:Base
  has_many :sales
end

class Sales < ActiveRecord:Base
  belongs_to :product, :counter_cache => true
end

The rails will take care of incrementing/decrementing the sales_count column upon creation/deletion of sales. Now you can get the sales count by doing the following:

Product.first.sales_count

To conditionally count the sales on a single product object do the following:

Product.first.sales.count(:conditions => ["amount > ? ", 200])

To conditionally count a batch of products do the following:

#returns a ordered hash
product_sales_count = Product.count(:joins => [:sales], :conditions => 
            ["amount > ? ", 200], :group =>"products.id").each do |product_id, sales_count|
   p "Product #{product_id} = #{sales_count}"
end

# If you need the product objects get all of them in one call and match the count
products = Product.find(:all, :joins => [:sales], :conditions => 
             ["amount > ? ", 200], :group =>"products.id").each do |product|

   p "Product #{product.name} = #{product_sales_count[product.id]}"

end

This approach will save lot of trips to the database.

KandadaBoggu
`Product.first.sales.count(:conditions => ["amount > ? ", 200])` will execute a query per product. So what is the point of having :counter_cache if there is a need for conditional counts?
Dmytrii Nagirniak
It ends up with `N+1`.
Dmytrii Nagirniak
As I mentioned in my answer, he should choose counter_cache ONLY if he is mostly counting the sales without conditions. I have amended my answer to address the N+1 issue for the conditional scenario. Again it depends upon his use case.
KandadaBoggu
+1  A: 

This is not an answer but comparison of three answers for this question. I have already given my answer. There is confusion w.r.t the working of :joins and :include in ActiveRecord.find. So I spent some time analyzing the SQL log for three solutions.

Approach 1: Get the sales count using count

  #To return the sales count for each product
  ps_count_hash = Product.count(:joins => [:sales], :group => "products.id") # sql 1.1

  # To print the product id and sales count
  ps_count_hash.each do | product_id, sales_count|
   p "Product#{product_id} - (#{sales_count})"
  end

  # To print the product details and sales count
  # get all the products associated 
  Product.find_all_by_id(ps_count_hash.keys).each |product|  #sql 1.2
   p "Product[id = #{product.id}, name = #{product.name}] - (#{ps_count_hash[product.id]})"
  end

Approach 2: Get the products through a join

  Product.find(:all, :joins=>[:sales]).each do |product| #sql 2.1
   p "Product[id = #{product.id}, name = #{product.name}] - (#{product.sales.size})" # sql 2.2 - 2.(2+N)
  end

Approach 3: Get the products through a include

  Product.find(:all, :include=>[:sales]).each do |product| #sql 3.1 and 3.2
   p "Product[id = #{product.id}, name = #{product.name}] - (#{product.sales.size})" 
  end

Now lets look at the SQL statements generated by these three approaches

SQL statements for Approach 1 - 2 SQL

SELECT count(*) AS count_all, products.id AS products_id FROM `products` INNER JOIN `sales` ON sales.product_id = products.id GROUP BY products.id 
SELECT `products`.* FROM `products` WHERE (`products`.`id` IN (1,2))

SQL statements for Approach 2 - 2 SQL

SELECT * FROM `products` 
SELECT `sales`.* FROM `sales` WHERE (`sales`.product_id IN (1,2)) 

SQL statements for Approach 3 - N + 1 SQL

SELECT `products`.* FROM `products` INNER JOIN `sales` ON sales.product_id = products.id 
SELECT * FROM `sales` WHERE (`sales`.product_id = 1) 
SELECT * FROM `sales` WHERE (`sales`.product_id = 2) 

Best approach for counting product by sales ( with out sales details): Approach 1

Best approach for counting product by sales ( with sales details): Approach 2

Approach 3 has the N+1 problem. So it is out of contention.

KandadaBoggu