views: 44 | answers: 3

I'm building a Rails plugin that extracts a lot of data from Rails apps. It dynamically builds queries that span multiple tables and return a lot of results. Here's an example of one of the queries it builds, for pulling purchase data out of Spree (a Rails shopping cart):

select orders.user_id as from_id, variants.product_id as to_id
from orders, line_items, variants
where orders.user_id is not null
  and variants.product_id is not null
  and orders.id = line_items.order_id
  and line_items.variant_id = variants.id
order by from_id;

The problem is that ActiveRecord loads up all results into memory. Is there a way to avoid that without dropping down to writing DB-specific code? (I'm aware of find_each, but that doesn't allow for sorting.)
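
To make the tradeoff concrete, here's a rough sketch of the two options I know about (Order is one of the Spree models, and purchase_query is assumed to hold the SQL above):

# find_by_sql materializes every returned row as an object in memory at once
pairs = Order.find_by_sql(purchase_query)

# find_each loads records in batches, but it walks the table in primary-key
# order and doesn't accept a custom :order, so the ORDER BY above is lost
Order.find_each(:batch_size => 1000) do |order|
  # handle one record at a time
end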

+2  A: 

It's definitely possible, and here's an article that walks through it:

http://www.rubyinside.com/paginate_by_sql-rails-pagination-on-your-own-sql-queries-50.html

They have code you can copy verbatim; the key is using a SQL COUNT query to work out the total number of rows (and therefore pages). With this code in place, you can call paginate_by_sql instead of find_by_sql and pass in the page and per-page params.
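
Roughly, the idea is to count the rows once and then fetch a single page with LIMIT/OFFSET. A hand-wavy sketch, not the article's exact code (the _sketch name and the count wrapper are just for illustration, and it assumes the SQL has no trailing semicolon):

class ActiveRecord::Base
  def self.paginate_by_sql_sketch(sql, options = {})
    page     = (options[:page] || 1).to_i
    per_page = (options[:per_page] || 30).to_i
    # total row count, used to work out how many pages there are
    total = count_by_sql("select count(*) from (#{sql}) as total")
    # fetch just the requested page
    rows = find_by_sql("#{sql} limit #{per_page} offset #{(page - 1) * per_page}")
    [rows, total]
  end
end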

Jaime Bellmyer
A: 

You may also want to look into Rails ActiveRecord batches to retrieve records in smaller chunks:

http://guides.rubyonrails.org/active_record_querying.html#retrieving-multiple-objects-in-batches
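
For example, a minimal sketch of find_in_batches (both batch methods iterate in primary-key order):

# find_in_batches yields arrays of records instead of one record at a time
Order.find_in_batches(:batch_size => 500) do |batch|
  batch.each do |order|
    # process one order
  end
end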

Max Schulze
find_in_batches is just like find_each in that it doesn't support ordering (or custom SQL).
scotchi
A: 

Jaime's answer led me down a more productive line of searching, landing at will_paginate, which offers a paginate_by_sql method. With a tiny bit of effort, that was wrapped up in a paginated SQL each:

def paginated_sql_each(query, &block)
  page = 1
  begin
    # will_paginate's paginate_by_sql pulls back one page of rows at a time
    results = paginate_by_sql(query, :page => page)
    results.each { |r| block.call(r) }
    page += 1
  end until results.empty? # post-condition loop; the first empty page stops it
end
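
Hypothetical usage, assuming the helper ends up as a class method on the model (paginate_by_sql is a will_paginate class method) and purchase_query holds the SQL from the question:

Order.paginated_sql_each(purchase_query) do |row|
  # each row exposes the selected columns as attributes
  puts "#{row.from_id} -> #{row.to_id}"
end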
scotchi