Hey guys, in the following code:

  def process(batch_size=1000)
    claim_numbers.each_slice(batch_size) do |numbers_batch|
      claims = Claim.find(:all, :conditions => ["claim_number in (?)", numbers_batch])
      # do something with claims
    end
  end

In one of my Rails models I'm processing a lot of claim_numbers. I'm simulating a find_in_batches method so that I don't load a lot of records into memory at once. My question is: in terms of memory, what happens to the claims variable in each iteration? When does Ruby's GC release that portion of memory?

Any help and tips would be appreciated, thanks in advance!

Update: Using Ruby 1.8.7-p72

+1  A: 

Ruby will release the memory as soon as the GC runs. Since claims is scoped inside the each_slice block, nothing outside the block references it, and when claims is reassigned (on the next iteration), the previously assigned objects become unreferenced. The memory of each object is retained until the GC kicks in. How often the GC runs can be specified with some environment variables (more info at http://blog.evanweaver.com/articles/2009/04/09/ruby-gc-tuning/).

If for some reason you retain the objects (because there is still a reference to them, e.g. you put the objects in an array or hash), the memory for those objects is not released. If you monitor your app, you will see increasing memory usage, but also increasing CPU usage, since Ruby's GC is non-generational, which means that it goes over all objects, every time, to see if they can be collected.
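To make the difference concrete, here is a rough sketch of both cases. It is not the asker's real code: the Claim struct and the load_claims helper are hypothetical stand-ins for the ActiveRecord model and the Claim.find call, so that the snippet runs without Rails.

```ruby
# Hypothetical stand-in for the ActiveRecord model (assumption, not the real class).
Claim = Struct.new(:claim_number)

# Hypothetical replacement for Claim.find(:all, :conditions => ...).
def load_claims(numbers_batch)
  numbers_batch.map { |n| Claim.new(n) }
end

# Case 1: claims is only referenced inside the block. After each iteration
# the previous batch is unreferenced and becomes eligible for collection
# the next time the GC runs.
def process_without_retaining(claim_numbers, batch_size = 1000)
  processed = 0
  claim_numbers.each_slice(batch_size) do |numbers_batch|
    claims = load_claims(numbers_batch)
    processed += claims.size # only a count survives the iteration
  end
  processed
end

# Case 2: every batch is appended to an array that outlives the block,
# so all Claim objects stay referenced and cannot be collected.
def process_and_retain(claim_numbers, batch_size = 1000)
  retained = []
  claim_numbers.each_slice(batch_size) do |numbers_batch|
    claims = load_claims(numbers_batch)
    retained.concat(claims) # keeps a reference to every object
  end
  retained
end
```

In the first method at most one batch needs to be reachable at a time. In the second, the retained array grows with every batch, which is the situation where you would see memory usage climb.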

Wouter de Bie
Thanks! Gotta check that blog post! What techniques/tools do you use to monitor your app?
jpemberthy
We've switched to JRuby some time ago and this now allows us to use some nice Java monitoring tools (like JConsole and VisualGC). Moving to JRuby also allowed us to tune GC settings in a much better way.
Wouter de Bie