In mid-July 2008, memoization was added to Rails core. A demonstration of the usage is here.

I have not been able to find any good examples of when methods should be memoized, or of the performance implications of doing so. This blog post, for example, suggests that memoization often should not be used at all.

For something that could potentially have tremendous performance implications, there seem to be few resources that go beyond providing a simple tutorial.

Has anyone seen memoization used in their own projects? What factors would make you consider memoizing a method?

Edit:

After doing some more research on my own I found that memoization is used a remarkable number of times inside of Rails core.

Here's an example: http://github.com/rails/rails/blob/1182658e767d2db4a46faed35f0b1075c5dd9a88/actionpack/lib/action_view/template.rb.

This usage seems to go against the findings of the blog post above that found memoization can hurt performance.

+3  A: 

When a method fetches data from multiple tables and performs some calculations before returning the resulting object, and this method is called multiple times per request, memoization might make sense.

Remember that query caching is also active, so only memoize methods which perform in-Ruby calculations, not pure database fetches.
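
A minimal sketch of the kind of method this describes, using plain Ruby's `||=` idiom rather than the Rails `memoize` helper. The `OrderSummary` class and its method names are hypothetical, made up for illustration; the array stands in for data that would already have been fetched.

```ruby
# Hypothetical class for illustration; not part of Rails.
class OrderSummary
  attr_reader :calculations

  def initialize(line_totals)
    # Frozen copy, so the cached result cannot go stale.
    @line_totals = line_totals.dup.freeze
    @calculations = 0
  end

  # The in-Ruby calculation runs on the first call only;
  # later calls return the cached value.
  def grand_total
    @grand_total ||= begin
      @calculations += 1
      @line_totals.reduce(0, :+)
    end
  end
end

summary = OrderSummary.new([10, 20, 30])
summary.grand_total
summary.grand_total # served from @grand_total, no recalculation
```

Note that `||=` recomputes whenever the cached value is nil or false, so this idiom only suits methods that return truthy values.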

laust.rud
Would building ActiveRecord objects count as calculation? As I understand it, the query cache only caches the MySQL result set and not the objects that are created from it (where the creation process often takes longer than the query itself).
Gdeglin
As far as I know, the query cache stores the actual ActiveRecord objects.
laust.rud
+7  A: 

I think many Rails developers don't fully understand what memoization does and how it works. I've seen it applied to methods that return lazy-loaded collections (like a Sequel dataset), or applied to methods that take no arguments but calculate something based on instance variables. In the first case the memoization is nothing but overhead, and in the second it's a source of nasty and hard-to-track-down bugs.
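
The second case can be sketched as follows. This is a made-up example (the `Greeter` class is hypothetical) using the plain `||=` idiom, but the same staleness applies to any cache that ignores the instance state a method depends on.

```ruby
# Hypothetical class for illustration only.
class Greeter
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  # Memoized, but the result depends on @name, which can change later.
  def greeting
    @greeting ||= "Hello, #{@name}"
  end
end

greeter = Greeter.new("Alice")
first = greeter.greeting
greeter.name = "Bob"
stale = greeter.greeting # still the value computed for "Alice"
```

Nothing fails here, which is what makes the bug hard to find: the method simply keeps returning an answer that was correct at some earlier point.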

I would not apply memoization if

  • the returned value is merely slightly expensive to calculate. It would have to be very expensive, and not further optimizable, for it to be worth memoization.
  • the returned value is or could be lazy loaded
  • the method is not a pure function. A pure function is guaranteed to return exactly the same value for the same arguments, and uses only its arguments, or other pure functions, to do its work. Using instance variables, or calling methods that in turn use instance variables, means that the method could return different results for the same arguments.

There are other situations too where memoization isn't appropriate, such as the one in the question and the answers above, but these are three that I think aren't as obvious.

The last item is probably the most important: memoization caches a result based on the arguments to the method. If the method looks like either of these, it cannot be memoized:

def unmemoizable1(name)
  "%s was here %s" % [name, Time.now.strftime('%Y-%m-%d')]
end

def unmemoizable2
  find_by_shoe_size(@size)
end

Both can, however, be rewritten to take advantage of memoization (although in these two cases it should obviously not be done for other reasons):

def unmemoizable1(name)
  memoizable1(name, Time.now.strftime('%Y-%m-%d'))
end

def memoizable1(name, time)
  "#{name} was here #{time}"
end
memoize :memoizable1

def unmemoizable2
  memoizable2(@size)
end

def memoizable2(size)
  find_by_shoe_size(size)
end
memoize :memoizable2

(assuming that find_by_shoe_size doesn't have, or rely on, any side effects)

The trick is to extract a pure function from the method and apply memoization to that instead.
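
For readers who want to try this outside Rails, here is a rough plain-Ruby sketch of the same extraction, with a Hash keyed on the argument list standing in for the `memoize` helper. The `Visits` class, `format_note`, and the call counter are hypothetical names added for illustration.

```ruby
# Hypothetical class; the Hash cache is a plain-Ruby stand-in
# for a memoize helper, not how Rails implements it.
class Visits
  attr_reader :format_calls

  def initialize
    @cache = {}
    @format_calls = 0
  end

  # Impure wrapper: reads the clock, then delegates to the pure method.
  def visit_note(name)
    format_note(name, Time.now.strftime('%Y-%m-%d'))
  end

  # Pure: the result depends only on the arguments,
  # so it is safe to cache one result per argument list.
  def format_note(name, date)
    key = [name, date]
    return @cache[key] if @cache.key?(key)

    @format_calls += 1
    @cache[key] = "#{name} was here #{date}"
  end
end

visits = Visits.new
visits.format_note("Theo", "2008-07-15")
visits.format_note("Theo", "2008-07-15") # cache hit, body not run again
visits.format_note("Theo", "2008-07-16") # different arguments, new entry
```

Because the date is now an argument, a new day produces a new cache key instead of a stale result. The cache also grows with every distinct argument list, which is one more cost to weigh before memoizing.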

Theo