views:

755

answers:

3

I just started Python and I've got no idea what memoization is and how to use it. Also, may I have a simplified example?

+15  A: 

Memoization effectively refers to remembering ("memoization" -> "memorandum" -> to be remembered) results of method calls based on the method inputs and then returning the remembered result rather than computing the result again. You can think of it as a cache for method results. For further details, see page 339 of Cormen et al., Introduction To Algorithms.

A simple example for computing factorials using memoization in Python would be something like this:

factorial_memo = {}
def factorial(k):
    if k < 2: return 1
    if not k in factorial_memo:
        factorial_memo[k] = k * factorial(k-1)
    return factorial_memo[k]

You can get more complicated and encapsulate the memoization process into a class

class Memoize:
    def __init__(self, f):
        self.f = f
        self.memo = {}
    def __call__(self, *args):
        if not args in self.memo:
            self.memo[args] = self.f(*args)
        return self.memo[args]

Then:

def factorial(k):
    if k < 2: return 1
    return k * factorial(k - 1)

factorial = Memoize(factorial)
Jason
`has_key` shouldn't be used - use `key in some_dict` instead
nosklo
You're right: `in` is more Pythonic than `has_key`. Further, `in` is slightly faster than `has_key` because of method-call overhead in the latter. Lastly, `has_key` was removed in Python 3.x. Thanks for the comment.
Jason
I test your two examples in IPython's %timeit. When I use the dictionary (the first example) most of the time I get the first calls executed faster than the memorized second calls. Could you check in your system too? There is a great speed-ups (up to 40X) when I tested the second example. Also could you tell me how to access self.memo name later when the execution is finished?
Gökhan Sever
OK, please ignore the first part of my question because IPython's timeit makes multiple calls while timing the execution of a function. However the self.memo part is still valid :)
Gökhan Sever
That's a damn good answer. The first bit of Python code explains _everything_. +1.
paxdiablo
+5  A: 

Memoization is keeping the results of expensive calculations and returning the cached result rather than continuously recalculating it.

Here's an example:

def doSomeExpensiveCalculation(self, input):
    if input not in self.cache:
        <do expensive calculation>
        self.cache[input] = result
    return self.cache[input]

A more complete description can be found in the wikipedia entry on memoization.

Bryan Oakley
Hmm, now if that was correct Python, it would rock, but it appears not to be... okay, so "cache" is not a dict? Because if it is, it should be`if input not in self.cache`and`self.cache[input]`(`has_key` is obsolete since... early in the 2.x series, if not 2.0. `self.cache(index)` was never correct. IIRC)
jae
d'oh! Thanks. I've fixed it. I'm still a relative python newbie and still forget the simplest things now and then.
Bryan Oakley
+1 for your one-sentence description – clear and succinct. More so than e.g. the leading sentence in the Wikipedia article currently.
Jonik
+2  A: 

The other answers cover what it is quite well. I'm not repeating that. Just some points that might be useful to you.

Usually, memoisation is an operation you can apply on any function that computes something (expensive) and returns a value. Because of this, it's often implemented as a decorator. The implementation is straightforward and it would be something like this

memoised_function = memoise(actual_function)

or expressed as a decorator

@memoise
def actual_function(arg1, arg2):
   #body
Noufal Ibrahim