A Python function object has an attribute dictionary called func_dict, which is visible from outside the function and is mutable, but which is not modified when the function is called. (I learned this from answers to a question I asked yesterday (#1753232): thanks!) I was reading code (at http://pythonprogramming.jottit.com/functional%5Fprogramming) which memoized the computation of Fibonacci numbers, and thought, "Why not use the func_dict attribute for memoizing?" It worked (see below; the output is at the end of the code). It's a little like having a class property available, but with the initialization code outside the object (in this case, not a class but a function).

I wonder what similar (or dissimilar) tricks can be done using this attribute?

def fib(n):
    if n in fib.cache:
        print "found fib.cache[%d] = %d: " %(n, fib.cache[n])
        return fib.cache[n]
    else:
        print "fib.cache[%d] = fib(%d) + fib(%d)" % (n, n-1, n-2)
        fib.cache[n] = fib(n-1) + fib(n-2)
        print "modified fib.cache: ", fib.cache
        return fib.cache[n]

fib.cache = {0:0, 1:1}

for x in range(7):
    print "==================>", x
    print fib( x)

"""
==================> 0
found fib.cache[0] = 0: 
0
==================> 1
found fib.cache[1] = 1: 
1
==================> 2
fib.cache[2] = fib(1) + fib(0)
found fib.cache[1] = 1: 
found fib.cache[0] = 0: 
modified fib.cache:  {0: 0, 1: 1, 2: 1}
1
==================> 3
fib.cache[3] = fib(2) + fib(1)
found fib.cache[2] = 1: 
found fib.cache[1] = 1: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2}
2
==================> 4
fib.cache[4] = fib(3) + fib(2)
found fib.cache[3] = 2: 
found fib.cache[2] = 1: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
3
==================> 5
fib.cache[5] = fib(4) + fib(3)
found fib.cache[4] = 3: 
found fib.cache[3] = 2: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5}
5
==================> 6
fib.cache[6] = fib(5) + fib(4)
found fib.cache[5] = 5: 
found fib.cache[4] = 3: 
modified fib.cache:  {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 5, 6: 8}
8
"""
+4  A: 

Just be careful: the fib.cache trick only works if fib is indeed the name of the relevant function object in the scope that's active while it's executing. For example, if you ever decorate it, you must assign the starting value for the cache to the decorator wrapper, not to the decorated function -- and if it gets further decorated after that, things break.

This is a bit fragile compared to the standard memoization idiom:

def fib(n, _memo={0:1, 1:1}):
    if n in _memo:
        return _memo[n]
    else:
        _memo[n] = fib(n-1) + fib(n-2)
        return _memo[n]

or the decorator equivalent. The standard idiom's also faster (though not by much) -- putting them both in mem.py under names fib1 (the .cache-trick one, without prints and undecorated) and fib2 (my version), we see...:
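(The "decorator equivalent" mentioned above can be sketched like this; the `memoize` name and the closed-over `cache` dict are illustrative, not from the answer, and this version uses the question's 0-based Fibonacci convention:)

    import functools

    def memoize(func):
        cache = {}  # closed over by the wrapper; one cache per decorated function

        @functools.wraps(func)
        def wrapper(n):
            if n not in cache:
                cache[n] = func(n)
            return cache[n]
        return wrapper

    @memoize
    def fib(n):
        # The recursive calls go through the global name `fib`, which is now
        # bound to the wrapper, so they hit the cache too.
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(20))  # → 6765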

$ python -mtimeit -s'import mem' 'mem.fib1(20)'
1000000 loops, best of 3: 0.754 usec per loop
$ python -mtimeit -s'import mem' 'mem.fib2(20)'
1000000 loops, best of 3: 0.507 usec per loop

so the standard-idiom version saves about 33% of the time. But that's when almost all calls actually hit the memoization cache (which is populated after the first of those million loops). fib2's speed advantage is smaller on cache misses, since it comes from the higher speed of accessing _memo (a local variable) vs fib.cache (a global name, fib, and then an attribute thereof, cache); cache accesses dominate on cache hits (there's nothing else;-), while cache misses add a little extra work (equal for both functions).

Anyway, don't mean to rain on your parade, but when you find some new cool idea be sure to measure it against the "good old way" of doing things, both in terms of "robustness" and performance (if you're caching, presumably you care about performance;-).

Alex Martelli
Hi, Alex. No rain at all. First, I was happy to see I could get it to work at all. Second, I was wondering how to time it, and what to compare it against, not knowing the standard form. So, as usual, you've added lots to my understanding. There's something I don't understand, I see. I thought that the parameters were local variables that had no 'memory' from previous calls, yet here you've got _memo being modified in one call and then being used as modified in subsequent calls. How does that work?
behindthefall
`_memo` isn't a local variable - it's an optional parameter. If you observe the function `def t(n, o=[]): o.append(n); return o` you'll see that `t(1), t(2), t(3)` returns `[1], [1, 2], [1, 2, 3]`. This is because the assignment of optional parameters is done once, at the function's definition, and because `[]` is a list (and is mutable). If you don't want this to happen, you'd have to do `def t(n, o=None): if o == None: o = []; o.append(n); return o` which would produce the expected `[1], [2], [3]` output in the above test. However, this unintuitive behavior can be utilized for memoization.
Chris Lutz
@Chris, arguments **are** local variables -- they live in the local variable namespace, both logically and physically. Default values evaluation at `def`-time (which is unintuitive only to raw Python beginners;-) is useful for many purposes, of which memoization is just one -- forcing early (def-time) binding (while free-variables "of course" are late bound) being the most frequently used one.
Alex Martelli
@Chris Lutz - Holy Cow! Unintuitive behavior of optional parameters: I didn't see that one coming! Thank you VERY much for your explanation. I like it even more than the function dictionary. (What else does Python have hidden under the hood?)
behindthefall
Errr, I didn't know I was a 'raw' Python beginner ... rare tending to medium, maybe? No, seriously, this is stuff that hasn't jumped out at me from the books.
behindthefall
behindthefall, you may enjoy browsing this thread: http://stackoverflow.com/questions/101268/hidden-features-of-python
unutbu
@unutbu - That's practically a book! Does it have a TOC? ;-)
behindthefall
Memoization is the kindest way to learn about how optional arguments are initialized :)
wisty
@behindthefall: the thing about when default function values are evaluated is in the Python tutorial: http://docs.python.org/tutorial/controlflow.html#defining-functions . Perhaps the books don't cover it because it's in the tutorial.
Kylotan
@Kylotan - Thanks for that tip. I have to admit that I haven't gone through the tutorial conscientiously, because I had all these Python books. (But it's probably in there, too, and I just didn't have a mental slot to fit it into at the time ...) Live and learn ...
behindthefall
A: 

I've always used this technique for memoizing. It uses the fact that you can call an object that's not a function, so long as that object has its __call__ method defined. Neither behindthefall's method nor Alex Martelli's had even occurred to me, for whatever reason. I would guess the performance is roughly the same as behindthefall's way, because it works roughly the same way. Maybe a little slower. But as the linked page points out, "the definition for fib() is now the "obvious" one, without the cacheing code obscuring the algorithm", which is kind of nice.
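
(A minimal sketch of the callable-object approach described here; the `Memoized` class name is hypothetical, not from the linked page:)

    class Memoized:
        """Wrap a function in a callable object that caches its results."""

        def __init__(self, func):
            self.func = func
            self.cache = {}

        def __call__(self, *args):
            # The instance is callable, so it can stand in for the function.
            if args not in self.cache:
                self.cache[args] = self.func(*args)
            return self.cache[args]

    @Memoized
    def fib(n):
        # `fib` is now a Memoized instance; recursive calls go through __call__.
        return n if n < 2 else fib(n - 1) + fib(n - 2)

    print(fib(10))  # → 55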

MatrixFrog