I first ran into this when trying to determine the relative performance of two generators:

t = timeit.repeat('g.get()', setup='g = my_generator()')

So I dug into the timeit module and found that the setup and statement are evaluated with their own private, initially empty namespaces, so naturally the binding of g never becomes accessible to the g.get() statement. The obvious solution is to wrap them in a class, thus adding to the global namespace.

I bumped into this again when attempting, in another project, to use the multiprocessing module to divide a task among workers. I even bundled everything nicely into a class but unfortunately the call

pool.apply_async(runmc, arg) 

fails with a PicklingError because buried inside the work object that runmc instantiates is (effectively) an assignment:

self.predicate = lambda x, y: x > y

so, understandably, the whole object can't be pickled; and whereas:

import pickle

def foo(x, y):
    return x > y
pickle.dumps(foo)

is fine, the sequence

bar = lambda x, y: x > y
pickle.dumps(bar)

fails: callable(bar) returns True and type(bar) reports a function, but pickle complains that it Can't pickle <function <lambda> at 0xb759b764>: it's not found as __main__.<lambda>.

I've given only code fragments because I can easily fix these cases by merely pulling them out into module or object level defs. The bug here appears to be in my understanding of the semantics of namespace use in general. If the nature of the language requires that I create more def statements I'll happily do so; I fear that I'm missing an essential concept though. Why is there such a strong reliance on the global namespace? Or, what am I failing to understand?

Namespaces are one honking great idea -- let's do more of those!

+3  A: 

The pickle protocol(s) would have a serious problem pickling classes and functions in the most general case; pickling them "by name" instead makes the difficulty go away, but ends up requiring that they be bound to (and recoverable by) names that are top-level in a module (which, since a module is its own namespace, doesn't conflict with "namespaces are one honking great idea", after all;-).
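To make the "by name" behaviour concrete, here is a minimal sketch in Python 3 syntax. It uses operator.gt from the standard library as a picklable stand-in for the question's lambda; that substitution is my assumption, not something the original code used. The pickled payload holds only a reference to the function's name, so loading it hands back the very same object, while the lambda, whose only name is "<lambda>", cannot be looked up and is rejected.

```python
import operator
import pickle

# operator.gt is reachable under a top-level module name, so pickle
# can store a reference to that name rather than the function body.
payload = pickle.dumps(operator.gt)
restored = pickle.loads(payload)

# The payload carries the name "gt"; no bytecode is serialized.
stored_by_name = b"gt" in payload

# Unpickling looks the name up again and returns the same object.
same_object = restored is operator.gt

# A lambda is named "<lambda>", which cannot be found by name.
predicate = lambda x, y: x > y
try:
    pickle.dumps(predicate)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError) as exc:
    # Which exception type is raised depends on where the lambda was defined.
    lambda_pickled = False
    lambda_error = str(exc)

print(same_object, lambda_pickled)
```

If the predicate really is a plain comparison, assigning operator.gt (or a module-level def) to self.predicate in place of the lambda would probably be enough to let the work object pickle.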

As for your timeit problem, I don't understand what you mean wrt "the global namespace" - for example:

>>> timeit.repeat('g.get(23)', 'g = {}')
[0.29134988784790039, 0.27160286903381348, 0.27237796783447266]

the namespace where g is bound leaves the binding fully accessible by the statement being repeated. If what you're binding to g is a generator, maybe your problem is that generators don't have .get() methods and perhaps you meant .next()?
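For the generator case specifically, a sketch along these lines should work on Python 3.5 or later. The body of my_generator is a hypothetical stand-in, since the question never showed it, and next(g) is the Python 3 spelling of the g.next() call mentioned above. The globals argument of timeit.repeat is one way to make my_generator visible to the setup string; importing it by name inside the setup string is another.

```python
import timeit


def my_generator():
    """Hypothetical stand-in for the generator being timed: counts up forever."""
    n = 0
    while True:
        yield n
        n += 1


# The setup string binds g once per repetition; the statement then sees
# that binding because both run in the same namespace.
timings = timeit.repeat(
    "next(g)",
    setup="g = my_generator()",
    globals={"my_generator": my_generator},  # makes the function visible to setup
    repeat=3,
    number=10000,
)

print(timings)
```

The result is a list with one total time per repetition, in seconds; the actual numbers depend on the machine.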

Alex Martelli
Thanks, it was indeed module namespaces that I was failing to consider. From a C background, symbols declared at file scope have extern storage class; this is disanalogous to Python modules. And yes, I did mean '.next()' but my typing fingers thought they knew better.
msw