views: 488
answers: 3

I am trying to run autogenerated code (which might potentially not terminate) in a loop, for genetic programming. I'm trying to use a multiprocessing Pool for this, since I don't want the big performance overhead of creating a new process each time, and I can terminate the pool's process if it runs too long (which I can't do with threads).

Essentially, my program is

from multiprocessing import Pool

if __name__ == '__main__':
    pool = Pool(processes=1)
    while ...:
        source = generate()  # autogenerate code
        exec(source)
        print meth()  # just a test; prints a result, since meth was defined in source
        result = pool.apply_async(meth)
        try:
            print result.get(timeout=3)
        except:
            pool.terminate()

This is the code that should work, but doesn't; instead I get

AttributeError: 'module' object has no attribute 'meth'

It seems that Pool only sees the meth method if it is defined at the very top level. Any suggestions on how to get it to run a dynamically created method?

Edit: the problem is exactly the same with Process, i.e.

from multiprocessing import Process

source = generated()
exec(source)
if __name__ == '__main__':
    p = Process(target=meth)
    p.start()

works, while

from multiprocessing import Process

if __name__ == '__main__':
    source = generated()
    exec(source)
    p = Process(target=meth)
    p.start()

doesn't, and fails with an AttributeError

+1  A: 

Did you read the programming guidelines? There is lots of stuff in there about global variables, and there are even more limitations under Windows. You don't say which platform you are running on, but this could be the problem if you are running under Windows. From the above link:

Global variables

Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start() was called.

However, global variables which are just module level constants cause no problems.
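
A minimal sketch of that divergence (assuming Windows, where the child re-imports the main module instead of forking):

from multiprocessing import Process

value = 0  # module-level global

def show():
    # On Windows the child re-imports this file, so it sees the
    # module-level value (0), not the 42 assigned inside the guard.
    print "child sees value = %s" % value

if __name__ == '__main__':
    value = 42  # invisible to the child on Windows
    p = Process(target=show)
    p.start()
    p.join()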

Nick Craig-Wood
+1  A: 

Process (via pool or otherwise) won't have a __name__ of '__main__', so it will not execute anything that depends on that condition -- including the exec statements that you depend on in order to find your meth, of course.

Why are you so keen on having that exec guarded by a condition that, by design, IS going to be false in your sub-process, yet have that sub-process depend (contradictorily!) on the execution of that exec...?! It's really boggling my mind...
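
You can make this visible by printing __name__ at import time; on Windows the file is imported a second time in the child, under a different name, so the guarded block is skipped there (a minimal sketch):

import os
from multiprocessing import Process

print "imported with __name__ = %r in pid %s" % (__name__, os.getpid())

def meth():
    print "meth running in pid %s" % os.getpid()

if __name__ == '__main__':
    p = Process(target=meth)
    p.start()
    p.join()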

Alex Martelli
Well, I'm not keen on having the exec guarded per se; I just don't see how I can dynamically load/override a method and then run it in a new process or pool. Any suggestions are greatly appreciated.
Ash
Well, on Linux (Lenny) with Python 2.5, all the examples above work fine, and the subprocesses do have a __main__ (inherited environment?). The behavior seems to be OS-specific here. Does the problem occur on Windows?
kriss
@Ash, what happens if you move your `exec` to outside the `if __name__` guard? Another possibility would be to generate your code into a foobar.py file (in a directory on sys.path) and then import it. @kriss, yes, if you can focus on one single platform you may be a bit laxer, but for cross-platform compatibility you need to program to the "least common denominator" (which sadly IS often Windows;-).
Alex Martelli
@Alex, everything works fine if I exec outside the guard at the topmost level. The only problem is that I can do it only once at the outer level, but I need to keep loading freshly generated code continuously (genetic programming). Also, I can't run Process or Pool outside the __name__ guard, at least not on Windows.
Ash
@Ash, so consider generating the code dynamically into _new_ .py files to be imported, as I mentioned as "another possibility" in my last comment. And you CAN of course run Process outside the `__name__` guard, as long as you wrap it in a function (NOT top-level! keep all meaningful code execution in functions...) and call that function only from `__main__`.
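
A rough sketch of that "fresh .py file" idea (the load_generated helper, the module-naming scheme, and the trivial generate stub are just illustrative):

import sys
from multiprocessing import Pool

def generate():
    # stand-in for the real code generator
    return "def meth():\n    return 42\n"

def load_generated(source, modname):
    # Write the source into a real importable module; the pool worker
    # can then unpickle the function by importing that same module.
    f = open(modname + '.py', 'w')
    f.write(source)
    f.close()
    if '.' not in sys.path:
        sys.path.insert(0, '.')
    return __import__(modname).meth

if __name__ == '__main__':
    for generation in range(5):
        # A fresh module name per generation avoids stale imports in
        # long-lived worker processes.
        meth = load_generated(generate(), 'gen_mod_%d' % generation)
        pool = Pool(processes=1)
        result = pool.apply_async(meth)
        try:
            print result.get(timeout=3)
            pool.close()
        except Exception:
            pool.terminate()
        pool.join()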
Alex Martelli
A: 

As I commented above, all your examples work as you expect on my Linux box (Debian Lenny, Python 2.5, processing 0.52; see the test code below).

There seem to be many restrictions on Windows on the objects you can transmit from one process to another. Reading the doc Nick pointed out, it seems that, because Windows lacks fork, it will start a brand new Python interpreter, re-import modules, and pickle/unpickle the objects that must be passed around. If they can't be pickled, I expect you'll get the kind of problem that occurred for you.
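
To make that concrete, here is roughly what pickling a function amounts to (my sketch):

import pickle

exec("def meth():\n    return 42\n")

# Functions pickle by reference: the payload records only the module
# and the name ('__main__', 'meth'), not the code itself.
data = pickle.dumps(meth)
print repr(data)

# The receiving process must recreate '__main__.meth' by importing
# the main module. On Windows that re-import runs with
# __name__ != '__main__', so an exec hidden behind the guard never
# defines meth, and the lookup fails with:
#   AttributeError: 'module' object has no attribute 'meth'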

Hence a complete (non-)working example may be useful for diagnosis. The answer may lie in the things you've hidden as irrelevant.

from processing import Pool  # 'processing' was the pre-2.6 standalone name of multiprocessing
import os

def generated():
    return (
"""
def meth():
    import time
    starttime = time.time()
    pid = os.getpid()
    while 1:
        if time.time() - starttime > 1:
            print "process %s" % pid
            starttime = starttime + 1

""")


if __name__ == '__main__':
    pid = os.getpid()
    print "main pid=%s" % pid
    for n in range(5):
        source = generated()  # autogenerated code
        exec(source)
        pool = Pool(processes=1)
        result = pool.apply_async(meth)
        try:
            print result.get(timeout=3)
        except:
            pool.terminate()

Another suggestion would be to use threads (here, cooperative ones built from generators). Yes, you can, even if you don't know whether your generated code will stop, or whether it has arbitrarily nested loops. Loops are no restriction at all; that's precisely a point in favor of using generators (extracting the control flow). I do not see why it couldn't apply to what you are doing. [Agreed, it probably means more changes than using independent processes.] See the example below.

import time

class P(object):
    def __init__(self, name):
        self.name = name
        self.starttime = time.time()
        self.lastexecutiontime = self.starttime
        self.gen = self.run()

    def toolong(self):
        if time.time() - self.starttime > 10:
            print "process %s too long" % self.name
            return True
        return False

class P1(P):
    def run(self):
        for x in xrange(1000):
            for y in xrange(1000):
                for z in xrange(1000):
                    if time.time() - self.lastexecutiontime > 1:
                        print "process %s" % self.name
                        self.lastexecutiontime = self.lastexecutiontime + 1
                        yield
        self.result = self.name.upper()

class P2(P):
    def run(self):
        for x in range(10000000):
            if time.time() - self.lastexecutiontime > 1:
                print "process %s" % self.name
                self.lastexecutiontime = self.lastexecutiontime + 1
                yield
        self.result = self.name.capitalize()

pool = [P1('one'), P1('two'), P2('three')]
while len(pool) > 0:
    current = pool.pop()
    try:
        current.gen.next()
    except StopIteration:
        print "Thread %s ended. Result '%s'" % (current.name, current.result) 
    else:
        if current.toolong():
            print "Forced end for thread %s" % current.name 
        else:
            pool.insert(0, current)
kriss
Indeed the platform is Windows, and the code fails there with an AttributeError. Unfortunately, I need to be cross-platform. About using threads: I'm not sure what you mean. The code I dynamically load is generated/morphed randomly (genetic programming); it can include multiple arbitrary/nested while loops, so forcing every while loop to yield is not an option. The only option I can see is monitoring the method call and killing it if it runs too long, which can't be done with threads AFAIK. Or did you have something else in mind?
Ash
Nesting is irrelevant. But, agreed, there are some restrictions on using generators (the problem I see in my example is propagating the yields throughout functions that are not themselves generators). However, I believe all the problems for your kind of use are solved in PEP 342: http://www.python.org/dev/peps/pep-0342/
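
To illustrate the restriction: with plain generators, every function on the call path must itself become a generator and re-yield by hand, e.g.

def inner():
    # the nested loop exposes its control-flow points...
    for i in xrange(3):
        yield

def outer():
    # ...and every caller must propagate them manually
    # (Python 2 has no 'yield from')
    for _ in inner():
        yield

for _ in outer():
    print "scheduling point"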
kriss