Certain functions in my code take a long time to return. I don't need the return value and I'd like to execute the next lines of code in the script before the slow function returns. More precisely, the functions send out commands via USB to another system (via a C++ library with SWIG) and once the other system has completed the task, it returns an "OK" value. I have reproduced the problem in the following example. How can I make "tic" and "toc" print one after the other without any delay? I suppose the solution involves threads, but I am not too familiar with them. Can anyone show me a simple way to solve this problem?

from math import sqrt
from time import sleep

def longcalc():
    total = 10**6  # must be an int: range() raises TypeError for the float 1e6
    for i in range(total):
        r = sqrt(i)
    return r

def longtime():
    #Do stuff here
    sleep(1)
    return "sleep done"

print "tic"
longcalc()
print "toc"
longtime()
print "tic"
+1  A: 
from threading import Thread
# ... your code    

calcthread = Thread(target=longcalc)
timethread = Thread(target=longtime)

print "tic"
calcthread.start()
print "toc"
timethread.start()
print "tic"

Have a look at the Python threading docs for more information about multithreading in Python.

A word of warning about multithreading: it can be hard. Very hard. Debugging multithreaded software can lead to some of the worst experiences you will ever have as a software developer.

So before you delve into the world of potential deadlocks and race conditions, be absolutely sure that it makes sense to convert your synchronous USB interactions into asynchronous ones. Specifically, ensure that any code that depends on the async code is executed only after that code has completed (via a callback method or something similar).
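
As a rough illustration of the callback idea, here is a minimal sketch. The names run_in_background and slow_command are hypothetical, and slow_command is only a stand-in for the blocking USB call; the real call may behave differently with respect to the GIL.

```python
from threading import Thread
from time import sleep

def run_in_background(func, on_done):
    """Run func() in a worker thread, then pass its result to on_done()."""
    def worker():
        # The dependent code (on_done) runs only after func has returned.
        on_done(func())
    thread = Thread(target=worker)
    thread.start()
    return thread

def slow_command():
    # Hypothetical stand-in for the blocking USB call.
    sleep(0.1)
    return "OK"

results = []
print("tic")
t = run_in_background(slow_command, results.append)
print("toc")   # printed right away, before slow_command returns
t.join()       # only needed here so the script can inspect the result
print(results)
```

Note that the callback runs in the worker thread, not in the main thread. In a wxPython program, any GUI update triggered from it would have to be handed back to the main thread, for example with wx.CallAfter.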

Johnny G
+5  A: 

Unless the SWIGged C++ code is specifically set up to release the GIL (Global Interpreter Lock) before long delays and re-acquire it before getting back to Python, multi-threading might not prove very useful in practice. You could try multiprocessing instead:

from multiprocessing import Process

if __name__ == '__main__':
    print "tic"
    Process(target=longcalc).start()
    print "toc"
    Process(target=longtime).start()
    print "tic"

multiprocessing is in the standard library in Python 2.6 and later, but can be separately downloaded and installed for versions 2.5 and 2.4.

Edit: the asker is of course trying to do something more complicated than this, and in a comment explains: """I get a bunch of errors ending with: "pickle.PicklingError: Can't pickle <type 'PySwigObject'>: it's not found as __builtin__.PySwigObject". Can this be solved without reorganizing all my code? Process was called from inside a method bound to a button to my wxPython interface."""

multiprocessing does need to pickle objects to cross process boundaries. I'm not sure exactly which SWIGged object is involved here, but unless you can find a way to serialize and deserialize it, and register that with the copy_reg module, you need to avoid passing it across the boundary: make SWIGged objects owned and used by a single process, don't keep them as module-global objects (particularly in __main__), communicate among processes with multiprocessing.Queue using objects that don't contain SWIGged objects, and so on.
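
To make the copy_reg idea concrete, here is a minimal sketch. DeviceHandle is a hypothetical stand-in for a wrapped object that cannot be pickled by default, and the sketch assumes the object can be rebuilt from plain arguments (here a port name), which may or may not hold for a real SWIG wrapper.

```python
import pickle
import threading

try:
    import copyreg              # Python 3 name
except ImportError:
    import copy_reg as copyreg  # Python 2 name

class DeviceHandle(object):
    """Hypothetical stand-in for a wrapped C++ object."""
    def __init__(self, port):
        self.port = port
        # Locks cannot be pickled, so default pickling of this class fails,
        # much as it does for a raw SWIG pointer.
        self._lock = threading.Lock()

def reduce_handle(handle):
    # Tell pickle how to rebuild the object: a callable plus its arguments.
    return DeviceHandle, (handle.port,)

# Register the reducer; pickle consults this table before its default logic.
copyreg.pickle(DeviceHandle, reduce_handle)

original = DeviceHandle("usb0")
clone = pickle.loads(pickle.dumps(original))
print(clone.port)
```

The clone is a freshly constructed object, not a shared one, so this only helps when reopening the resource in the other process is acceptable. Otherwise keeping the object inside a single process is the safer route.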

The early errors (if different than the one you report "ending with") might actually be more significant, but I can't guess without seeing them.

Alex Martelli
Dave Beazley stated in his GIL presentation that the GIL is released during I/O (and he talks about I/O, waiting for USB info). Current situation: two archpythonistas giving contrary information. What am I supposed to do now?
bayer
You still don't know what the C++ code is doing with respect to the GIL. You don't know if it does I/O and you don't know if it releases the GIL. This is not contradictory advice; this is "what to do when you don't know something" advice. Use multiprocessing when you don't know.
S.Lott
@S.Lott is right: the I/O Beazley's talking about is the one performed via Python's standard library -- a mysterious SWIGged C++ library may or may not cooperate;-). [[See http://matt.eifelle.com/2007/11/23/enabling-thread-support-in-swig-and-python/ for the only doc I know about that mentions GIL handling with SWIG -- not a well-documented feature;-)]]
Alex Martelli
The example code doesn't work. Can someone edit it? I believe Process must be called with the target keyword like so: "Process(target=longcalc).start()" and that the main code should be enclosed with "if __name__ == '__main__':" on Windows. The binaries are also directly available here: http://pypi.python.org/pypi/multiprocessing
JcMaco
@JcMaco, yes, thanks, you're right -- Process does insist on keyword-only args (so does Thread, in docs, but less forcefully;-) and on Windows also requires that the __main__ module be safely importable; +1, and fixing the answer accordingly, thanks again!
Alex Martelli
Thanks for the edits. Unfortunately, when trying to implement this solution, I get a bunch of errors ending with: "pickle.PicklingError: Can't pickle <type 'PySwigObject'>: it's not found as __builtin__.PySwigObject". Can this be solved without reorganizing all my code? Process was called from inside a method bound to a button to my wxPython interface.
JcMaco
@Alex. I didn't see your edit until now. Here's the full traceback: http://paste.dprogramming.com/dpood1nr Can you provide more information on serialization of SWIG objects? Thanks!
JcMaco
@JcMaco, unfortunately PySwigObject is a very broad type:-(. But if you only have one such case in your app, you can use `copy_reg` to register the way its instances should be serialized... otherwise you might want to look for more advanced ways of Pythonizing C/C++ code, such as SIP (see http://www.riverbankcomputing.co.uk/software/sip/intro).
Alex Martelli
A: 

You can use a Future. It was not part of the standard library when this was written (Python 3.2 later added concurrent.futures), but it is very simple to implement:

from threading import Thread, Event

class Future(object):
    def __init__(self, thunk):
        self._thunk = thunk
        self._event = Event()
        self._result = None
        self._failed = None
        Thread(target=self._run).start()

    def _run(self):
        try:
            self._result = self._thunk()
        except Exception as e:  # "as" form works on Python 2.6+ and 3
            self._failed = True
            self._result = e
        else:
            self._failed = False
        self._event.set()

    def wait(self):
        self._event.wait()
        if self._failed:
            raise self._result
        else:
            return self._result

You would use this particular implementation like this:

import time

def work():
    for x in range(3):
        time.sleep(1)
        print 'Tick...'
    print 'Done!'
    return 'Result!'

def main():
    print 'Starting up...'
    f = Future(work)
    print 'Doing more main thread work...'
    time.sleep(1.5)
    print 'Now waiting...'
    print 'Got result: %s' % f.wait()

if __name__ == '__main__':
    main()

Unfortunately, when using a system that has no "main" thread, it's hard to tell when to call "wait"; you obviously don't want to stop processing until you absolutely need an answer.
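
One way around an explicit wait is to attach a callback to the future. The sketch below uses concurrent.futures, which assumes Python 3.2 or later (or the separately installed futures backport on Python 2); the names on_done and received are only illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def work():
    time.sleep(0.2)
    return 'Result!'

received = []

def on_done(future):
    # Invoked once work() has finished; future.result() would re-raise
    # any exception that work() raised.
    received.append(future.result())

executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(work)
future.add_done_callback(on_done)
print('Doing more main thread work...')

# A long-running program would simply carry on here; the demo blocks
# only so that it can print the result before exiting.
executor.shutdown(wait=True)
print('Got result: %s' % received[0])
```

The callback normally runs in the worker thread (or immediately in the calling thread if the future has already completed), so the same thread-safety concerns apply as with any other threaded code.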

With Twisted, you can use deferToThread, which allows you to return to the main loop. The idiomatically equivalent code in Twisted would be something like this:

import time

from twisted.internet import reactor
from twisted.internet.task import deferLater
from twisted.internet.threads import deferToThread
from twisted.internet.defer import inlineCallbacks

def work():
    for x in range(3):
        time.sleep(1)
        print 'Tick...'
    print 'Done!'
    return 'Result!'

@inlineCallbacks
def main():
    print 'Starting up...'
    d = deferToThread(work)
    print 'Doing more main thread work...'
    yield deferLater(reactor, 1.5, lambda : None)
    print "Now 'waiting'..."
    print 'Got result: %s' % (yield d)

To actually start the reactor and exit when it's finished, you'd also need to do this:

reactor.callWhenRunning(
    lambda : main().addCallback(lambda _: reactor.stop()))
reactor.run()

The main difference with Twisted is that if more "stuff" happens in the main thread - other timed events fire, other network connections get traffic, buttons get clicked in a GUI - that work will happen seamlessly, because the deferLater and the yield d don't actually stop the whole thread, they only pause the "main" inlineCallbacks coroutine.

Glyph