views:

469

answers:

4

What is the recommended way to terminate unexpectedly long running threads in python ? I can't use SIGALRM, since

Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is: always perform signal() operations in the main thread of execution. Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer(); only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can’t be used as a means of inter-thread communication.Use locks instead.

Update: each thread in my case blocks -- it is downloading a web page using urllib2 module and sometimes operation takes too many time on an extremely slow sites. That's why I want to terminate such slow threads

+1  A: 

If you are trying to kill a thread whose code you do not have control over, it depends if the thread is in a blocking call or not. In my experience if the thread is properly blocking, there is no recommended and portable way of doing this.

I've run up against this when trying to work with code in the standard library (multiprocessing.manager I'm looking at you) with loops coded with no exit condition: nice!

There are some interuptable thread implementations out there (see here for an example), but then, if you have the control of the threaded code yourself, you should be able to write them in a manner where you can interupt them with a condition variable of some sort.

jkp
+1  A: 

Use synchronization objects and ask the thread to terminate. Basically, write co-operative handling of this.

If you start yanking out the thread beneath the python interpreter, all sorts of odd things can occur, and it's not just in Python either, most runtimes have this problem.

For instance, let's say you kill a thread after it has opened a file, there's no way that file will be closed until the application terminates.

Lasse V. Karlsen
+4  A: 

Since abruptly killing a thread that's in a blocking call is not feasible, a better approach, when possible, is to avoid using threads in favor of other multi-tasking mechanisms that don't suffer from such issues.

For the OP's specific case (the threads' job is to download web pages, and some threads block forever due to misbehaving sites), the ideal solution is twisted -- as it generally is for networking tasks. In other cases, multiprocessing might be better.

More generally, when threads give unsolvable issues, I recommend switching to other multitasking mechanisms rather than trying heroic measures in the attempt to make threads perform tasks for which, at least in CPython, they're unsuitable.

Alex Martelli
+3  A: 

As Alex Martelli suggested, you could use the multiprocessing module. It is very similar to the Threading module so that should get you off to a start easily. Your code could be like this for example:

import multiprocessing

def get_page(*args, **kwargs):
    # your web page downloading code goes here

def start_get_page(timeout, *args, **kwargs):
    p = multiprocessing.Process(target=get_page, args=args, kwargs=kwargs)
    p.start()
    p.join(timeout)
    if p.is_alive():
        # stop the downloading 'thread'
        p.terminate()
        # and then do any post-error processing here

if __name__ == "__main__":
    start_get_page(timeout, *args, **kwargs)

Of course you need to somehow get the return values of your page downloading code. For that you could use multiprocessing.Pipe or multiprocessing.Queue (or other ways available with multiprocessing). There's more information, as well as samples you could check at http://docs.python.org/library/multiprocessing.html.

Lastly, the multiprocessing module is included in python 2.6. It is also available for python 2.5 and 2.4 at pypi (you can use

easy_install multiprocessing

)

or just visit pypi and download and install the packages manually.

Note: I realize this has been posted awhile ago. I was having a similar problem to this and stumbled here and saw Alex Martelli's suggestion. Had it implemented for my problem and decided to share it. (I'd like to thank Alex for pointing me in the right direction.)

Vin-G