I've seen a lot of questions related to this, but my code works on Python 2.6.2 and fails on Python 2.6.5. Am I wrong in thinking that the atexit caveat, "functions registered via this module are not called when the program is killed by a signal", shouldn't apply here because I'm catching the signal and then exiting cleanly? What's going on here? What's the proper way to do this?

import atexit, sys, signal, time, threading

terminate = False
threads = []

def test_loop():
    while True:
        if terminate:
            print('stopping thread')
            break
        else:
            print('looping')
            time.sleep(1)

@atexit.register
def shutdown():
    global terminate
    print('shutdown detected')
    terminate = True
    for thread in threads:
        thread.join()

def close_handler(signum, frame):
    print('caught signal')
    sys.exit(0)

def run():
    global threads
    thread = threading.Thread(target=test_loop)
    thread.start()
    threads.append(thread)

    while True:
        time.sleep(2)
        print('main')

signal.signal(signal.SIGINT, close_handler)

if __name__ == "__main__":
    run()

Python 2.6.2:

$ python halp.py 
looping
looping
looping
main
looping
main
looping
looping
looping
main
looping
^Ccaught signal
shutdown detected
stopping thread

Python 2.6.5:

$ python halp.py 
looping
looping
looping
main
looping
looping
main
looping
looping
main
^Ccaught signal
looping
looping
looping
looping
...
looping
looping
Killed <- kill -9 process at this point

The main thread on 2.6.5 appears to never execute the atexit functions.

A: 

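I'm not sure whether this changed between versions, but this is how I register my atexit handler in 2.6.5:

import atexit

def goodbye():
    print("\nStopping...")

# Register only after goodbye is defined; registering first raises NameError.
atexit.register(goodbye)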
Falmarri
Since 2.6, atexit.register can be used as a decorator.
lostincode
Hmm, well that's odd. Are you sure you're running the same code and it's not cached somewhere else or something weird like that?
Falmarri
+2  A: 

Exiting due to a signal is not the same as exiting from within a signal handler. Catching a signal and exiting with sys.exit is a clean exit, not an exit due to a signal. So, yes, I agree that it should run atexit handlers here--at least in principle.

However, there's something tricky about signal handlers: they're completely asynchronous. They can interrupt the program flow at any time, between any VM opcode. Take this code, for example. (Treat this as the same form as your code above; I've omitted code for brevity.)

import threading
import time

lock = threading.Lock()
terminate = False  # set to True by the shutdown code, as in the question

def test_loop():
    while not terminate:
        print('looping')
        with lock:
            print('Executing synchronized operation')
        time.sleep(1)
    print('stopping thread')

def run():
    while True:
        time.sleep(2)
        with lock:
            print('Executing another synchronized operation')
        print('main')

There's a serious problem here: a signal (e.g. ^C) may be received while run() is holding lock. If that happens, your signal handler runs with the lock still held. The shutdown code then waits for test_loop to exit, and if that thread is waiting for the lock, you'll deadlock.

This is a whole category of problems, and it's why a lot of APIs say not to call them from within signal handlers. Instead, you should set a flag to tell the main thread to shut down at an appropriate time.

do_shutdown = False
def close_handler(signum, frame):
    global do_shutdown
    do_shutdown = True
    print('caught signal')

def run():
    while not do_shutdown:
        ...

My preference is to avoid exiting the program with sys.exit entirely and to explicitly do cleanup at the main exit point (e.g. the end of run()), but you can use atexit here if you want.

Glenn Maynard
+3  A: 

The root difference here is actually unrelated to both signals and atexit, but rather a change in the behavior of sys.exit.

Before around 2.6.5, sys.exit (more accurately, SystemExit being caught at the top level) would cause the interpreter to exit; if threads were still running, they'd be terminated, just as with POSIX threads.

Around 2.6.5, the behavior changed: the effect of sys.exit is now essentially the same as returning from the main function of the program. When you do that--in both versions--the interpreter waits for all threads to be joined before exiting.

The relevant change is that Py_Finalize now calls wait_for_thread_shutdown() near the top, where it didn't before.

This behavioral change seems incorrect, primarily because it no longer functions as documented, which is simply: "Exit from Python." The practical effect is no longer to exit from Python, but simply to exit the thread. (As a side note, sys.exit has never exited Python when called from another thread, but that obscure divergence from documented behavior doesn't justify a much bigger one.)

I can see the appeal of the new behavior: rather than two ways to exit the main thread ("exit and wait for threads" and "exit immediately"), there's only one, as sys.exit is essentially identical to simply returning from the top function. However, it's a breaking change and diverges from documented behavior, which far outweighs that.

Because of this change, after sys.exit from the signal handler above, the interpreter sits around waiting for threads to exit and then runs atexit handlers after they do. Since it's the handler itself that tells the threads to exit, the result is a deadlock.
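
The ordering described above (threads are waited for first, atexit handlers run afterwards) can be checked with a small experiment on an interpreter that has the newer behavior. This is only a sketch: it assumes Python 2.7 or later for subprocess.check_output, and the names CHILD, worker and exit_order are invented for the illustration. The child's thread finishes on its own, so nothing deadlocks here.

```python
import subprocess
import sys
import textwrap

# Child program: a non-daemon thread, an atexit handler, and sys.exit in main.
CHILD = textwrap.dedent('''
    import atexit, sys, threading, time

    def worker():
        time.sleep(0.3)
        print('thread finished')

    @atexit.register
    def shutdown():
        print('atexit handler ran')

    threading.Thread(target=worker).start()
    sys.exit(0)
''')

def exit_order():
    # Run the child in a fresh interpreter and return its output lines in order.
    out = subprocess.check_output([sys.executable, '-c', CHILD])
    return out.decode().splitlines()

if __name__ == "__main__":
    print(exit_order())
```

With the newer behavior the thread's message is listed before the handler's. In the question's program the thread never finishes by itself, so the handler that would stop it is never reached.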

Glenn Maynard
Many thanks, Glenn. Now that I know what to look for, I find the relevant Python issue report [here](http://bugs.python.org/issue1722344). I agree that it is a big change that should have been done in more than a minor point release.
Muhammad Alkarouri