I'm getting the following error when using the multiprocessing module within a python daemon process (using python-daemon):

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/lib/python2.6/multiprocessing/util.py", line 262, in _exit_function
    for p in active_children():
  File "/usr/local/lib/python2.6/multiprocessing/process.py", line 43, in active_children
    _cleanup()
  File "/usr/local/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
    if p._popen.poll() is not None:
  File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes

The daemon process (parent) spawns a number of processes (children) and then periodically polls the processes to see if they have completed. If the parent detects that one of the processes has completed, it then attempts to restart that process. It is at this point that the above exception is raised. It seems that once one of the processes completes, any operation involving the multiprocessing module will generate this exception. If I run the identical code in a non-daemon python script, it executes with no errors whatsoever.

EDIT:

Sample script

from daemon import runner

class DaemonApp(object):
    def __init__(self, pidfile_path, run):
        self.pidfile_path = pidfile_path
        self.run = run

        self.stdin_path = '/dev/null'
        self.stdout_path = '/dev/tty'
        self.stderr_path = '/dev/tty'

def run():
    import multiprocessing as processing
    import time
    import os
    import sys
    import signal

    def func():
        print 'pid: ', os.getpid()
        for i in range(5):
            print i
            time.sleep(1)

    process = processing.Process(target=func)
    process.start()

    while True:
        print 'checking process'
        if not process.is_alive():
            print 'process dead'
            process = processing.Process(target=func)
            process.start()
        time.sleep(1)

# comment out the following three lines to disable daemon mode
app = DaemonApp('/root/bugtest.pid', run)
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()

# uncomment to run as a regular script instead
#run()
A: 

I think there was a fix put into trunk and 2.6-maint a little while ago which should help with this. Can you try running your script with python-trunk or the latest 2.6-maint svn? I'm failing to pull up the bug information.

jnoller
Running the script with python 2.7 trunk produces the same outcome. I've added the test script I'm using to the original post. Am I doing something blatantly wrong?
Asif Rahman
I'm not sure, I'd have to load a test that uses python-daemon to check it out. Nothing jumps out at me at the moment.
jnoller
A: 

Looks like your error is coming at the very end of your process -- your clue's at the very start of your traceback, and I quote...:

File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)

If atexit._run_exitfuncs is running, this clearly shows that your own process is terminating. So the error itself is a minor issue in a sense -- it comes from a function that the multiprocessing module registered to run "at-exit" from your process. The really interesting issue is: WHY is your main process exiting? I think it may be due to some uncaught exception. Try setting the exception hook and recording rich diagnostic info before it gets lost to the OTHER exception raised by multiprocessing's at-exit handler...
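For example, a minimal sketch of that approach (the log file path here is an arbitrary choice for illustration, not from the original):

import sys
import traceback

def log_uncaught(exc_type, exc_value, exc_tb):
    # Record the real traceback somewhere durable before the process
    # dies and multiprocessing's at-exit handler masks it.
    with open('/tmp/daemon-crash.log', 'a') as f:
        traceback.print_exception(exc_type, exc_value, exc_tb, file=f)

sys.excepthook = log_uncaught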

Alex Martelli
I wrapped the offending statement in a try..except block and I get the following traceback:

Traceback (most recent call last):
  File "bugtest.py", line 32, in run
    if not process.is_alive():
  File "/usr/local/lib/python2.6/multiprocessing/process.py", line 132, in is_alive
    self._popen.poll()
  File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
Asif Rahman
My guess is that the multiprocessing module is somehow disagreeing with the daemon double-fork. Unfortunately, I don't understand this material well enough to debug this.
Asif Rahman
A: 

I'm running into this also using the celery distributed task manager under RHEL 5.3 with Python 2.6. My traceback looks a little different but the error is the same:

      File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 334, in terminate
    self._terminate()
  File "/usr/local/lib/python2.6/multiprocessing/util.py", line 174, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 373, in _terminate_pool
    p.terminate()
  File "/usr/local/lib/python2.6/multiprocessing/process.py", line 111, in terminate
    self._popen.terminate()
  File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 136, in terminate
    if self.wait(timeout=0.1) is None:
  File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 121, in wait
    res = self.poll()
  File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
    pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes

Quite frustrating... I'm running the code through pdb now, but haven't spotted anything yet.

markhellewell
A: 

The original sample script has "import signal" but no use of signals. However, I had a script that caused this error message, and it was due to my signal handling, so I'll explain here in case it's what is happening for others. Within a signal handler I was doing stuff with processes (e.g. creating a new process); apparently this doesn't work, so I stopped doing that within the handler and the error went away. (Note: sleep() functions wake up after signal handling, so waking from sleep can be an alternative way to act upon signals if you need to do things with processes; see the sketch below.)
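A minimal sketch of that pattern, with illustrative names (SIGUSR1 as the trigger is an assumption, not from the original): the handler only records that the signal arrived, and the process work happens in the main loop.

import multiprocessing
import signal
import time

pending = []                        # set by the handler, acted on in the main loop

def on_signal(signum, frame):
    # Only record the signal here; spawning a process inside the
    # handler is exactly what caused the error described above.
    pending.append(signum)

def worker():
    time.sleep(1)

signal.signal(signal.SIGUSR1, on_signal)

while True:
    time.sleep(5)                   # returns early when a signal is handled
    if pending:
        del pending[:]
        # Safe: we're back in the main flow, outside the handler.
        multiprocessing.Process(target=worker).start()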

Dave Brondsema
+2  A: 

Your problem is a conflict between the daemon and multiprocessing modules, in particular in the handling of the SIGCLD (child process terminated) signal. daemon sets SIGCLD to SIG_IGN when launching, which, at least on Linux, causes terminated children to be reaped immediately (rather than becoming zombies until the parent invokes wait()). But multiprocessing's is_alive test invokes os.waitpid() under the hood to see whether the process is still alive, and that call fails with "no child processes" if the child has already been reaped.

The simplest solution is to set SIGCLD back to SIG_DFL (the default behaviour: ignore the signal and let the parent wait() for the terminated child process):

def run():
    # ...

    signal.signal(signal.SIGCLD, signal.SIG_DFL)

    process = processing.Process(target=func)
    process.start()

    while True:
        # ...
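For illustration, here is a standalone sketch (independent of python-daemon, assuming a Linux system) that reproduces the failure described above:

import os
import signal
import time

# What python-daemon did: ignore SIGCLD, so the kernel reaps terminated
# children automatically. (SIGCLD is the older alias for SIGCHLD.)
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

pid = os.fork()
if pid == 0:
    os._exit(0)                 # child exits immediately
else:
    time.sleep(1)               # give the kernel time to reap the child
    try:
        os.waitpid(pid, 0)      # what is_alive() ends up calling
    except OSError, e:
        print 'waitpid failed:', e   # [Errno 10] No child processes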
Anthony Towns
Bonus point for using the words 'terminated', 'reaped', 'zombie', and 'reaped' again.
Adam Nelson
+2  A: 

Ignoring SIGCLD also causes problems with the subprocess module, because of a bug in that module (issue 1731717, still open as of 2009-09-18).
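A minimal sketch of that interaction (assuming a system with a 'true' executable on the PATH):

import signal
import subprocess

signal.signal(signal.SIGCHLD, signal.SIG_IGN)   # what python-daemon did

p = subprocess.Popen(['true'])
p.wait()    # raises OSError: [Errno 10] No child processes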

This behaviour is addressed in version 1.4.8 of the python-daemon library: it now omits the default fiddling with SIGCLD, so it no longer has this unpleasant interaction with other standard library modules.

bignose