When I launch a Python script from within another Python script using the subprocess module, a zombie process is left behind when the subprocess "completes". I am unable to kill this subprocess unless I kill my parent Python process.

Is there a way to kill the subprocess without killing the parent? I know I can do this by using wait(), but I need to run my script with no_wait().

A: 

I'm not entirely sure what you mean by no_wait(). Do you mean you can't block waiting for child processes to finish? Assuming so, I think this will do what you want:

os.wait3(os.WNOHANG)
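
As a rough illustration of how that call might be used, here is a minimal sketch, assuming a Unix-like system (os.wait3 is not available on Windows). The helper names reap_finished_children and demo are hypothetical, chosen only for this example. With os.WNOHANG the call returns a pid of 0 when children exist but none has exited yet, and raises an ECHILD error when there are no children at all.

```python
import errno
import os
import subprocess
import sys
import time


def reap_finished_children():
    """Reap every child that has already exited, without blocking.

    Returns a list of (pid, status) pairs; status is the raw wait status.
    """
    reaped = []
    while True:
        try:
            pid, status, _rusage = os.wait3(os.WNOHANG)
        except OSError as exc:
            if exc.errno == errno.ECHILD:
                break  # this process has no children left at all
            raise
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped


def demo():
    """Start one short-lived child and reap it without ever blocking on it."""
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    reaped = []
    deadline = time.time() + 10  # safety limit for this sketch only
    while not reaped and time.time() < deadline:
        reaped = reap_finished_children()
        time.sleep(0.05)  # the parent is free to do other work here
    return child.pid, reaped
```

One caveat, as far as I know: a child reaped this way is collected behind the subprocess module's back, so the matching Popen object may not report the child's real exit status afterwards. If you need the return codes, polling the Popen objects themselves is probably the safer route.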
Daniel Stutzbach
A: 

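If you start a child with Popen() and never call communicate(), wait() or poll() on it (or you don't use call(), which waits for you), the finished child remains a zombie process.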

If you don't need the output of the command, you can use subprocess.call():

>>> import subprocess
>>> subprocess.call(['grep', 'jdoe', '/etc/passwd'])
0

If the output is important, you should use Popen() and communicate() to get the stdout and stderr.

>>> from subprocess import Popen, PIPE
>>> process = Popen(['ls', '-l', '/tmp'], stdout=PIPE, stderr=PIPE)
>>> stdout, stderr = process.communicate()
>>> stderr
''
>>> print stdout
total 0
-rw-r--r-- 1 jdoe jdoe 0 2010-05-03 17:05 bar
-rw-r--r-- 1 jdoe jdoe 0 2010-05-03 17:05 baz
-rw-r--r-- 1 jdoe jdoe 0 2010-05-03 17:05 foo
David Narayan
Thanks for your comment. Unfortunately, communicate() waits for the process to complete before I can spawn a new one. I need to run numerous processes in parallel.
Dave
+1  A: 

A zombie process is not a real process; it's just a remaining entry in the process table until the parent process requests the child's return code. The actual process has ended and requires no other resources but said process table entry.
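
To make the point about the process table entry concrete, here is a minimal sketch, assuming a POSIX system. The names entry_exists and demo are hypothetical. It relies on the fact that sending signal 0 with os.kill performs only an existence check: it succeeds while the child is running or a zombie, and fails with ESRCH once the parent has collected the return code.

```python
import errno
import os
import subprocess
import sys


def entry_exists(pid):
    """Return True if the process table still has an entry for pid."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is sent
    except OSError as exc:
        if exc.errno == errno.ESRCH:
            return False  # no such process
        raise
    return True


def demo():
    """Show that the entry survives the child's exit until the parent waits."""
    child = subprocess.Popen(
        [sys.executable, "-c", "print('done')"], stdout=subprocess.PIPE
    )
    child.stdout.read()   # returns at EOF, i.e. when the child closes stdout
    child.stdout.close()
    before = entry_exists(child.pid)  # child is still running or is a zombie
    returncode = child.wait()         # collects the return code, frees the entry
    after = entry_exists(child.pid)   # entry is gone (barring PID reuse)
    return before, returncode, after
```

On a typical system demo() should report that the entry exists before wait() and not after it, which is why a zombie cannot be "killed": there is nothing left to kill, only an entry waiting to be collected.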

We probably need more information about the processes you run in order to actually help more.

However, in the case that your Python program knows when the child processes have ended (e.g. by reaching the end of the child stdout data), then you can safely call process.wait():

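import subprocess

process = subprocess.Popen(('ls', '-l', '/tmp'), stdout=subprocess.PIPE)

# Read the child's stdout to EOF; once it is closed the child has finished.
for line in process.stdout:
    pass

subprocess.call(('ps', '-l'))  # ls is still listed, as <defunct>
process.wait()                 # collect the return code; the zombie entry goes away
print("after wait")
subprocess.call(('ps', '-l'))  # ls is no longer listed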

Example output:

$ python so2760652.py
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S   501 21328 21326  0  80   0 -  1574 wait   pts/2    00:00:00 bash
0 S   501 21516 21328  0  80   0 -  1434 wait   pts/2    00:00:00 python
0 Z   501 21517 21516  0  80   0 -     0 exit   pts/2    00:00:00 ls <defunct>
0 R   501 21518 21516  0  80   0 -   608 -      pts/2    00:00:00 ps
after wait
F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S   501 21328 21326  0  80   0 -  1574 wait   pts/2    00:00:00 bash
0 S   501 21516 21328  0  80   0 -  1467 wait   pts/2    00:00:00 python
0 R   501 21519 21516  0  80   0 -   608 -      pts/2    00:00:00 ps

Otherwise, you can keep all the children in a list and call .poll() on each of them now and then to check for a return code. After every iteration, remember to remove from the list the children whose return code is not None (i.e. the finished ones).

ΤΖΩΤΖΙΟΥ
Oh, I see now that my answer is basically the code version of your last paragraph.
Peter Lyons
+1  A: 

I'm not sure what you mean by "I need to run my script with no_wait()", but I think this example does what you need. Processes will not be zombies for very long. The parent polls the children once a second and only collects a child's exit status once that child has actually terminated, so the children quickly unzombify.

#!/usr/bin/env python2.6
import subprocess
import time

children = []
# Step 1: launch all the children asynchronously
for i in range(10):
    # For testing, launch a subshell that sleeps for a varying time
    popen = subprocess.Popen(["/bin/sh", "-c", "sleep %s" % (i + 8)])
    children.append(popen)
    print("launched subprocess PID %s" % popen.pid)

# Reverse the list just to prove we reap children in the order they finish,
# not necessarily the order they start
children.reverse()
# Step 2: loop until all children are terminated
while children:
    # Step 3: poll all active children in order; poll() reaps the finished ones
    children[:] = [child for child in children if child.poll() is None]
    print("Still running: %s" % [popen.pid for popen in children])
    time.sleep(1)

print("All children terminated")

The output towards the end looks like this:

Still running: [29776, 29774, 29772]
Still running: [29776, 29774]
Still running: [29776]
Still running: []
All children terminated
Peter Lyons
1. `.poll()` returns `returncode`, so you can test its result directly; compare it against `None` rather than relying on truthiness, since a successful exit status of 0 is falsy. 2. You don't need `p.wait()` if `p.poll()` is not None. 3. You could remove items in place: http://stackoverflow.com/questions/2793324/has-any-simply-way-to-delete-a-value-in-list-of-python/2794519#2794519 (just replace `if item != value` with `if p.poll() is None`).
J.F. Sebastian