Hello,

I'm running a pipeline of commands from a Python 3 program, using subprocess.*. I didn't want to go through a shell, because I'm passing arguments to my subcommands, and making sure the shell would not misinterpret them would be nightmarish.
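For illustration, a minimal sketch of what that looks like (the command and the file name are made up): because the arguments are passed as a list of strings, no shell ever parses them, and spaces or metacharacters reach the child program verbatim.

from subprocess import Popen

# Hypothetical command and file name, for illustration only.
pattern = 'hello; rm -rf "$HOME"'           # would be dangerous if a shell parsed it
p = Popen(["grep", pattern, "input.txt"])   # each argument is one list element, no shell involved
p.wait()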

The subprocess doc gives this example of how to do it:

p1 = Popen(command1, stdout=PIPE)
p2 = Popen(command2, stdin=p1.stdout)
p2.wait()
p1.wait()
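(As an aside, the pipeline example in the subprocess documentation also closes the parent's copy of p1.stdout right after starting p2, so that p1 can receive SIGPIPE if p2 exits early; adapted to the snippet above, that would look roughly like this.)

from subprocess import Popen, PIPE

p1 = Popen(command1, stdout=PIPE)
p2 = Popen(command2, stdin=p1.stdout)
p1.stdout.close()   # let p1 receive SIGPIPE if p2 exits before reading all of its output
p2.wait()
p1.wait()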

That first arrangement works well. However, I wondered whether it would be safer to start the consumer before the producer, like so:

p2 = Popen(command2, stdin=PIPE)
p1 = Popen(command1, stdout=p2.stdin)
p2.wait()
p1.wait()

I expected this to behave in exactly the same way, but apparently it does not. The first version works flawlessly; with the second, my program hangs. If I look at the system, I can see that p1 is dead and waiting to be reaped, while p2 hangs forever. Is there a rational explanation for that?
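A minimal way to reproduce the hang without blocking the terminal forever, using made-up placeholder commands and assuming Python 3.3+ for the timeout argument to wait():

from subprocess import Popen, PIPE, TimeoutExpired

# Made-up placeholder commands, just to have a producer and a consumer.
command1 = ["seq", "5"]   # producer: prints 1..5 and exits
command2 = ["cat"]        # consumer: copies its stdin to stdout

p2 = Popen(command2, stdin=PIPE)
p1 = Popen(command1, stdout=p2.stdin)

p1.wait()                  # the producer exits normally...
try:
    p2.wait(timeout=5)     # ...but the consumer never finishes
except TimeoutExpired:
    print("p2 is still blocked reading its stdin")
    p2.kill()
    p2.wait()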

A:

It looks like p2 (consumer) is hanging because its stdin remains open. If the code is modified like this, both processes finish successfully:

p2 = Popen(command2, stdin=PIPE)
p1 = Popen(command1, stdout=p2.stdin)
p1.wait()
p2.stdin.close()
p2.wait()

I bet this is the Law of Leaky Abstractions in action.

Constantin
Wow, it looks like you are right. But I still don't understand why this is required. Where does the difference in behavior come from? After the subprocesses call exec(), what they do is outside the control of the parent, right? Could the GC of the parent be involved?
b0fh
@b0fh, that is not clear to me either.
Constantin
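For what it's worth, the difference most likely comes from which end of the pipe the parent keeps. With stdin=PIPE the parent holds the write end of p2's pipe as the p2.stdin file object, and p1 only inherits a duplicate of it; when p1 exits, its duplicate is closed, but p2 keeps waiting for EOF until the parent closes its own copy as well, which is exactly what the extra p2.stdin.close() does. In the first ordering the parent only keeps the read end (p1.stdout), so p1's exit alone delivers EOF to p2. The GC is not involved either way, since p2.stdin stays referenced through p2. A self-contained sketch of the working ordering, with made-up placeholder commands:

from subprocess import Popen, PIPE

# Made-up placeholder commands, for illustration only.
command1 = ["seq", "5"]   # producer
command2 = ["cat"]        # consumer

p2 = Popen(command2, stdin=PIPE)       # the parent keeps the pipe's write end as p2.stdin
p1 = Popen(command1, stdout=p2.stdin)  # p1 only gets a duplicate of that write end

p1.wait()            # p1 exits and its duplicate is closed...
p2.stdin.close()     # ...but p2 sees EOF only once the parent closes its copy too
p2.wait()            # now the consumer reaches end-of-input and exits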
