Hi everyone,

I'm wrestling with the concepts behind subprocesses and pipes, and working with them in a Python context. If anybody could shed some light on these questions it would really help me out.

  1. Say I have a pipeline set up as follows:

    createText.py | processText.py | cat

    processText.py is receiving data through stdin, but how is this implemented? How does it know that no more data will be coming and that it should exit? My guess is that it could look for an EOF and terminate based on that, but what if createText.py never sends one? Would that be considered an error on createText.py's part?

  2. Say parent.py starts a child subprocess (child.py) and calls wait() to wait for the child to complete. If parent is capturing child's stdout and stderr as pipes, is it still safe to read from them after child has terminated? Or are the pipes (and data in them) destroyed when one end terminates?

  3. The general problem that I want to solve is to write a Python script that calls rsync several times with the Popen class. I want my program to wait until rsync has completed, then I want to check the return status to see if it exited correctly. If it didn't, I want to read the child's stderr to see what the error was. Here is what I have so far:

    import subprocess
    from subprocess import Popen

    # makes the rsync call.  Will block until the child process
    # is finished.  Returns True if rsync exited successfully,
    # False otherwise.
    def performRsync(src, dest):
        print "Pushing " + src + " to " + dest
        child = Popen(['rsync', '-av', src, dest], shell=False,
                      stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        child.wait()
        ## check for success or failure
        ## 0 is the successful exit code here
        if not child.returncode:
            return True
        else:
            out, err = child.communicate()
            print "ERR pushing " + src + ". " + err
            return False
    
  4. Update: I also came across this problem. Consider these two simple files:

    # createText.py
    import time

    for x in range(1000):
        print "creating line " + str(x)
        time.sleep(1)


    # processText.py
    import sys

    while True:
        line = sys.stdin.readline()
        if not line:
            break
        print "I modified " + line
    

    Why does processText.py in this case not start printing as it gets data from stdin? Does a pipe collect some amount of buffered data before it passes it along?
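
    If the buffering guess is right, flushing explicitly in createText.py should make the lines show up in processText.py right away. A minimal sketch of that variant (untested):

    # createText.py -- flushing variant: push each line through the pipe
    # right away instead of letting it sit in stdout's block buffer
    import sys
    import time

    for x in range(1000):
        print "creating line " + str(x)
        sys.stdout.flush()
        time.sleep(1)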

A: 

This assumes a UNIXish/POSIXish environment.

EOF in a pipeline is signaled by having no more data to read, that is, read() returning a length of 0. This normally happens when the left-hand process exits and closes its stdout: once any data still buffered in the pipe has been drained, the read in processText returns 0 bytes, and that is its EOF.
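
A minimal way to see this from the reading side (a generic reader, not your processText.py specifically) is to pull raw chunks off stdin and stop when a zero-length read comes back:

    import os
    import sys

    # Read raw chunks from the pipe on stdin.  A zero-length result from
    # os.read() is how EOF shows up once the writer has exited and the
    # pipe buffer has been drained.
    while True:
        chunk = os.read(sys.stdin.fileno(), 4096)
        if not chunk:
            break
        sys.stdout.write("got %d bytes\n" % len(chunk))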

If createText were never to exit, and thus never close its output, it would be a non-terminating program, which in a pipeline is a Bad Thing. Even outside a pipeline, a program that never ends is usually incorrect (odd cases like yes(1) excepted).

You can keep reading from a pipe after the writer has terminated: data it wrote before exiting stays in the pipe buffer, so it is safe to read the captured stdout and stderr after wait() returns; you simply get EOF once the buffer is drained. (An IOError with errno.EPIPE is the writer-side counterpart: it shows up when writing to a pipe whose read end has been closed.)
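
For question 2 specifically, here is a small sketch (using echo as a stand-in child whose output is tiny, so waiting first is harmless) showing that captured output survives the child's exit:

    import subprocess

    # The child exits almost immediately, but whatever it wrote is held
    # in the pipe buffer until the parent reads it.
    child = subprocess.Popen(['echo', 'hello from the child'],
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    child.wait()                    # child is gone by now
    out, err = child.communicate()  # its output is still readable
    print "captured: " + out.strip()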

I've not tested your code; does it do something unexpected?
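
One caveat about your performRsync: calling wait() while stdout and stderr are pipes can deadlock if rsync writes more than a pipe buffer's worth of output, because nothing is draining the pipes while you wait. A sketch of the usual workaround (my rewording, not tested against your setup) is to let communicate() do the waiting and the reading in one step:

    import subprocess

    def performRsync(src, dest):
        print "Pushing " + src + " to " + dest
        child = subprocess.Popen(['rsync', '-av', src, dest],
                                 stdout=subprocess.PIPE,
                                 stderr=subprocess.PIPE)
        # communicate() reads both pipes to EOF and then waits, so the
        # child can never block on a full pipe buffer.
        out, err = child.communicate()
        if child.returncode == 0:
            return True
        print "ERR pushing " + src + ". " + err
        return False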

msw