I have read tons of posts but still can't seem to figure it out.

I want to run rsync.exe on Windows with subprocess.Popen() and print its stdout from Python.

My code works, but it doesn't show the progress until a file has finished transferring! I want to print the progress for each file in real time.

I'm using Python 3.1 now since I heard it should be better at handling I/O.

import subprocess, time, os, sys

cmd = "rsync.exe -vaz -P source/ dest/"
p, line = True, 'start'


p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=64,
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
A: 

Change the stdout from the rsync process to be unbuffered.

p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=0,  # 0=unbuffered, 1=line-buffered, else buffer-size
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)
Will
Buffering happens on the rsync side; changing the bufsize argument on the Python side won't help.
nosklo
+2  A: 

Some rules of thumb for subprocess.

  • Never use shell=True. It needlessly invokes an extra shell process to call your program.
  • When calling processes, arguments are passed around as lists. sys.argv in Python is a list, and so is argv in C. So you pass a list to Popen to call subprocesses, not a string.
  • Don't redirect stderr to a PIPE when you're not reading it.
  • Don't redirect stdin when you're not writing to it.
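
If the command already exists as a single string, as in the question, the standard library's shlex.split can turn it into such a list. A minimal sketch; note that shlex follows POSIX quoting rules, so Windows paths containing backslashes would need extra care:

```python
import shlex

# The command string from the question, split into an argv-style list.
cmd_string = "rsync.exe -vaz -P source/ dest/"
args = shlex.split(cmd_string)
print(args)
```

The resulting list can be passed to subprocess.Popen directly, without shell=True.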

Example:

import subprocess, time, os, sys
cmd = ["rsync.exe", "-vaz", "-P", "source/", "dest/"]

p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)

for line in p.stdout:
    # p.stdout yields bytes in Python 3; decode before joining with a str
    print(">>> " + line.decode(errors="replace").rstrip())

That said, it is very probable that rsync buffers its output when it detects that it is connected to a pipe instead of a terminal. That is in fact the default: when connected to a pipe, a program must explicitly flush stdout to get real-time results; otherwise the standard C library buffers the output.

To test for that, try running this instead:

cmd = [sys.executable, 'test_out.py']

and create a test_out.py file with the contents:

import sys
import time
print("Hello")
sys.stdout.flush()  # push "Hello" through the pipe before sleeping
time.sleep(10)
print("World")

Running that subprocess should give you "Hello", then a 10-second wait, then "World". If that happens with the Python code above but not with rsync, rsync itself is buffering its output, so you are out of luck.
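
The check can also be scripted so that the arrival time of each line is visible. The sketch below is only an illustration: time_child_output and CHILD_TEMPLATE are hypothetical names, the child script is written to a temporary directory, and the sleep is shortened to one second. It needs Python 3.3 or later for time.monotonic.

```python
import os
import subprocess
import sys
import tempfile
import time

# Source of the child script; the flush line is swapped out to compare behaviours.
CHILD_TEMPLATE = """\
import sys, time
print("Hello")
{flush}
time.sleep({delay})
print("World")
"""

def time_child_output(flush, delay=1.0):
    """Run the child and return a list of (seconds_since_start, text) per line."""
    source = CHILD_TEMPLATE.format(
        flush="sys.stdout.flush()" if flush else "pass", delay=delay)
    arrivals = []
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "test_out.py")
        with open(script, "w") as f:
            f.write(source)
        start = time.monotonic()
        with subprocess.Popen([sys.executable, script],
                              stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT) as p:
            for line in p.stdout:
                # Record when each line actually reached the parent.
                elapsed = time.monotonic() - start
                arrivals.append((elapsed, line.decode().rstrip()))
    return arrivals
```

With flush=True the two timestamps should be roughly delay seconds apart. With flush=False they usually arrive together when the child exits, which is the buffering behaviour described above.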

A solution would be to connect directly to a pty, using something like pexpect.
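
Separately from buffering, it may be worth checking how the progress updates are delimited. This is an assumption, not something established in this thread: rsync's -P progress updates are typically ended with a carriage return (\r) so that the terminal line gets overwritten, while `for line in p.stdout` splits only on \n. A minimal sketch that reads the pipe one byte at a time and splits on either character (iter_updates and run are hypothetical helper names):

```python
import re
import subprocess

def iter_updates(stream, chunk_size=1):
    # Yield decoded pieces of a binary stream, split on CR or LF.
    # chunk_size=1 returns as soon as any byte is available on a pipe.
    pending = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last separator is complete; keep the rest.
        *complete, pending = re.split(b"[\r\n]", pending)
        for piece in complete:
            if piece:
                yield piece.decode(errors="replace")
    if pending:
        yield pending.decode(errors="replace")

def run(cmd):
    # cmd is an argument list, e.g. ["rsync.exe", "-vaz", "-P", "source/", "dest/"]
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    for update in iter_updates(p.stdout):
        print(">>> " + update)
    return p.wait()
```

This only helps if rsync actually writes the updates to the pipe as they happen; if rsync buffers its output, the updates still arrive late.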

nosklo
`shell=False` is the right thing when you construct the command line yourself, especially from user-entered data. Nevertheless, `shell=True` is useful too when you get the whole command line from a trusted source (e.g. hardcoded in the script).
Denis Otkidach
nosklo
nosklo, that should be: p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
Senthil Kumaran