views:

1004

answers:

2

Hello, I have a general question about popen (and all related functions), applicable to all operating systems. When I write a Python script or some C code and run the resulting executable from the console (Windows or Linux), I can immediately see the output from the process. However, if I run the same executable as a forked process with its stdout redirected into a pipe, the output is buffered somewhere, usually up to 4096 bytes, before it is written to the pipe where the parent process can read it.

The following Python script will generate output in chunks of 1024 bytes:

import time

if __name__ == "__main__":
    dye = '@' * 1024
    for i in range(8):
        print dye
        time.sleep(1)

The following Python script will execute the previous script and read the output as soon as it arrives on the pipe, byte by byte:

import sys, subprocess

if __name__ == "__main__":
    execArgs = ["c:\\python25\\python.exe", "C:\\Scripts\\PythonScratch\\byte_stream.py"]

    p = subprocess.Popen(execArgs, bufsize=0, stdout=subprocess.PIPE)
    while p.returncode is None:
        data = p.stdout.read(1)
        sys.stdout.write(data)
        p.poll()

Adjust the paths for your operating system. When run in this configuration, the output does not appear in chunks of 1024 bytes but in chunks of 4096, despite the buffer size of the Popen call being set to 0 (which is the default anyway). Can anyone tell me how to change this behaviour? Is there any way I can force the operating system to treat the output from the forked process in the same way as when it is run from the console, i.e. just feed the data through without buffering?

+1  A: 

That's correct, and it applies to both Windows and Linux (and possibly other systems), with popen() and fopen(). If you want the output buffer to be flushed before 4096 bytes accumulate, call fflush() (in C) or sys.stdout.flush() (in Python).
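For instance, if the writer is under your control, a minimal sketch of flushing after every chunk (the function name and parameters are hypothetical, modelled on the writer script from the question):

```python
import sys
import time

def emit(chunks=8, size=1024, delay=1.0, out=sys.stdout):
    # Write each chunk and flush immediately, so the data reaches the
    # pipe right away instead of waiting for a 4096-byte buffer to fill.
    for _ in range(chunks):
        out.write('@' * size + '\n')
        out.flush()
        time.sleep(delay)

if __name__ == "__main__":
    emit()
```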

Havenard
Yes, this is what I'm doing at the moment, but in my situation the process generating the output is user-defined; I should've mentioned that in the initial question.
Gearoid Murphy
+2  A: 

In general, the standard C runtime library (that's running on behalf of just about every program on every system, more or less;-) detects whether stdout is a terminal or not; if not, it buffers the output (which can be a huge efficiency win, compared to unbuffered output).

If you're in control of the program that's doing the writing, you can (as another answer suggested) flush stdout continuously, or (more elegantly if feasible) try to force stdout to be unbuffered, e.g. by running Python with the -u commandline flag:

-u     : unbuffered binary stdout and stderr (also PYTHONUNBUFFERED=x)
         see man page for details on internal buffering relating to '-u'

(what the man page adds is a mention of stdin and issues with binary mode[s]).
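Assuming the writer is itself a Python script you launch yourself, a sketch of passing -u on the child's command line (the inline -c writer here is just a stand-in for the byte_stream.py script from the question):

```python
import subprocess
import sys

# Start the child with -u so its stdout is unbuffered even though it
# is connected to a pipe; a single byte then arrives immediately
# instead of sitting in the child's block buffer.
args = [sys.executable, "-u", "-c", "import sys; sys.stdout.write('x')"]
p = subprocess.Popen(args, stdout=subprocess.PIPE)
data = p.stdout.read(1)
p.wait()
```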

If you can't or don't want to touch the program that's writing, -u or the like on the program that's just reading is unlikely to help (the buffering that matters most is the one happening on the writer's stdout, not the one on the reader's stdin). The alternative is to trick the writer into believing that it's writing to a terminal (even though in fact it's writing to another program!), via the pty standard library module or the higher-level third party pexpect module (or, for Windows, its port wexpect).
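A POSIX-only sketch of the pty approach with the standard library module (the inline child command is hypothetical): the parent hands the child a pseudo-terminal as stdout, so the C runtime in the child believes it is writing to a terminal and line-buffers instead of block-buffering.

```python
import os
import pty
import subprocess
import sys

# Create a pseudo-terminal pair and give the slave end to the child
# as its stdout; to the child this is indistinguishable from a real
# terminal, so its output is line-buffered.
master, slave = pty.openpty()
p = subprocess.Popen(
    [sys.executable, "-c", "print('hello from a pty')"],
    stdout=slave)
os.close(slave)                # parent keeps only the master end
line = os.read(master, 1024)   # the line arrives promptly
p.wait()
os.close(master)
```

Note that a pty translates "\n" to "\r\n" on output, so the reader sees terminal-style line endings.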

Alex Martelli
I tried playing around with -u, no joy, but pexpect seems promising, thanks!
Gearoid Murphy
Just to follow up: pexpect works like a charm; wexpect is a little buggy (and hard to find) but gets the job done. This is where I found the most recent version of wexpect: http://sage.math.washington.edu/home/goreckc/sage/wexpect/
Gearoid Murphy
Thanks Gearoid, peculiar indeed that wexpect has not been updated to the latest version on its code.google.com home, I wonder why!
Alex Martelli
Alternatively, one could use the real-time console app described here: http://www.codeproject.com/KB/threads/RTconsole.aspx
Gearoid Murphy