I have two Python scripts which I am running on Windows with IronPython 2.6 on .NET 2.0. One outputs binary data and the other processes the data. I was hoping to be able to stream data from the first to the second using pipes. The problem I encountered here is that, when run from the Windows command-line, sys.stdout uses CP437 character encoding and text mode instead of binary mode ('w' instead of 'wb'). This causes some bytes greater than 127 to be written as the wrong character (i.e., different byte values produce the same character in the output and are thus indistinguishable by the script reading them).
For example, this script prints the same character (an underscore) twice:
import sys
sys.stdout.write(chr(95))
sys.stdout.write(chr(222))
So when I try to read the data I get something different than what I originally wrote.
I wrote this script to check if the problem was writing in 'w' mode or the encoding:
import sys
str = chr(222)
# try writing chr(222) in ASCII in both write modes
# ASCII is the default encoding
open('ascii_w', 'w').write(str)
open('ascii_wb', 'wb').write(str)
# set encoding to CP437 and try writing chr(222) in both modes
reload(sys)
sys.setdefaultencoding("cp437")
open('cp437_w', 'w').write(str)
open('cp437_wb', 'wb').write(str)
After running that, the file cp437_w contains character 95 and the other three each contain character 222. Therefore, I believe that the problem is caused by the combination of CP437 encoding and writing in 'w' mode. In this case it would be solved if I could force stdout to use binary mode (I'm assuming that getting it to use ASCII encoding is impossible given that cmd.exe uses CP437). This is where I'm stuck; I can't find any way to do this.
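To illustrate why the encoding step can be lossy, here is a minimal sketch written for CPython 3 (an assumption; IronPython 2.6 may behave differently in detail). It checks which code points have no CP437 representation. The helper name unencodable_in is hypothetical. CPython raises an error for such characters, whereas a runtime that substitutes a fallback character instead (apparently an underscore in my case) would make them indistinguishable in the output.

```python
# Sketch for CPython 3, not IronPython: find code points that CP437 cannot represent.
def unencodable_in(codec, code_points):
    """Return the code points whose character cannot be encoded with `codec`."""
    missing = []
    for cp in code_points:
        try:
            chr(cp).encode(codec)
        except UnicodeEncodeError:
            # No CP437 byte exists for this character.
            missing.append(cp)
    return missing

# Code points 128..255 interpreted as characters (Latin-1 range).
missing = unencodable_in("cp437", range(128, 256))
print(222 in missing)  # True: chr(222) has no CP437 equivalent
```

Any character in that list has to be replaced by something else before it can be written through a CP437 text stream, which would explain why the reader cannot recover the original byte.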
Some potential solutions I found that didn't work:
- running ipy -u doesn't seem to have any effect (I also tested to see if it would cause Unix-style newlines to be printed; it doesn't, so I suspect that -u doesn't work with IronPython at all)
- I can't use this solution because msvcrt is not supported in IronPython
- with Python 3.x you can access the underlying binary stdout through sys.stdout.buffer; this isn't available in 2.6
- os.fdopen(sys.stdout.fileno(), 'wb', 0) just returns stdout in 'w' mode
So yeah, any ideas? Also, if there's a better way of streaming binary data that doesn't use stdout, I'm certainly open to suggestions.
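One fallback I can think of, sketched here under assumptions, is to keep stdout in text mode but send only ASCII through it by Base64-encoding the data on the writing side and decoding it on the reading side. The sketch below is written for CPython 3 (byte handling in IronPython 2.6 differs), and the function names encode_for_text_pipe and decode_from_text_pipe are hypothetical. It costs roughly a third more data on the pipe, so it is a workaround rather than a real binary stream.

```python
import base64
import sys

def encode_for_text_pipe(data):
    """Turn arbitrary bytes into ASCII-only text that survives a text-mode pipe."""
    return base64.b64encode(data).decode("ascii")

def decode_from_text_pipe(text):
    """Recover the original bytes on the reading side."""
    return base64.b64decode(text.strip())

# Writer side: bytes that a CP437 text stream could otherwise mangle.
payload = bytes([95, 222, 0, 10, 13, 255])
line = encode_for_text_pipe(payload)
sys.stdout.write(line + "\n")

# Reader side: would normally receive `line` from sys.stdin.
restored = decode_from_text_pipe(line + "\r\n")
```

Because the Base64 alphabet is plain ASCII, neither the console code page nor newline translation should alter the payload in transit.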