Is there any way to write binary output to sys.stdout in Python 2.x? In Python 3.x, you can just use sys.stdout.buffer (or detach stdout, etc...), but I haven't been able to find any solutions for Python 2.5/2.6.

EDIT, Solution: From ChristopheD's link, below:

import sys

if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

EDIT: I'm trying to push a PDF file (in binary form) to stdout so that a web server can serve it. When I write the file using sys.stdout.write, extra carriage returns are added to the binary stream, which corrupts the PDF.

EDIT 2: For this project, I need to run on a Windows Server, unfortunately, so Linux solutions are out.

Simple dummy example (reading from a file on disk instead of generating on the fly, just so we know that the generation code isn't the issue):

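import sys

f = open('C:\\test.pdf', 'rb')  # 'rb' reads the file without newline translation
pdfFile = f.read()
f.close()
sys.stdout.write(pdfFile)  # stdout is still in text mode here, so the output gets corrupted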
+1  A: 

In Python 2.x, str objects are plain byte strings, so I believe you should be able to just

>>> sys.stdout.write(data)

EDIT: I've confirmed your experience.

I created one file, gen_bytes.py

import sys
for char in range(256):
    sys.stdout.write(chr(char))

And another read_bytes.py

import subprocess
import sys

proc = subprocess.Popen([sys.executable, 'gen_bytes.py'], stdout=subprocess.PIPE)
# communicate() reads stdout to EOF and then waits, so a full pipe cannot deadlock
bytes, _ = proc.communicate()
if not len(bytes) == 256:
    print 'Received incorrect number of bytes: {0}'.format(len(bytes))
    raise SystemExit(1)
if not map(ord, bytes) == range(256):
    print 'Received incorrect bytes: {0}'.format(map(ord, bytes))
    raise SystemExit(2)
print "Everything checks out"

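Put them in the same directory and run read_bytes.py. Sure enough, it appears that newlines are in fact being converted on output: 257 bytes instead of 256 is what you would get if the single newline byte (0x0A) were written as a carriage return plus newline (0x0D 0x0A). I suspect this only happens on Windows.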

> .\read_bytes.py
Received incorrect number of bytes: 257
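
To illustrate where the extra byte would come from, here is a small sketch that mimics the newline translation a text-mode stream performs on Windows. The function name simulate_text_mode_write is made up for this illustration; it only approximates the effect on the 256 bytes that gen_bytes.py writes, and it needs Python 2.6 or later for the b"" literals and bytearray.

```python
# Hypothetical helper (not a real API): approximates what a text-mode
# stream on Windows does to data on its way out.
def simulate_text_mode_write(data):
    # Every newline byte (0x0A) is expanded to carriage return + newline.
    return data.replace(b"\n", b"\r\n")

# The same 256 byte values that gen_bytes.py emits, 0x00 through 0xFF.
payload = bytes(bytearray(range(256)))
written = simulate_text_mode_write(payload)

print("%d -> %d" % (len(payload), len(written)))
```

Only one of the 256 values is a newline, so the translated stream is exactly one byte longer, which matches the count reported above.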

Following the lead by ChristopheD, and changing gen_bytes to the following corrects the issue.

import sys

if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

for char in range(256):
    sys.stdout.write(chr(char))

I include this for completeness. ChristopheD deserves the credit.

Jason R. Coombs
This works if you're only trying to add string data, but python tries to stringify binary data when just calling write, corrupting the data.
Eavesdown
I ran your `gen_bytes.py` and `read_bytes.py` on Mac OS X (Python 2.5, with minor modifications for the missing "format" method) and it printed "Everything checks out".
Doug Harris
It looks like it's a Windows-only issue.
Eavesdown
On Windows, I found that after just running `gen_bytes.py > bytes.bin` I could see that the file was 257 bytes simply by doing a `dir`.
gnibbler
Unless you're using powershell, in which case `gen_bytes.py > bytes.bin` generates a unicode-encoded file of 522 bytes.
Jason R. Coombs
+3  A: 

Which platform are you on?

You could try this recipe if you're on Windows (the link suggests it's Windows specific anyway).

There are some references on the web suggesting that there would/should be a function in Python 3.1 to reopen sys.stdout in binary mode, but I don't really know if there's a better alternative than the above for Python 2.x.
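
As a rough sketch of how the two approaches mentioned in this thread could be combined, the helper below returns a stream that accepts raw bytes on either major version. The name get_binary_stdout is made up for illustration; the only real APIs it relies on are sys.stdout.buffer (Python 3) and msvcrt.setmode with os.O_BINARY (Windows). It assumes sys.stdout has not been replaced by an object without a real file descriptor.

```python
import sys

def get_binary_stdout():
    """Return a file-like object for stdout that takes raw bytes.

    Hypothetical helper, not part of the standard library.
    """
    # Python 3: the underlying byte stream is exposed as sys.stdout.buffer.
    buf = getattr(sys.stdout, "buffer", None)
    if buf is not None:
        return buf
    # Python 2 on Windows: switch the existing descriptor to binary mode
    # so that "\n" is no longer expanded to "\r\n" on output.
    if sys.platform == "win32":
        import os
        import msvcrt
        msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    # Python 2 elsewhere: stdout already passes bytes through unchanged.
    return sys.stdout
```

With this in place, the PDF example from the question would call get_binary_stdout().write(pdfFile) instead of sys.stdout.write(pdfFile).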

ChristopheD
I did a test that just reads the PDF in from a file and writes it straight back out; the carriage returns are still added.
Eavesdown
The Windows solution link you gave is the perfect solution. I can't thank you enough; this was driving me absolutely up the wall.
Eavesdown