In the Python program I'm writing, I've got a thread which iterates over a large structure in memory and writes it incrementally into a file-like object. I've got another thread which takes a file-like object and writes it to disk. Is there an easy way to connect the two, such that any data written by the first thread is buffered for the second?

Specifically, I'm trying to pass data to subprocess.Popen(). The process will read from stdin, but you cannot pass a "file-like" object to Popen because it calls stdin.fileno() and blows up unless you have a real file.

Instead, you need to pass the PIPE argument to Popen, which allows you to use proc.stdin as a file-like object. But if you've already got a file-like object, there doesn't seem to be a great way to yoke the two of them together.
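To make the problem concrete, here is a minimal sketch of both behaviours. It assumes Python 3 and uses the current interpreter as a stand-in child process; the helper names feed_via_pipe and try_in_memory_stdin are made up for illustration.

```python
import io
import subprocess
import sys

# Stand-in child: upper-cases whatever arrives on stdin (illustrative only).
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]


def feed_via_pipe(data):
    """Send bytes to the child through stdin=PIPE and return its output."""
    proc = subprocess.Popen(CHILD, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate(data)  # writes data, closes stdin, waits
    return out


def try_in_memory_stdin():
    """Hand Popen an in-memory stream; return the exception it raises."""
    try:
        subprocess.Popen(CHILD, stdin=io.BytesIO(b"hello"))
    except io.UnsupportedOperation as exc:
        # BytesIO has no OS-level descriptor, so fileno() fails.
        return exc
    return None
```

The in-memory stream is rejected before any process is started, which is why PIPE (or a real file descriptor) is needed.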

+3  A: 

You should use the Queue module (named queue in Python 3) for sharing sequential data across threads. You would have to make a file-like Queue subclass, where .read and .write mutually block each other, with a buffer in between.
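A rough sketch of that idea, assuming Python 3 and a single producer and single consumer. QueueFile is a hypothetical name, and it wraps a queue rather than subclassing it to keep the example short. write() blocks when the queue is full; read() blocks until enough data or end-of-stream arrives.

```python
import queue
import threading


class QueueFile:
    """Hypothetical file-like buffer backed by a bounded queue."""

    _EOF = object()  # sentinel put on the queue by close()

    def __init__(self, maxsize=16):
        self._chunks = queue.Queue(maxsize)
        self._leftover = b""
        self._done = False

    def write(self, data):
        if data:
            self._chunks.put(bytes(data))  # blocks while the queue is full
        return len(data)

    def close(self):
        self._chunks.put(self._EOF)  # tells the reader no more data follows

    def read(self, size=-1):
        parts = [self._leftover]
        have = len(self._leftover)
        self._leftover = b""
        # Collect chunks until the request is satisfied or the writer closes.
        while not self._done and (size < 0 or have < size):
            chunk = self._chunks.get()  # blocks while the queue is empty
            if chunk is self._EOF:
                self._done = True
                break
            parts.append(chunk)
            have += len(chunk)
        data = b"".join(parts)
        if size >= 0:
            data, self._leftover = data[:size], data[size:]
        return data
```

An object like this still has no fileno(), so it cannot be handed to Popen directly; something has to drain it into proc.stdin.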

OTOH, I wonder why the first thread can't write to the real file in the first place.

Martin v. Löwis
I did use the Queue module, and wrote a class to do this. And after doing it, I thought it seemed like a pretty common problem, and that there was probably a library to do it already. I can't write to the file directly because I'm using the subprocess module; that reads stdin from a file object.
Chris B.
A: 

I'm not clear what you're trying to do here. This sounds like a job for a regular old pipe, which is a file-like object. I'm guessing, however, that you mean you've got a stream of some other sort.

It also sounds a lot like what you want is a Python Queue, or maybe a tempfile.
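For the plain-pipe route, a minimal sketch assuming Python 3: os.pipe() gives two real descriptors, so the read end can be passed as stdin while a thread writes into the other end. run_with_pipe and produce are illustrative names, not part of any library.

```python
import os
import subprocess
import sys
import threading


def run_with_pipe(produce, argv):
    """Run argv with stdin fed by produce(writer) from a helper thread.

    Returns the child's stdout as bytes.
    """
    read_fd, write_fd = os.pipe()
    reader = os.fdopen(read_fd, "rb")   # real file object: fileno() works
    writer = os.fdopen(write_fd, "wb")

    def feed():
        # Closing the write end is what signals end-of-file to the child.
        with writer:
            produce(writer)

    thread = threading.Thread(target=feed)
    thread.start()
    proc = subprocess.Popen(argv, stdin=reader, stdout=subprocess.PIPE)
    reader.close()  # the parent no longer needs its copy of the read end
    out, _ = proc.communicate()
    thread.join()
    return out
```

The OS pipe buffer does the buffering here: the writer blocks when it is full and resumes once the child has read some data.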

Charlie Martin
+1  A: 

I think something is wrong with the design if you already have a file-like object but want the data to end up in the subprocess. You should arrange for the data to be written to the subprocess in the first place, rather than into some other file-like object first. Whoever is writing the data should allow the flexibility to specify the output stream, and that stream should be the subprocess pipe.

Alternatively, if the writer insists on creating its own stream object, you should let it finish writing, and only then start the subprocess, feeding it the result of the first write. E.g. if it is a StringIO object, take its value after writing, and write it into the pipe; no need for thread synchronization here.
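The write-first, feed-afterwards variant could look roughly like this in Python 3. write_report is a made-up stand-in for the writer, io.BytesIO plays the role of StringIO, and the child command is only illustrative.

```python
import io
import subprocess
import sys


def write_report(out):
    """Hypothetical writer that only knows how to write to a file-like object."""
    for i in range(3):
        out.write(b"row %d\n" % i)


# Step 1: let the writer finish into an in-memory buffer.
buffer = io.BytesIO()
write_report(buffer)

# Step 2: only now start the child and hand it the complete payload.
proc = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; print(len(sys.stdin.read().splitlines()))"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)
# communicate() writes the bytes, closes stdin and waits for the child.
line_count, _ = proc.communicate(buffer.getvalue())
```

The trade-off is that the whole payload sits in memory before the child sees any of it, which may matter for a large structure.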

Martin v. Löwis
+2  A: 

Use shutil's copyfileobj() function:

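import shutil
import subprocess

proc = subprocess.Popen([...], stdin=subprocess.PIPE)

# get_filelike_object is a placeholder for wherever the data comes from
my_input = get_filelike_object('from a place not given in the question')

# copies in chunks until my_input.read() returns an empty result
shutil.copyfileobj(my_input, proc.stdin)

# close stdin so the child sees end-of-file, then wait for it to exit
proc.stdin.close()
proc.wait()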

No need to use threads.

nosklo