I have a Pylons controller action that needs to return a file to the client. (The file is outside the web root, so I can't just link directly to it.) The simplest way is, of course, this:

    with open(filepath, 'rb') as f:
        response.write(f.read())

That works, but it's obviously inefficient for large files. What's the best way to do this? I haven't been able to find any convenient methods in Pylons to stream the contents of the file. Do I really have to write the code to read a chunk at a time myself from scratch?

+5  A: 

The correct tool to use is shutil.copyfileobj, which copies from one file-like object to another a chunk at a time.

Example usage:

    import shutil

    # Open in binary mode so the bytes are sent unchanged.
    with open(filepath, 'rb') as f:
        shutil.copyfileobj(f, response)

The copy itself holds only one chunk in memory at a time, and you don't have to write the loop yourself.
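
To see the chunking in isolation, here is a minimal sketch that copies from an in-memory source into a recording sink. `RecordingSink` is a hypothetical stand-in for the response object, not part of Pylons, and the explicit `length=4096` is chosen only to make the chunks visible. It shows the copy loop only; whether memory stays low overall depends on the destination not buffering what it receives.

```python
import io
import shutil


class RecordingSink:
    """Hypothetical stand-in for a response object; records each write."""

    def __init__(self):
        self.chunk_sizes = []
        self.total = 0

    def write(self, data):
        # Keep only the size of each chunk, not the data itself.
        self.chunk_sizes.append(len(data))
        self.total += len(data)


source = io.BytesIO(b"x" * 10000)
sink = RecordingSink()

# copyfileobj reads at most `length` bytes at a time and writes each chunk.
shutil.copyfileobj(source, sink, length=4096)
```

The sink receives the 10000 bytes as several writes of at most 4096 bytes each rather than as one large string.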

The usual care with exceptions applies. If you handle signals (such as SIGCHLD), you have to handle EINTR, because the writes to response could be interrupted. IOError/OSError can also occur for various reasons during I/O.
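
As a rough sketch of the EINTR point, a write can be wrapped in a small retry loop. `write_all` and `max_retries` are hypothetical names, and the sketch assumes an interrupted write wrote nothing, which is not guaranteed for every file-like object. On Python 3.5 and later the interpreter already retries interrupted system calls itself (PEP 475), so this mainly matters on older versions.

```python
import errno


def write_all(dst, data, max_retries=5):
    """Write data to dst, retrying when the call is interrupted (EINTR).

    Assumption: an interrupted write left nothing written, so repeating
    the whole write is safe. Any other I/O error is re-raised unchanged.
    """
    attempts = 0
    while True:
        try:
            dst.write(data)
            return
        except (IOError, OSError) as exc:
            if exc.errno != errno.EINTR or attempts >= max_retries:
                raise
            attempts += 1
```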

Jerub
That's exactly what I was looking for - thanks!
Evgeny
Well, it SEEMED to work, but I tried it with a 2GB file recently and found that it still took a very long time to return anything and the memory usage of the process went to 2.5GB. So it appears that the Pylons response still buffers the whole file.
Evgeny
+1  A: 

The key here is that WSGI, and Pylons by extension, works with iterable responses. So you should be able to write something like this (warning, untested code below!):

    def file_streamer():
        with open(filepath, 'rb') as f:
            while True:
                block = f.read(4096)
                if not block:
                    break
                yield block
    response.app_iter = file_streamer()
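
For reference, here is the same idea as a plain WSGI application, independent of Pylons. This is only a sketch: `iter_file`, `stream_app`, and the `demo.filepath` environ key are hypothetical names used for illustration. Whether the blocks actually reach the client one at a time still depends on the server and any middleware not buffering the iterable.

```python
import os


def iter_file(path, block_size=4096):
    """Yield the file's contents in blocks of at most block_size bytes."""
    with open(path, 'rb') as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            yield block


def stream_app(environ, start_response):
    """Minimal WSGI app that returns a file as an iterable body."""
    # Hypothetical key: a real app would derive the path from the request.
    path = environ['demo.filepath']
    headers = [('Content-Type', 'application/octet-stream'),
               ('Content-Length', str(os.path.getsize(path)))]
    start_response('200 OK', headers)
    # Returning a generator lets the server pull one block at a time.
    return iter_file(path)
```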

Also, paste.fileapp.FileApp is designed to be able to return file data for you, so you can also try:

    return FileApp(filepath)

in your controller method.

Chris AtLee
Sorry, this doesn't help. The `file_streamer` method returns the data, but it all still gets buffered. When I try to return `FileApp(filepath)` I get "TypeError: 'FileApp' object is not iterable"
Evgeny
Ah, looks like it just needs a little bit more code than that, but essentially `FileApp` does what I want. I'll post the complete answer separately. Thank you! +1
Evgeny
`return forward(FileApp(filepath))`
Marius Gedminas
A: 

I finally got it to work using the FileApp class, thanks to Chris AtLee and THC4k (from this answer). This method also allowed me to set the Content-Length header, something Pylons has a lot of trouble with, which enables the browser to show an estimate of the time remaining.

Here's the complete code:

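    import os

    from paste.fileapp import FileApp
    from pylons import request

    def _send_file_response(self, filepath):
        # Name shown to the user: last directory and file name, joined by '_'.
        user_filename = '_'.join(filepath.split('/')[-2:])
        file_size = os.path.getsize(filepath)

        headers = [('Content-Disposition', 'attachment; filename="' + user_filename + '"'),
                   ('Content-Type', 'text/plain'),
                   ('Content-Length', str(file_size))]

        # FileApp is a WSGI application, so call it directly with the
        # request environ and the controller's start_response.
        fapp = FileApp(filepath, headers=headers)
        return fapp(request.environ, self.start_response)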
Evgeny