ansaurus

Question

How to improve performance of a Python CGI script that reads a big file and returns it as a download?

Answer 1

+1 A:

Try reading and writing the file in chunks of, say, 16 KB at a time instead of loading it all at once. Python may be doing something slow behind the scenes with a single large read, and buffering manually may be faster.

You shouldn't need something like a ramdisk: the OS disk cache ought to keep the file contents in memory for you.
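
The chunked approach described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual script: the name `stream_file` and the `CHUNK_SIZE` constant are made up for this example, and the code is written for Python 3, where binary output goes to `sys.stdout.buffer`.

```python
import io

CHUNK_SIZE = 16 * 1024  # 16 KB per read, the size suggested in this answer


def stream_file(src, dst, chunk_size=CHUNK_SIZE):
    """Copy the binary file object src to dst in fixed-size chunks.

    Returns the total number of bytes written. Only one chunk is held
    in memory at a time, regardless of the file size.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of file
            break
        dst.write(chunk)
        total += len(chunk)
    return total
```

In a CGI script you would write the headers first and then call something like `stream_file(open(path, 'rb'), sys.stdout.buffer)`. The standard library's `shutil.copyfileobj` does essentially the same loop.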

Andrew Medico 2009-09-22 20:24:06

Answer 2

+1 A:

mod_wsgi or FastCGI would help in the sense that you don't need to reload the Python interpreter every time your script is run. However, they'd do little to improve the performance of reading the file (if that's what's really your bottleneck). I'd advise you to use something along the lines of memcached instead.

oggy 2009-09-22 20:24:12

Answer 3

+1 A:

Why are you printing it all in one print statement? Python has to generate several temporary strings to build the headers, and because of that last %s it has to hold the entire contents of the file in two different strings at once. This should be better:

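import os, sys
# sys.stdout.write adds no trailing newline, so the body length matches Content-Length exactly.
sys.stdout.write("Content-Type: application/x-download\nContent-Disposition: attachment; filename=%s\nContent-Length: %s\n\n" % (os.path.basename(FILENAME), len(buff)))
sys.stdout.write(buff)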

You might also consider reading the file through the io module's raw (unbuffered) interface so that Python doesn't create temporary buffers you aren't using.
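
One possible reading of the raw IO suggestion, sketched for Python 3: open the file with `buffering=0` so that `open` returns an unbuffered `io.FileIO`, and reuse a single preallocated buffer via `readinto`. The function name `copy_unbuffered` and the 64 KB default are assumptions made for this example, not something from the original question.

```python
def copy_unbuffered(path, dst, chunk_size=64 * 1024):
    """Copy the file at path to the binary stream dst without Python-level buffering.

    One bytearray is allocated up front and refilled on every read, so no
    new bytes object is created per chunk. Returns the bytes written.
    """
    buf = bytearray(chunk_size)
    view = memoryview(buf)  # lets us slice the buffer without copying
    total = 0
    # buffering=0 is only valid in binary mode and yields a raw io.FileIO
    with open(path, 'rb', buffering=0) as raw:
        while True:
            n = raw.readinto(buf)  # number of bytes read; 0 at end of file
            if not n:
                break
            dst.write(view[:n])
            total += n
    return total
```

Whether this is measurably faster than a plain buffered read depends on the platform and file size, so it is worth timing both before committing to it.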

jmucchiello 2009-09-22 20:25:07

Answer 4

+7 A:

Use mod_wsgi and use something akin to:

def application(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type', 'text/plain')]
    start_response(status, response_headers)

    # Hand the open file to the server; mod_wsgi can then send it via sendfile/mmap.
    filelike = open('/usr/share/dict/words', 'rb')
    return environ['wsgi.file_wrapper'](filelike)

In other words, use the wsgi.file_wrapper extension of the WSGI standard to let Apache/mod_wsgi send the file contents in an optimised way using sendfile/mmap. Your application then never needs to read the file into memory.
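
Since wsgi.file_wrapper is an optional extension, a slightly more defensive variant might fall back to chunked iteration when the server does not provide it. The sketch below assumes Python 3; the factory name `make_download_app`, the `BLOCK_SIZE` constant, and the download headers are illustrative choices for this example rather than part of the answer above.

```python
import os

BLOCK_SIZE = 16 * 1024  # chunk size used for the wrapper and the fallback


def make_download_app(path, block_size=BLOCK_SIZE):
    """Build a WSGI application that serves the file at path as a download."""

    def application(environ, start_response):
        filelike = open(path, 'rb')
        size = os.fstat(filelike.fileno()).st_size
        start_response('200 OK', [
            ('Content-Type', 'application/x-download'),
            ('Content-Disposition',
             'attachment; filename="%s"' % os.path.basename(path)),
            ('Content-Length', str(size)),
        ])
        wrapper = environ.get('wsgi.file_wrapper')
        if wrapper is not None:
            # The server may bypass Python entirely (e.g. sendfile).
            return wrapper(filelike, block_size)
        # Fallback: yield chunks until read() returns empty bytes.
        # Note that this simple iterator does not close the file itself.
        return iter(lambda: filelike.read(block_size), b'')

    return application
```

Under mod_wsgi you would expose the result as the module-level name `application`, for example `application = make_download_app('/usr/share/dict/words')`.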

Graham Dumpleton 2009-09-22 23:53:29
