Right now I've got a mod_wsgi script that's structured like this:

def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                    ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

I was wondering if anyone knows of a way to change this to yield its output instead of returning it. That way I could send the page as it's being generated rather than only once it's complete, so the page loads faster for the user.

However, whenever I swap the output for a list and yield it in the application(), it throws an error:

TypeError: sequence of string values expected, value of type list found
+5  A: 
def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                    ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    yield output

"However, whenever I swap the output for a list and yield it in the application(), it throws an error:"

Well, don't yield the list. Yield each element instead:

for part in mylist:
    yield part

or if the list is the entire content, just:

return mylist

Because a list is already an iterable, the server can loop over it by itself.
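To make the difference concrete, here is a minimal sketch that runs without a server. It assumes Python 3, where WSGI bodies are bytes rather than str; `parts`, `fake_start_response` and the two app names are hypothetical and exist only for illustration.

```python
# Hypothetical page fragments; under Python 3, WSGI bodies must be bytes.
parts = [b"<html>", b"<body>Hello World!</body>", b"</html>"]


def app_yield_each(environ, start_response):
    """Generator form: each fragment is handed to the server separately."""
    start_response("200 OK", [("Content-Type", "text/html")])
    for part in parts:
        yield part


def app_return_list(environ, start_response):
    """Plain form: the list is already iterable, so just return it."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return parts


def fake_start_response(status, headers, exc_info=None):
    """Stand-in for the server's callback; it ignores its arguments."""
    return None


# Both forms hand the server the same sequence of byte strings.
yielded = list(app_yield_each({}, fake_start_response))
returned = list(app_return_list({}, fake_start_response))
```

Yielding the whole list in one `yield parts` would instead hand the server a single item of type list, which is what the TypeError in the question complains about.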

nosklo
Hah, yeah, I'm new to yield. I see my mistake now. :P
Ian
A: 

Don't send the content length, and send the output as you derive it. You don't need to know the size of the output if you simply don't send the Content-Length header. That way you can send part of the response before you have computed the rest of it.

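def application(environ, start_response):
    status = '200 OK'

    # No Content-Length header: the total size isn't known up front.
    response_headers = [('Content-type', 'text/html')]
    start_response(status, response_headers)

    # head(), part1(), etc. stand for your own functions that each
    # produce one piece of the page.
    yield head()
    yield part1()
    yield part2()
    yield part3()
    yield "<!-- bye now! -->"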

Otherwise you get no benefit from sending in chunks, since computing the output is probably the slow part, and the network layer will split the output into packets anyway.

Sadly, this doesn't work when, for example, the calculation of part2() decides that you really need to change a header (like a cookie) or need to build other page-global data structures. If that ever happens, you need to compute the entire output before sending the headers, and might as well use return [output].

For example, http://aaron.oirt.rutgers.edu/myapp/docs/W1200_1200.config_template needs to build a page-global data structure for the links to subsections that show at the top of the page, so the last subsection must be rendered before the first chunk of output is delivered to the client.
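A rough sketch of that fallback, assuming Python 3 (bytes bodies). `render_sections`, the cookie value and `fake_start_response` are made up for illustration; the point is only that everything is rendered before `start_response` is called, so a header decided late can still be sent.

```python
def render_sections():
    """Hypothetical renderer: returns body fragments plus a cookie decided late."""
    fragments = [b"<h1>Report</h1>", b"<p>section 1</p>", b"<p>section 2</p>"]
    cookie = "last_section=2"  # only known after the final section is rendered
    return fragments, cookie


def application(environ, start_response):
    # Render everything first, because the headers depend on the result.
    fragments, cookie = render_sections()
    body = b"".join(fragments)
    headers = [
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
        ("Set-Cookie", cookie),
    ]
    start_response("200 OK", headers)
    # One joined byte string, wrapped in a list.
    return [body]


# Stand-in for the server's callback so the sketch runs on its own.
captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = dict(headers)


result = application({}, fake_start_response)
```

The trade-off is the one described above: the client sees nothing until the whole page has been computed.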

Aaron Watters
+1  A: 

Note that 'yield' should be avoided unless absolutely necessary. In particular, 'yield' will be inefficient when yielding lots of small strings. This is because the WSGI specification requires that the response be flushed after each string yielded. For Apache/mod_wsgi, flushing means each string is forced out through the Apache output bucket brigade and filter system and onto the socket. Even ignoring the overhead of the Apache output filter system, writing lots of small strings onto a socket is a bad idea to begin with.

This problem also exists where an array of strings is returned from an application, as a flush also has to be performed between each string in the array. This is because the array is dealt with as a generic iterable and not specifically as a list. Thus, for a preformed list of strings, it is much better to join the individual strings into one large string and return a list containing just that one string. Doing this also allows a WSGI implementation to automatically generate a Content-Length for the response if one wasn't explicitly provided.

Just make sure that, when joining all the strings in a list into one, the result is returned in a list. If this isn't done and the bare string is returned instead, that string is treated as an iterable in which each element is a single-character string. This results in a flush after every character, which is even worse than if the strings hadn't been joined.
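As an illustration only, the number of items the server iterates over (and so, per the above, the number of flushes) can be counted without a server. `pieces` and `chunks_sent` are hypothetical names; the byte strings assume Python 3, while the bare-string case uses a str to mirror the behaviour described above.

```python
pieces = [b"Hello", b" ", b"World", b"!"]


def chunks_sent(body_iterable):
    """Count the items a server would iterate over (one flush per item)."""
    return sum(1 for _ in body_iterable)


as_list_of_pieces = chunks_sent(pieces)             # 4 items
as_single_joined = chunks_sent([b"".join(pieces)])  # 1 item
as_bare_string = chunks_sent("Hello World!")        # 12 items, one per character
```

Under Python 3 a bare bytes object iterates as integers rather than one-character strings, so returning it unwrapped is not a valid WSGI body at all; wrapping the joined result in a list avoids both problems.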

Graham Dumpleton