For an application I'm developing, the user submits a gzipped HTTP POST request (content-encoding: GZIP) with multipart form data (content-type: multipart/form-data). I use mod_deflate as an input filter to decompress and the web request is processed in Django via mod_wsgi.

Generally, everything is fine. But for certain requests (always the same ones), there is almost a minute of lag between request and response. Investigation shows that the processing in Django finishes immediately, but the response from the server stalls. If the request is not gzipped, everything works.

Note that to deal with a glitch in mod_wsgi, I set content-length to the uncompressed message size.

Has anyone run into this problem? Is there an easy way to debug Apache as it processes responses?

+2  A: 

What glitch do you believe exists in mod_wsgi?

The simple fact of the matter is that WSGI 1.0 doesn't support mutating input filters which change the content length of the request content. Thus technically you can't use mod_deflate in Apache for request content when using WSGI 1.0. Your setting the content length to be a value other than the actual size is most likely going to stuff up operation of mod_deflate.
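To make the mismatch concrete, here is a minimal Python 3 sketch, not taken from mod_wsgi or Django. It assumes an application that reads exactly CONTENT_LENGTH bytes, as WSGI 1.0 expects, while an input filter has already decompressed the body. The payload and the helper name read_body_wsgi10 are made up for illustration.

```python
import gzip
import io

# Hypothetical request body; a real multipart upload would look different.
body = b"field=value&" * 200
compressed = gzip.compress(body)

environ = {
    # The client's header describes the compressed bytes it sent.
    "CONTENT_LENGTH": str(len(compressed)),
    # Stand-in for wsgi.input after an input filter has decompressed the body.
    "wsgi.input": io.BytesIO(gzip.decompress(compressed)),
}


def read_body_wsgi10(environ):
    """Read the body by trusting CONTENT_LENGTH, as a WSGI 1.0 app would."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return environ["wsgi.input"].read(length)


seen = read_body_wsgi10(environ)
# The application only sees as many bytes as the compressed body had.
print(len(body), len(compressed), len(seen))
```

The application ends up with a truncated prefix of the real body, which is why the length reported to it and the data it can actually read have to agree.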

If you want to be able to handle compressed request content you need to step outside of WSGI 1.0 specification and use non standard code.

I suggest you have a read of:

http://blog.dscpl.com.au/2009/10/details-on-wsgi-10-amendmentsclarificat.html

It explains the problem and the suggested ways of dealing with it.

I'd very much suggest you take this issue over to the official mod_wsgi mailing list to discuss how you need to write your code. If you are using one of the Python frameworks, though, you are probably going to be restricted in what you can do, as they implement WSGI 1.0, where you can't do this.


UPDATE 1

From discussion on the mod_wsgi list, the original WSGI application should be wrapped in the following WSGI middleware. This will only work on WSGI adapters that actually provide an empty string as the end sentinel for input, something which WSGI 1.0 doesn't require. It should probably only be used for small uploads, as everything is read into memory. If you need large compressed uploads, the accumulated data should be written out to a file instead.

import io


class Wrapper:

    def __init__(self, application):
        self.__application = application

    def __call__(self, environ, start_response):
        if environ.get('HTTP_CONTENT_ENCODING', '') == 'gzip':
            # The input filter has changed the body size, so CONTENT_LENGTH
            # cannot be trusted. Read until the empty-string end sentinel.
            buffer = io.BytesIO()
            input = environ['wsgi.input']
            blksize = 8192
            length = 0

            data = input.read(blksize)
            while data:
                buffer.write(data)
                length += len(data)
                data = input.read(blksize)

            # Rewind so the application reads from the start of the body.
            buffer.seek(0)

            environ['wsgi.input'] = buffer
            # CGI-style environ values must be strings, not integers.
            environ['CONTENT_LENGTH'] = str(length)

        return self.__application(environ, start_response)


application = Wrapper(original_wsgi_application_callable)
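
The note above about writing large uploads out to a file is not spelled out in the discussion. One possible way to do it, sketched here in Python 3 as an assumption rather than as the approach agreed on the list, is to use tempfile.SpooledTemporaryFile from the standard library, which keeps small bodies in memory and moves to a temporary file on disk once a size threshold is exceeded. The class name SpoolingWrapper and the max_in_memory parameter are hypothetical. Like the wrapper above, it relies on the adapter returning an empty string at end of input.

```python
import tempfile


class SpoolingWrapper:
    """Hypothetical variant of Wrapper that spools large bodies to disk."""

    def __init__(self, application, max_in_memory=1024 * 1024):
        self._application = application
        # Assumed threshold: bodies larger than this roll over to a file.
        self._max_in_memory = max_in_memory

    def __call__(self, environ, start_response):
        if environ.get("HTTP_CONTENT_ENCODING", "") == "gzip":
            spool = tempfile.SpooledTemporaryFile(max_size=self._max_in_memory)
            source = environ["wsgi.input"]
            length = 0

            # Read until the empty end sentinel, counting the real size.
            while True:
                data = source.read(8192)
                if not data:
                    break
                spool.write(data)
                length += len(data)

            # Rewind so the application reads from the start of the body.
            spool.seek(0)

            # The spool is released when it is garbage collected; a stricter
            # version would close it once the response has been sent.
            environ["wsgi.input"] = spool
            environ["CONTENT_LENGTH"] = str(length)

        return self._application(environ, start_response)
```

It is applied the same way, for example application = SpoolingWrapper(original_wsgi_application_callable).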
Graham Dumpleton
As for the glitch, we actually discussed this previously here: http://code.djangoproject.com/ticket/10819#comment:1 I took your comment to mean that I should just set content-length to the uncompressed message size. Doing that has worked fine until now... Either way, I asked on the mod_wsgi list. Thanks for your help.
UsAaR33
The discussion on mod_wsgi list about this is at 'http://groups.google.com/group/modwsgi/browse_frm/thread/54eba8ddff1a8eec'.
Graham Dumpleton