I am writing a CGI page in Python. Say a client sends a request to my CGI page. The page does its calculation and, as soon as it has the first output, sends that output back to the client, but it CONTINUES the calculation and sends further responses AFTER the first response is sent.

Is this possible? I ask because, to my limited knowledge, a CGI page sends its response on a one-time basis: once the response is sent, the CGI page stops running. Is this done on the server side or the client side, and how do I implement it?

My server is running Apache. Thank you very much.

I have tried client code from "dbr" on this forum (thanks to him I got the idea of how long-polling works).

<html>
<head>
    <title>BargePoller</title>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.2.6/jquery.min.js" type="text/javascript" charset="utf-8"></script>

    <style type="text/css" media="screen">
      body{ background:#000;color:#fff;font-size:.9em; }
      .msg{ background:#aaa;padding:.2em; border-bottom:1px #000 solid}
      .old{ background-color:#246499;}
      .new{ background-color:#3B9957;}
    .error{ background-color:#992E36;}
    </style>

    <script type="text/javascript" charset="utf-8">
    function addmsg(type, msg){
        /* Simple helper to add a div.
        type is the name of a CSS class (old/new/error).
        msg is the contents of the div */
        $("#messages").append(
            "<div class='msg "+ type +"'>"+ msg +"</div>"
        );
    }

    function waitForMsg(){
        /* This requests the url "msgsrv.php".
        When it completes (or errors), it requests it again */
        $.ajax({
            type: "GET",
            url: "msgsrv.php",

            async: true, /* If set to non-async, browser shows page as "Loading.."*/
            cache: false,
            timeout:50000, /* Timeout in ms */

            success: function(data){ /* called when request to msgsrv.php completes */
                addmsg("new", data); /* Add response to a .msg div (with the "new" class)*/
                setTimeout(
                    'waitForMsg()', /* Request next message */
                    1000 /* ..after 1 seconds */
                );
            },
            error: function(XMLHttpRequest, textStatus, errorThrown){
                addmsg("error", textStatus + " (" + errorThrown + ")");
                setTimeout(
                    'waitForMsg()', /* Try again after.. */
                    15000); /* milliseconds (15 seconds) */
            }
        });
    };

    $(document).ready(function(){
        waitForMsg(); /* Start the initial request */
    });
    </script>
</head>
<body>
    <div id="messages">
        <div class="msg old">
            BargePoll message requester!
        </div>
    </div>
</body>
</html>

And here is my server code:

import sys
if __name__ == "__main__":
    sys.stdout.write("Content-Type: text/html\r\n\r\n")
    print "<html><body>"
    for i in range(10):
        print "<div>%s</div>" % i
        sys.stdout.flush()
    print "</body></html>"

I expect my client page to display one number at a time (0, 1, 2, ...), but the data always comes out all at once (01234...). Please help me figure it out. Thank you all so much.

Slightly off-topic: I am trying to use the jQuery comet plugin, but I couldn't find sufficient documentation. Help would be much appreciated. Thanks again :D

[edit] OK guys, thanks to your guidance I have finally managed to make it work. You were right in predicting that mod_deflate was the source of all this.

To sum up what I have done here:

  • For the client, make a long-poll page like the HTML code above.

  • For the server, disable mod_deflate by editing /etc/apache2/mods-available/deflate.conf, commenting out the line containing text/html, and restarting the server. To ensure that Python doesn't buffer the output itself, put #!/usr/bin/python -u at the top of the script. Remember to call sys.stdout.flush() after each print whose output should appear at the client. The effect may not be visible without a delay, so add time.sleep(1) between outputs when testing. :D
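
The server-side steps above can be sketched as one script. This is only a minimal sketch in Python 3 syntax (the code in this thread is Python 2); the function name stream_numbers and its parameters are made up for illustration, and it assumes mod_deflate has already been disabled for this response.

```python
#!/usr/bin/python3 -u
# The -u flag asks Python not to buffer stdout; flush() is still called
# explicitly so the intent is visible at each step.
import sys
import time


def stream_numbers(out, count=3, delay=0.5):
    """Write `count` <div> lines to `out`, flushing after each one."""
    out.write("Content-Type: text/html\r\n\r\n")
    out.write("<html><body>\n")
    for i in range(count):
        out.write("<div>%d</div>\n" % i)
        out.flush()        # hand this line to the web server now
        time.sleep(delay)  # stand-in for real work; makes the streaming visible
    out.write("</body></html>\n")
    out.flush()


if __name__ == "__main__":
    stream_numbers(sys.stdout)
```

Opening the script's URL directly in a browser, rather than through the AJAX page, is probably the quickest way to see whether the lines arrive one by one.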

Thank you all very much for your support and help in solving this :D

+1  A: 

Yes, that's possible, and you don't have to do much: as you print data out, the server will send it. Just to be sure, keep flushing stdout.

Anurag Uniyal
Thanks for the answer. But I am not sure it's possible this way. If you just print data out, the response can only be sent after the script has finished running, can't it?
As the other answers have explained in detail, you just have to print out and flush, as I mentioned.
Anurag Uniyal
I keep flushing stdout as you mentioned, but when it comes to the client page, the data is still displayed in one whole chunk, all at once. I can't understand why. :(
+1  A: 

There are a few techniques.

The old-fashioned way is to continue to stream data, and have the browser continue to render it using progressive rendering. So as an old-fashioned CGI, just do sys.stdout.flush(). This shows a partial page that you can keep adding to, but it looks clumsy in the browser because the throbber will keep spinning and it looks much like the server is hung or overloaded.

Some browsers support a special multipart MIME type, multipart/x-mixed-replace, that allows you to do the same trick of keeping the connection open, but the browser will replace the page completely when you send the next multipart chunk (which must be MIME-formatted). I don't know if that's usable - Internet Explorer doesn't support it and it may not work well in other browsers either.

The next most modern way is polling the server for results with JavaScript's XMLHttpRequest. This requires that you can check the results of the operation from a different webserver thread or process, which can be quite a bit more difficult to achieve in the server-side code. It allows you to create a much nicer web page though.
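
As a rough illustration of the MIME formatting mentioned above, here is a sketch in Python 3 syntax. The function name mixed_replace_stream and the boundary string "frame" are arbitrary choices for this example, and it has not been checked against any particular browser; browsers may differ in when they render a part.

```python
def mixed_replace_stream(pages, boundary="frame"):
    """Yield the pieces of a multipart/x-mixed-replace response, one per page."""
    # CGI response header announcing the multipart type and the boundary string.
    yield "Content-Type: multipart/x-mixed-replace; boundary=%s\r\n\r\n" % boundary
    for html in pages:
        # Each part starts with the boundary line and carries its own headers.
        yield "--%s\r\nContent-Type: text/html\r\n\r\n%s\r\n" % (boundary, html)
    # Closing boundary: no more parts will follow.
    yield "--%s--\r\n" % boundary
```

A CGI script would write each yielded piece to sys.stdout and call sys.stdout.flush() after it, so that every replacement page leaves the process as soon as it is ready.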

If you want to get even more complicated, check out the "Comet" model or "Web Sockets".

Mauve
Thanks for the answer. I'm not sure I understand it :D. I'd really appreciate it if you could give me some examples, especially for the first part using sys.stdout.flush()...
I have done as you said and used sys.stdout.flush(), but somehow all the data still arrives at once and I can't figure out why. Am I missing something here?
+1  A: 

The trick in old-fashioned CGI programs is using the Transfer-Encoding: chunked HTTP header:

3.6.1 Chunked Transfer Coding

The chunked encoding modifies the body of a message in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer containing entity-header fields. This allows dynamically produced content to be transferred along with the information necessary for the recipient to verify that it has received the full message.

When a result is available, send it as a separate chunk - the browser will display this self-contained HTTP message. When another chunk arrives later, a NEW PAGE is displayed.

You'll have to produce the correct headers for each chunk inside the CGI program. Also, remember to flush the CGI output at the end of each chunk. In Python this is done with sys.stdout.flush()
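
For reference, this is roughly what chunked framing looks like on the wire, sketched in Python 3. The helper names encode_chunk and encode_body are invented for this example. As the comment on this answer points out, the web server normally applies this framing itself, so a CGI script would not usually emit it by hand; note also that chunking only frames the bytes, and the browser reassembles the chunks into a single response body.

```python
def encode_chunk(data):
    """Frame `data` (bytes) as one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)


# A zero-length chunk followed by an empty line marks the end of the body.
LAST_CHUNK = b"0\r\n\r\n"


def encode_body(pieces):
    """Frame every non-empty piece as a chunk and append the terminator."""
    # Empty pieces are skipped: a zero-length chunk would end the body early.
    return b"".join(encode_chunk(p) for p in pieces if p) + LAST_CHUNK
```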

gimel
Chunked transfer-coding is what's going on behind the scenes, but you don't have to generate the headers yourself; the web server will do it for you.
bobince
I tried out this trick but sadly it doesn't work.
+3  A: 

Sure.

There's the traditional server-driven approach, where the script runs just once, but takes a long time to complete, spitting out bits of page as it goes:

import sys, time

sys.stdout.write('Content-Type: text/html;charset=utf-8\r\n\r\n')

print '<html><body>'
for i in range(10):
    print '<div>%i</div>'%i
    sys.stdout.flush()
    time.sleep(1)

When writing an app to WSGI, this is done by having the application return an iterable which outputs each block it wants sent separately one at a time. I'd really recommend writing to WSGI; you can deploy it through CGI now, but in the future when your app needs better performance you can deploy it through a faster server/interface without having to rewrite.

WSGI-over-CGI example:

import time, wsgiref.handlers

class MyApplication(object):
    def __call__(self, environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/html;charset=utf-8')])
        return self.page()

    def page(self):
        yield '<html><body>'
        for i in range(10):
            yield '<div>%i</div>'%i
            time.sleep(1)

application= MyApplication()
if __name__=='__main__':
    wsgiref.handlers.CGIHandler().run(application)
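
If you are on Python 3, one difference matters here: a WSGI application has to yield bytes rather than str. The following is a hedged sketch of the same idea in Python 3 syntax; make_app and its count and delay parameters are names invented for this example.

```python
import time


def make_app(count=10, delay=1.0):
    """Build a WSGI application that yields one <div> per step."""

    def page():
        yield b"<html><body>"
        for i in range(count):
            # WSGI response bodies must be bytes under Python 3.
            yield ("<div>%d</div>" % i).encode("utf-8")
            time.sleep(delay)  # stand-in for real work
        yield b"</body></html>"

    def application(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/html;charset=utf-8")])
        return page()

    return application


application = make_app()
```

It can be deployed over CGI the same way as the example above, by passing application to wsgiref.handlers.CGIHandler().run().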

Note that your web server may foil this approach (for CGI or WSGI) by adding buffering of its own. This typically happens if you're using output-transforming filters like mod_deflate to automatically compress webapp output. You'll need to turn compression off for partial-response-generating scripts.
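
To get a feel for why a compression filter tends to hold output back, the sketch below feeds small pieces to a deflate compressor through Python's zlib module (Python 3 syntax) and counts the bytes that come out after each piece. It is only an illustration of how such compressors generally behave, not mod_deflate itself, and the exact byte counts will vary; the function name is made up for this example.

```python
import zlib


def compressed_output_sizes(pieces):
    """Return the compressed byte count emitted after each piece, and at flush."""
    compressor = zlib.compressobj()
    # compress() may return very little until enough input has accumulated.
    sizes = [len(compressor.compress(p)) for p in pieces]
    # flush() forces out everything the compressor was still holding.
    final = len(compressor.flush())
    return sizes, final
```

With short pieces such as the <div> lines above, most of the output typically appears only at the final flush, which matches the all-at-once symptom described in this thread.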

This limits you to rendering the page bit-by-bit as new data comes in. You can make it prettier by having the client-side take care of altering the page as new data comes in, eg.:

def page(self):
    yield (
        '<html><body><div id="counter">-</div>'
        '<script type="text/javascript">'
        '    function update(n) {'
        '        document.getElementById("counter").firstChild.data= n;'
        '    }'
        '</script>'
    )
    for i in range(10):
        yield '<script type="text/javascript">update(%i);</script>'%i
        time.sleep(1)

This relies on client-side scripting so it might be a good idea to include backup non-script-based final output at the end.
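
One possible shape for that fallback, sketched in Python 3 syntax: the generator remembers the last value and ends the page with a noscript block. The results parameter, the delay parameter and the "Final result" wording are all assumptions made for this example.

```python
import time


def page(results, delay=1.0):
    """Yield a page that updates a counter via script, with a no-script fallback."""
    yield b'<html><body><div id="counter">-</div>'
    yield (
        b'<script type="text/javascript">'
        b'function update(n) {'
        b' document.getElementById("counter").firstChild.data = n;'
        b'}'
        b'</script>'
    )
    last = None
    for value in results:
        last = value
        yield ('<script type="text/javascript">update(%d);</script>' % value).encode("utf-8")
        time.sleep(delay)  # stand-in for real work
    # Shown only by clients that do not run scripts.
    yield ('<noscript><div>Final result: %s</div></noscript></body></html>' % last).encode("utf-8")
```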

All the while doing this, the page will appear to be still loading. If you don't want that, then you'd need to split the script into a first request that just spits out the static content, including client-side script that checks back with the server using either one XMLHttpRequest that it polls for new data through, or, for the really long-running cases, many XMLHttpRequests each of which returns the status and any new data. This approach is much more complicated as it means you have to run your work process as a background daemon process apart from the web server, and pass data between the daemon and the front-end CGI/WSGI request using eg. pipes or a database.
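
As a very small sketch of passing data between the worker and the polling request, the two functions below share a plain text file with one JSON value per line (Python 3 syntax). The file-based approach, the function names and the seen counter are assumptions for illustration only; a real setup might use a database or pipes as described above, and would need to think about locking and cleanup.

```python
import json
import os


def append_result(path, value):
    """Called by the background worker each time a new result is ready."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(value) + "\n")


def read_results_since(path, seen):
    """Called by the polling request: return results the client has not seen.

    `seen` is the number of results the client already has.
    """
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        # Drop the last element: it is either empty or a half-written line.
        lines = f.read().split("\n")[:-1]
    return [json.loads(line) for line in lines[seen:]]
```

The client would send the number of results it has already displayed with each XMLHttpRequest, and the CGI/WSGI request would answer with whatever read_results_since returns.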

bobince
Impressively detailed answer. Thanks very much for your good work :D. I will try this as soon as possible.
I've just tested it using a simple jQuery-based page. It appears that all the results arrive at once, not sequentially. What am I missing here?
Perhaps you have a filter at the server side doing extra buffering. The one I had that stopped it working was mod_deflate as noted above, but other filters might have the same effect.
bobince
How would I turn it off? If I can't turn it off, is there any way to get around it? I am doing some reading on comet, but there isn't much documentation on comet+jQuery, so I found it quite challenging to read the code provided :(
If there's a filter, you'd need to faff with your server-side config, e.g. for mod_deflate on Apache you'd have to change the SetOutputFilter or AddOutputFilterByType directive in use. I'm not sure what jQuery provides for reading partial responses... you should first try accessing the page directly in a browser to see if it renders bit-by-bit before worrying about AJAX.
bobince
I have tested my server page on its own, and it displayed all the data at once. :D