views: 431
answers: 2

I am sending an AJAX request to a Django view that can potentially take a long time. The work goes through some well-defined steps, however, so I would like to print status indicators letting the user know when it has finished one step and moved on to the next.

If I were using PHP, it might look like this, using the flush() function:

do_something();
print 'Done doing something!';
flush();

do_something_else();
print 'Done doing something else!';
flush();

How would I go about doing the same with Django? Looking at the documentation, I see that HttpResponse objects have a flush() method, but all the documentation says is that "This method makes an HttpResponse instance a file-like object," which isn't obviously what I want. I'm having a hard time wrapping my head around how this could be done in Django, since I have to return the response and don't really have control over when the content goes to the browser.

+5  A: 

Most webservers (e.g. FCGI/SCGI) do their own buffering, HTTP clients do their own, and so on. It's very difficult to actually get data flushed out this way and have the client receive it, because it's not a typical operation.

The closest to what you're trying to do would be to pass an iterator to HttpResponse, and to do the work in a generator; something like this:

from django.http import HttpResponse

def index(request):
    def do_work():
        # Each yield hands Django a chunk of the response body,
        # written out as the generator is consumed.
        step_1()
        yield "step 1 complete"
        step_2()
        yield "step 2 complete"
        step_3()
        yield "step 3 complete"
    # Passing an iterator instead of a string streams the content.
    return HttpResponse(do_work())

... but this won't necessarily flush. (Not tested code, but you get the idea; see http://docs.djangoproject.com/en/dev/ref/request-response/#passing-iterators.)
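Newer Django versions (1.5+) also provide StreamingHttpResponse, which makes the streaming intent explicit and avoids middleware that would consume the iterator; a minimal sketch along the same lines, with step_1() and step_2() standing in for the real work:

from django.http import StreamingHttpResponse

def index(request):
    def do_work():
        step_1()  # placeholder for the real first step
        yield "step 1 complete\n"
        step_2()
        yield "step 2 complete\n"
    # Chunks are written out as the generator produces them, though
    # proxies and the browser may still buffer before displaying.
    return StreamingHttpResponse(do_work())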

Most of the infrastructure is simply not expecting a piecemeal response. Even if Django isn't buffering, your front-end server might be, and the client probably is, too. That's why most things use pull updates for this: a separate interface to query the status of a long-running request.
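A minimal sketch of that pull-update interface (the cache-backed progress store, key format, and view name are illustrative assumptions, not anything Django gives you for free):

import json

from django.core.cache import cache
from django.http import HttpResponse

def job_status(request, job_id):
    # The long-running job periodically writes its progress under this
    # key; the browser polls this view to read it.
    progress = cache.get('job-progress-%s' % job_id, 'pending')
    return HttpResponse(json.dumps({'status': progress}),
                        content_type='application/json')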

(I'd like to be able to do reliable push updates for this sort of thing, too...)

Glenn Maynard
Thanks. I had tried generators, but stupidly enough I was yielding integers in my test, which kept it from working. I'm probably going to end up not doing this at all, but it's nice to know it at least worked, albeit with the limitations you mentioned.
Paolo Bergantino
+3  A: 

I'm not sure you need to use the flush() function.

Your AJAX request should just go to a Django view.

If your steps can be broken down, keep it simple and create a view for each step. That way, once one step completes, you can update the user and start the next request via AJAX.

views.py

from django.http import HttpResponse

def do_something(request):
    # work for the first step goes here
    return HttpResponse()

def do_something_else(request):
    # work for the second step goes here
    return HttpResponse()
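Wiring those views up is the usual urlconf; a sketch using the modern path() syntax (older Django versions spell this differently):

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path('do-something/', views.do_something),
    path('do-something-else/', views.do_something_else),
]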
monkut
In many cases this isn't wanted, though. If the whole operation is a unit, then it's adding unnecessary delay between each phase starting (waiting for the client to kick it off); this is worse if you want to do finer updates (e.g. percent complete). The whole operation may want to run in a single database transaction; even if it doesn't, you may want to clean up if the whole operation isn't completed, which implies you also need a timeout to abort if the client disappears. Of course, it depends on what you're doing.
Glenn Maynard
I agree, it depends on the structure and expectations of the result.
monkut
I considered splitting it up, but for the reasons mentioned above and then some, I was really hoping to avoid it. All the steps are 100% independent, though. But isn't there a limit of two active requests per domain or something like that?
Paolo Bergantino
Not unless you configure one in your front-end. If you have a connection limit, you have that limit whether the job is being done under a single request or several. If your Django backends are processes and not threads, though, you're tying up an entire backend while this work is going on (no matter which approach you use). Using threads can fix that, by making backends cheap, but not everyone will want to deal with threaded backends (not a big deal but it does introduce another class of issues).
Glenn Maynard
The only way to avoid it entirely is to have a separate process to handle jobs like this, so it's a separately controllable resource. One side-effect of this is that no HTTP connection is held open during the job; that probably only matters if you have thousands of these jobs at once. Another side-effect: if these are resource-intensive tasks, it lets you control how many happen at once. If it's copying large files and running several of these concurrently would simply thrash the disk, you may want jobs to be queued and only run one or two at a time.
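A bare-bones sketch of that separate-process idea (illustrative only; a real deployment would use a proper task queue to claim and run jobs):

# models.py: a job row that a standalone worker process claims and runs.
from django.db import models

class Job(models.Model):
    status = models.CharField(max_length=10, default='queued')
    created = models.DateTimeField(auto_now_add=True)

# views.py: the request only enqueues work and returns immediately,
# so no HTTP connection is held open while the job runs.
from django.http import HttpResponse

def start_job(request):
    job = Job.objects.create()
    return HttpResponse(str(job.pk))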
Glenn Maynard
I'm not talking about the back-end. Don't browsers limit how many open connections they can have to a particular domain? I'm pretty sure they do. If I tried opening 6-7 AJAX requests, the browser would queue them until earlier ones finish.
Paolo Bergantino