A web application I am developing needs to perform tasks that take too long to run within the HTTP request/response cycle. Typically, the user submits a request and the server, among other things, runs some scripts to generate data (for example, rendering images with POV-Ray).

Of course, these tasks can take a long time, so the server should not block until the scripts finish before sending the response to the client. I therefore need to run the scripts asynchronously, give the client a "the resource is here, but not ready" reply, and probably tell it an AJAX endpoint to poll, so it can retrieve and display the resource when it is ready.

Now, my question is not about the design (although I would very much welcome any hints in that regard as well). My question is: does a system that solves this problem already exist, so that I do not reinvent the square wheel? If I had to, I would use a process queue manager to submit the task and expose an HTTP endpoint that reports the status ("pending", "aborted", "completed") to the AJAX client, but if something similar already exists specifically for this task, I would much prefer to use it.

I am working in python+django.

Edit: Please note that the main issue here is not how the server and the client must negotiate and exchange information about the status of the task.

The issue is how the server handles the submission and queuing of very long tasks. In other words, I need a better system than having my server submit scripts to LSF. Not that it would not work, but I think it's a bit too much...

Edit 2: I added a bounty to see if I can get some other answers. I looked at pyprocessing, but it does not let me submit a job and reconnect to the queue at a later stage.

+1  A: 

You can try two approaches:

  • Have the client call the web server at a regular interval, passing a job ID; the server replies with information about the current state of that task.
  • Implement a long-running page that sends data at a regular interval; for the client, that HTTP request will "always" be "loading", and it needs to collect the new information every time a new piece of data is received.
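
A minimal sketch of the first option, written in Python for illustration only. The names `poll_until_done` and `fake_fetch_status` are hypothetical, and the "pending"/"completed" status strings are borrowed from the question; a real client would make an HTTP call where the stand-in function is used.

```python
import time


def poll_until_done(fetch_status, job_id, interval=1.0, max_attempts=30):
    """Ask for the job status every `interval` seconds until it is no longer pending."""
    for _ in range(max_attempts):
        status = fetch_status(job_id)
        if status != "pending":
            return status
        time.sleep(interval)
    # Give up after max_attempts so the client never polls forever.
    return "timeout"


# Hypothetical stand-in for the server: answers 'pending' twice, then 'completed'.
_replies = iter(["pending", "pending", "completed"])


def fake_fetch_status(job_id):
    return next(_replies)


result = poll_until_done(fake_fetch_status, job_id="42", interval=0.01)
```

The `max_attempts` cap is a design choice of this sketch: without it, a job that never finishes would keep the client polling indefinitely.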

For the second option, you can learn more by reading about Comet. In ASP.NET, you can do something similar by implementing the System.Web.IHttpAsyncHandler interface.

Rubens Farias
Not my problem. My problem is submitting work on the server side, from the server application. In other words: Server <-1-> Web interface <-2-> client. Ajax solves 2 for me, not 1.
Stefano Borini
Nope, I'm refining the question, just a sec
Stefano Borini
A: 

You can signal that a resource is being "worked on" by replying with a 202 HTTP status code: the client will have to retry later to get the completed resource. Depending on the case, you might have to issue a "request id" in order to match a request with a response.
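
As a rough sketch of that idea, framework-independent and under stated assumptions: the functions `submit` and `fetch` and the in-memory `JOBS` table are hypothetical names, each function returns a `(status_code, body)` pair instead of a real HTTP response, and the choice of 200 and 404 for the other cases is my assumption rather than anything required by the approach.

```python
import json
import uuid

# Hypothetical in-memory job table; a real deployment would use a database
# or a job server so that the state survives a restart.
JOBS = {}


def submit(params):
    """Record the request and answer 202 Accepted together with a request id."""
    request_id = uuid.uuid4().hex
    JOBS[request_id] = {"status": "pending", "params": params, "result": None}
    return 202, json.dumps({"request_id": request_id, "status": "pending"})


def fetch(request_id):
    """Answer 202 while the job is unfinished, 200 once the result exists."""
    job = JOBS.get(request_id)
    if job is None:
        return 404, json.dumps({"error": "unknown request id"})
    if job["status"] != "completed":
        return 202, json.dumps({"request_id": request_id, "status": job["status"]})
    return 200, json.dumps({"request_id": request_id, "result": job["result"]})
```

In a Django view, the same pair would presumably be turned into an `HttpResponse` with the matching `status` argument.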

Alternatively, you could have a look at existing Comet libraries, which might meet your needs more "out of the box". I am not sure whether any of them match your current Django design, though.

jldupont
+1  A: 
Hugh Perkins
+1  A: 

First, you need a separate "worker" service that is started independently at boot and communicates with the HTTP request handlers through some local IPC mechanism, such as a UNIX socket (fast) or a database (simple).

While handling a request, the CGI handler asks the worker for its state or other data and replies to the client.

vitaly.v.ch
+4  A: 

You should avoid re-inventing the wheel here.

Check out Gearman. It has libraries in a lot of languages (including Python) and is fairly popular. I am not sure whether anyone has an out-of-the-box way to connect Django to Gearman and AJAX calls, but it shouldn't be too complicated to do that part yourself.

The basic idea is that you run the gearman job server (or multiple job servers), have your web request queue up a job (like 'resize_photo') with some arguments (like '{photo_id: 1234}'). You queue this as a background task. You get a handle back. Your ajax request is then going to poll on that handle value until it's marked as complete.

Then you have a worker (or probably many): a separate Python process that connects to this job server, registers itself for 'resize_photo' jobs, does the work, and then marks it as complete.
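
To make the flow concrete, here is a toy in-process sketch of the pattern. It is not Gearman's API: `JobServer`, `submit_background`, `status`, and `work_forever` are hypothetical names, the status strings are taken from the question, and the worker runs as a thread only so that the example is self-contained (with a real job server it would be a separate process).

```python
import queue
import threading
import uuid


class JobServer:
    """Toy stand-in for a job server: queues jobs and tracks them by handle."""

    def __init__(self):
        self._queue = queue.Queue()
        self._status = {}
        self._lock = threading.Lock()

    def submit_background(self, task_name, args):
        """Queue a job and return a handle immediately, without waiting."""
        handle = uuid.uuid4().hex
        with self._lock:
            self._status[handle] = "pending"
        self._queue.put((handle, task_name, args))
        return handle

    def status(self, handle):
        """What the AJAX polling endpoint would look up."""
        with self._lock:
            return self._status.get(handle, "unknown")

    def work_forever(self, tasks):
        """Worker loop: take a job, run the registered function, record the outcome."""
        while True:
            handle, task_name, args = self._queue.get()
            try:
                tasks[task_name](args)
                outcome = "completed"
            except Exception:
                outcome = "aborted"
            with self._lock:
                self._status[handle] = outcome
            self._queue.task_done()

    def wait_idle(self):
        """Block until every queued job has been processed (demo helper only)."""
        self._queue.join()


def resize_photo(args):
    # Placeholder for the real long-running work.
    return "resized photo %d" % args["photo_id"]


server = JobServer()
worker = threading.Thread(
    target=server.work_forever,
    args=({"resize_photo": resize_photo},),
    daemon=True,
)
worker.start()

handle = server.submit_background("resize_photo", {"photo_id": 1234})
server.wait_idle()
```

The web request only ever calls `submit_background` and `status`; everything slow happens in the worker loop.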

I also found this blog post that does a pretty good job of summarizing its usage.

rhettg
Looks like it does exactly what I need. Thanks
Stefano Borini
Nice! Looks like a useful framework to use in the future. Cool!
Hugh Perkins
A: 

Probably not a great answer for the Python/Django solution you are working with, but we use Microsoft Message Queue (MSMQ) for things just like this. It basically runs like this:

  1. Website updates a database row somewhere with a "Processing" status
  2. Website sends a message to the MSMQ (this is a non-blocking call, so it returns control back to the website right away)
  3. Windows service (could be any program really) is "watching" the MSMQ and gets the message
  4. Windows service updates the database row with a "Finished" status.
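
The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not MSMQ code: an in-memory SQLite table stands in for the database, `queue.Queue` stands in for the message queue, a thread stands in for the Windows service, and the names `website_submit`, `service_loop`, and `job_status` are hypothetical.

```python
import queue
import sqlite3
import threading

# One shared in-memory database; the lock serializes access across threads.
db = sqlite3.connect(":memory:", check_same_thread=False)
db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
db_lock = threading.Lock()
messages = queue.Queue()  # stand-in for the message queue


def website_submit():
    """Steps 1 and 2: mark the row as Processing, then send a message."""
    with db_lock:
        cur = db.execute("INSERT INTO jobs (status) VALUES ('Processing')")
        db.commit()
        job_id = cur.lastrowid
    messages.put(job_id)  # returns immediately on an unbounded queue
    return job_id


def service_loop():
    """Steps 3 and 4: receive messages and mark each row as Finished."""
    while True:
        job_id = messages.get()
        if job_id is None:  # sentinel used by this demo to stop the loop
            break
        # ... the long-running work would happen here ...
        with db_lock:
            db.execute("UPDATE jobs SET status = 'Finished' WHERE id = ?", (job_id,))
            db.commit()


def job_status(job_id):
    """What the website would read when the client asks for progress."""
    with db_lock:
        row = db.execute("SELECT status FROM jobs WHERE id = ?", (job_id,)).fetchone()
    return row[0] if row else None


service = threading.Thread(target=service_loop)
service.start()
job_id = website_submit()
messages.put(None)
service.join()
```

Keeping the status in the database rather than in the queue means the website can answer progress requests without talking to the service at all.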

That's the gist of it, anyway. It's been quite reliable for us and really straightforward to scale and manage.

-al

Al W