I have inherited a django+fastcgi application which needs to be modified to perform a lengthy computation (up to half an hour or more). What I want to do is run the computation in the background and return a "your job has been started"-type response. While the process is running, further hits to the URL should return "your job is still running" until the job finishes, at which point the results of the job should be returned. Any subsequent hit on the URL should return the cached result.

I'm an utter novice at Django and haven't done any significant web work in a decade, so I don't know if there's a built-in way to do what I want. I've tried starting the process via subprocess.Popen(), and that works fine except for the fact that it leaves a defunct entry in the process table. I need a clean solution that can remove temporary files and any traces of the process once it has finished.
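
As I understand it, the defunct entry appears because nothing ever wait()s on the finished child. Something along these lines (Unix only; the script name is just a placeholder) seems like it should avoid that, though I'm not sure it's the clean solution I'm after:

import signal
from subprocess import Popen

# Telling the kernel we will never wait() on children makes it reap them
# automatically, so no defunct entries are left in the process table.
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

proc = Popen("python longjob.py", shell=True)  # returns immediately; the child
                                               # is cleaned up by the kernel on exit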

I've also experimented with fork() and threads and have yet to come up with a viable solution. Is there a canonical solution to what seems to me to be a pretty common use case? FWIW this will only be used on an internal server with very low traffic.

+3  A: 

Maybe you could look at the problem the other way around.

Maybe you could try DjangoQueueService, and have a "daemon" listening to the queue, checking whether there's something new and processing it.
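
A rough sketch of that idea, in case it's useful (the Job model and the polling script are made-up names for illustration, not something DjangoQueueService provides):

# models.py -- a minimal job table that the web views write to
from django.db import models

class Job(models.Model):
    command = models.CharField(max_length=255)
    status = models.CharField(max_length=16, default='pending')  # pending/running/done
    result = models.TextField(blank=True)

# jobdaemon.py -- started once outside the web server (cron, init script, screen...);
# needs DJANGO_SETTINGS_MODULE set so the ORM can be used outside Django.
import time
from mysite.myapp.models import Job

def process(job):
    # ... run the lengthy computation and store whatever it produces ...
    job.result = 'finished'

while True:
    for job in Job.objects.filter(status='pending'):
        job.status = 'running'
        job.save()
        process(job)
        job.status = 'done'
        job.save()
    time.sleep(5)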

changelog
That's definitely close to what I'm looking for. I stumbled upon that earlier but I'm hoping to find a solution that doesn't require me to add any additional dependencies. Thanks.
You can roll a queue system of your own then. I mean, it's not very difficult to do.
changelog
As the creator of Django Queue Service, I'd say look towards Celery or one of those queuing services instead. It was a neat hack in its day, but it has easily been surpassed now.
heckj
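
With Celery, the flow from the question maps onto something roughly like this (a sketch only: the task name, broker/backend URLs and view are made up, and Celery plus a broker have to be installed and configured separately):

# tasks.py
from celery import Celery

app = Celery('myapp',
             broker='redis://localhost:6379/0',    # example broker
             backend='redis://localhost:6379/0')   # result backend, so results can be fetched later

@app.task
def long_computation(params):
    # ... the half-hour job ...
    return 'the results'

# views.py
from celery.result import AsyncResult
from django.http import HttpResponse
from mysite.myapp.tasks import long_computation

def job(request):
    if 'task_id' not in request.session:
        async_result = long_computation.delay({})         # returns immediately
        request.session['task_id'] = async_result.id
        return HttpResponse('Your job has been started.')
    async_result = AsyncResult(request.session['task_id'])
    if async_result.ready():
        return HttpResponse(async_result.get())           # result is kept in the backend
    return HttpResponse('Your job is still running.')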
+1  A: 

I have to solve a similar problem now. It is not going to be a public site, but similarly, an internal server with low traffic.

Technical constraints:

  • all input data to the long running process can be supplied on its start
  • long running process does not require user interaction (except for the initial input to start a process)
  • the time of the computation is long enough so that the results cannot be served to the client in an immediate HTTP response
  • some sort of feedback (a sort of progress bar) from the long running process is required.

Hence, we need at least two web “views”: one to initiate the long running process, and the other, to monitor its status/collect the results.

We also need some sort of interprocess communication: send user data from the initiator (the web server, on an HTTP request) to the long running process, and then send its results to the receiver (again the web server, driven by HTTP requests). The former is easy, the latter is less obvious. Unlike in normal unix programming, the receiver is not known initially. The receiver may be a different process from the initiator, and it may start while the long running job is still in progress or after it has already finished. So pipes do not work and we need some permanence of the results of the long running process.

I see two possible solutions:

  • dispatch the long running process to a long running job manager (this is probably what the above-mentioned django-queue-service is);
  • save the results permanently, either in a file or in DB.

I preferred to use temporary files and to remember their location in the session data. I don't think it can be made any simpler.

A job script (this is the long running process), myjob.py:

import sys
from time import sleep

i = 0
while i < 1000:
    print 'myjob:', i
    i += 1
    sleep(0.1)
    sys.stdout.flush()

django urls.py mapping:

urlpatterns = patterns('',
    (r'^startjob/$', 'mysite.myapp.views.startjob'),
    (r'^showjob/$',  'mysite.myapp.views.showjob'),
    (r'^rmjob/$',    'mysite.myapp.views.rmjob'),
)

django views:

from tempfile import mkstemp
from os import fdopen, unlink, kill
from subprocess import Popen
import signal
from django.http import HttpResponse, HttpResponseRedirect

def startjob(request):
     """Start a new long running process unless already started."""
     if not request.session.has_key('job'):
          # create a temporary file to save the results
          outfd,outname=mkstemp()
          request.session['jobfile']=outname
          outfile=fdopen(outfd,'a+')
          proc=Popen("python myjob.py",shell=True,stdout=outfile)
          # remember pid to terminate the job later
          request.session['job']=proc.pid
     return HttpResponse('A <a href="/showjob/">new job</a> has started.')

def showjob(request):
     """Show the last result of the running job."""
     if not request.session.has_key('job'):
          return HttpResponse('Not running a job.'+\
               '<a href="/startjob/">Start a new one?</a>')
     else:
          filename=request.session['jobfile']
          results=open(filename)
          lines=results.readlines()
          try:
               return HttpResponse(lines[-1]+\
                         '<p><a href="/rmjob/">Terminate?</a>')
          except IndexError:
               # the job has not produced any output yet
               return HttpResponse('No results yet.'+\
                         '<p><a href="/rmjob/">Terminate?</a>')

def rmjob(request):
     """Terminate the runining job."""
     if request.session.has_key('job'):
          job=request.session['job']
          filename=request.session['jobfile']
          try:
               kill(job,signal.SIGKILL) # unix only
               unlink(filename)
          except OSError:
               pass # probably the job has finished already
          del request.session['job']
          del request.session['jobfile']
     return HttpResponseRedirect('/startjob/') # start a new one
jetxee
A: 

This isn't working for me because the process that creates the child process seems to wait until the child finishes before it returns. I can't get it to just run in the background.
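
One likely cause is that the child inherits the FastCGI socket and other open file descriptors, which keeps the request open until the child exits. A sketch of spawning the job fully detached (untested; outfile as in the answer above):

import os
from subprocess import Popen

devnull = open(os.devnull, 'r+')
proc = Popen("python myjob.py", shell=True,
             stdin=devnull, stdout=outfile, stderr=devnull,
             close_fds=True)  # don't let the child hold the server's sockets open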