I have to solve a similar problem now. It is not going to be a public site, but, similarly, an internal server with low traffic.
Technical constraints:
- all input data to the long-running process can be supplied at its start
- the long-running process does not require user interaction (except for the initial input that starts it)
- the computation takes long enough that the results cannot be served to the client in an immediate HTTP response
- some sort of feedback (a kind of progress bar) from the long-running process is required.
Hence, we need at least two web “views”: one to initiate the long-running process, and another to monitor its status and collect the results.
We also need some sort of interprocess communication: sending user data from the initiator (the web server, on an HTTP request) to the long-running process, and then sending its results to the receiver (again the web server, driven by HTTP requests). The former is easy; the latter is less obvious. Unlike in normal Unix programming, the receiver is not known initially. The receiver may be a different process from the initiator, and it may start while the long-running job is still in progress or after it has already finished. So pipes do not work, and we need some permanence for the results of the long-running process.
I see two possible solutions:
- dispatch launches of the long-running processes to a long-running job manager (this is probably what the above-mentioned django-queue-service is);
- save the results permanently, either in a file or in a DB.
I preferred to use temporary files and to remember their location in the session data. I don't think it can be made simpler.
A job script (this is the long-running process), myjob.py:
import sys
from time import sleep

# Simulate a long computation: emit a progress line roughly every
# 0.1 s and flush, so the monitoring view sees partial results.
i = 0
while i < 1000:
    print 'myjob:', i
    i += 1
    sleep(0.1)
    sys.stdout.flush()
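Since the job prints a counter out of a known total, the last line read by the monitoring view can be turned into a rough percentage for the progress bar. A minimal sketch (the helper name and the hard-coded total of 1000 are assumptions, not part of the views below):

def progress(line, total=1000):
    """Turn a 'myjob: N' line into an integer percentage (a sketch)."""
    try:
        return 100 * int(line.split(':')[1]) // total
    except (IndexError, ValueError):
        return 0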
django urls.py mapping:
urlpatterns = patterns('',
    (r'^startjob/$', 'mysite.myapp.views.startjob'),
    (r'^showjob/$', 'mysite.myapp.views.showjob'),
    (r'^rmjob/$', 'mysite.myapp.views.rmjob'),
)
django views:
from tempfile import mkstemp
from os import fdopen, unlink, kill
from subprocess import Popen
import signal
from django.http import HttpResponse, HttpResponseRedirect
def startjob(request):
    """Start a new long-running process unless already started."""
    if 'job' not in request.session:
        # create a temporary file to save the results
        outfd, outname = mkstemp()
        request.session['jobfile'] = outname
        outfile = fdopen(outfd, 'a+')
        proc = Popen("python myjob.py", shell=True, stdout=outfile)
        # remember the pid to be able to terminate the job later
        request.session['job'] = proc.pid
    return HttpResponse('A <a href="/showjob/">new job</a> has started.')
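Since all the input data can be supplied at start (the first constraint above), job parameters can be passed as plain arguments, which also avoids going through the shell. A sketch of the alternative spawn (the count argument is an assumption; myjob.py above ignores its argv):

import sys
from subprocess import Popen

def start_myjob(outfile, count):
    """Spawn myjob.py without a shell, passing the input as an argument."""
    return Popen([sys.executable, 'myjob.py', str(count)], stdout=outfile)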
def showjob(request):
    """Show the last result of the running job."""
    if 'job' not in request.session:
        return HttpResponse('Not running a job. '
                            '<a href="/startjob/">Start a new one?</a>')
    filename = request.session['jobfile']
    results = open(filename)
    lines = results.readlines()
    results.close()
    if lines:
        return HttpResponse(lines[-1] +
                            '<p><a href="/rmjob/">Terminate?</a>')
    return HttpResponse('No results yet.'
                        '<p><a href="/rmjob/">Terminate?</a>')
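If showjob should also report whether the job is still running, sending signal 0 probes the pid without affecting the process. A sketch (Unix only; note that a finished child may linger as a zombie, and thus still “exist”, until it is reaped):

def job_is_alive(pid):
    """Probe a pid with the null signal; True if the process still exists."""
    try:
        kill(pid, 0)
        return True
    except OSError:
        return False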
def rmjob(request):
    """Terminate the running job."""
    if 'job' in request.session:
        job = request.session['job']
        filename = request.session['jobfile']
        try:
            kill(job, signal.SIGKILL)  # unix only
        except OSError:
            pass  # probably the job has finished already
        try:
            unlink(filename)
        except OSError:
            pass  # the results file may already be gone
        del request.session['job']
        del request.session['jobfile']
    return HttpResponseRedirect('/startjob/')  # start a new one
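SIGKILL gives myjob.py no chance to clean up after itself. If the job ever needs to flush partial results on termination, sending signal.SIGTERM instead and installing a handler in the job script is gentler; a sketch of such a handler (an assumption, not part of myjob.py above):

import signal
import sys

def on_term(signum, frame):
    # flush pending output, then exit cleanly
    sys.stdout.flush()
    sys.exit(0)

signal.signal(signal.SIGTERM, on_term)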