ansaurus

Question

In Django, how to call a subprocess with a slow start-up time.

Answer 1

A:

How about "daemonizing" the subprocess call using python-daemon or its successor, grizzled.

John Paulett 2009-09-16 00:15:06

unrelated but, what does it do better? I've had some problems related to python-daemon, but I'm inherently skeptical towards collection libraries.

asksol 2009-09-18 17:08:59

Answer 2

+3 A:

It may seem like i am punting this product as this is the second time i have responded with a recommendation of this.

But it seems like you need a Message Queing service, in particular a distributed message queue.

ere is how it will work:

Your Django App requests CMD
CMD gets added to a queue
CMD gets pushed to several works
It is executed and results returned upstream

Most of this code exists, and you dont have to go about building your own system.

Have a look at Celery which was initially built with Django.

http://www.celeryq.org/ http://robertpogorzelski.com/blog/2009/09/10/rabbitmq-celery-and-django/

izzy 2009-09-16 08:48:24

This is interesting - I'll be looking into it. However, the problem I may (or may not) have is that part 1 of step #4 ("it is executed", i.e. LaTeX is started) must happen before #2 ("CMD gets added to queue", i.e. LaTeX gets data). However, I'm pretty confident that Celery can do this- but it'll require a bit of delving.

Brian M. Hunt 2009-09-21 17:39:08

Answer 3

+2 A:

Issy already mentioned Celery, but since comments doesn't work well with code samples, I'll reply as an answer instead.

You should try to use Celery synchronously with the AMQP result store. You could distribute the actual execution to another process or even another machine. Executing synchronously in celery is easy, e.g.:

>>> from celery.task import Task
>>> from celery.registry import tasks

>>> class MyTask(Task):
...
...     def run(self, x, y):
...         return x * y 
>>> tasks.register(MyTask)

>>> async_result = MyTask.delay(2, 2)
>>> retval = async_result.get() # Now synchronous
>>> retval 4

The AMQP result store makes sending back the result very fast, but it's only available in the current development version (in code-freeze to become 0.8.0)

asksol 2009-09-18 17:05:58

Thanks Asksol. One requirement is to have the Task run forever as a daemon, and then just send/receive data from it. LaTeX has to be running before you call run() (otherwise you have to wait for LaTeX to start up, which obviates the entire purpose of using a task queue). I'm looking into Celery to see if it can do this (I expect it can).

Brian M. Hunt 2009-09-21 17:41:29

I don't see the requirement that LaTeX has to be running, apart from being an optimization? To do this you would have to use the LaTeX C API (or whatever it is) to run it embedded inside a worker process. This should be possible, but would require you to customize celery considerably. It might be a good starting point as it solves parts of your problem already. I was not saying the task queue is a good fit for this, but the distributed/parallel processing part *might* be. You want a Task pool and you want to send/receive results, you just want the worker processes to be LaTeX processors.

asksol 2009-09-22 17:12:27

ansaurus

tags:

views:

answers:

In Django, how to call a subprocess with a slow start-up time.

related questions