views: 155
answers: 4
Dear pythoners,

I have a problem with using Twisted for simple concurrency in Python. The problem is that I don't know how to do it, and all the online resources are about Twisted's networking abilities. So I am turning to the SO gurus for some guidance.

Python 2.5 is used.

Simplified version of my problem runs as follows:

  1. A bunch of scientific data
  2. A function that munches on the data and creates output
  3. ??? < here enters concurrency, it takes chunks of data from 1 and feeds it to 2
  4. Output from 3 is joined and stored

My guess is that the Twisted reactor can do job number three. But how?

Thanks a lot for any help and suggestions.

upd1:

Simple example code. I have no idea how the reactor deals with processes, so I have given it imaginary methods:

datum = 'abcdefg'

def dataServer(data):
    for char in data:
        yield char

def dataWorker(char):
    return ord(char)

r = reactor()                      # imaginary
NUMBER_OF_PROCESSES_AV = 4
serv = dataServer(datum)
task_id = 0
result = [None] * len(datum)

while r.working():
    if NUMBER_OF_PROCESSES_AV > 0:
        r.addTask(dataWorker, serv.next(), task_id)   # imaginary
        NUMBER_OF_PROCESSES_AV -= 1
        task_id += 1
    for pr, tid in r.finishedProcesses():             # imaginary
        result[tid] = pr
        NUMBER_OF_PROCESSES_AV += 1                   # free a worker slot again
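For reference, the dataflow I have in mind can be sketched with the stdlib threading and queue modules (both available in 2.5, where they are named `threading` and `Queue`; the Python 3 names are used below). The thread pool here is a stand-in for the imaginary reactor:

```python
# A runnable sketch of the intended dataflow: a pool of worker threads
# pulls (index, chunk) pairs from a task queue and writes results into
# a shared list, preserving the original order.
import threading
from queue import Queue

def data_server(data):
    # step 1: a bunch of data, yielded chunk by chunk with its index
    for i, char in enumerate(data):
        yield i, char

def data_worker(char):
    # step 2: munch on one chunk and produce output
    return ord(char)

def munch(data, n_workers=4):
    tasks = Queue()
    results = [None] * len(data)

    def worker():
        while True:
            item = tasks.get()
            if item is None:          # sentinel: no more work
                break
            i, char = item
            results[i] = data_worker(char)   # step 3: concurrent munching

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in data_server(data):
        tasks.put(item)
    for _ in threads:                 # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return results                    # step 4: output joined in order

print(munch('abcdefg'))  # [97, 98, 99, 100, 101, 102, 103]
```

Note that because of the GIL this interleaves rather than parallelizes pure-Python computation; the answers below discuss process-based alternatives.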
+2  A: 

To actually compute things concurrently, you'll probably need to employ multiple Python processes. A single Python process can interleave calculations, but because of the GIL it won't execute them in parallel (with a few exceptions).

Twisted is a good way to coordinate these multiple processes and collect their results. One library oriented towards solving this task is Ampoule. You can find more information about Ampoule on its Launchpad page: https://launchpad.net/ampoule.

Jean-Paul Calderone
Can you provide example code related to my problem? There does not seem to be any documentation.
Rince
The examples should get you started. I don't see them hosted anywhere on the web, but if you download the 0.2.0 release, you'll find them in the "examples" directory.
Jean-Paul Calderone
+2  A: 

Do you need Twisted at all?

From your description of the problem I'd say that multiprocessing would fit the bill. Create a number of Process objects that are given a reference to a single Queue instance. Get them to start their work and put their results on the Queue. Just use blocking get()s to read the results.
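A minimal sketch of this suggestion (the chunking scheme and function names are illustrative, not prescribed by the answer): several Process objects share one result Queue, and the parent collects output with blocking get()s. multiprocessing ships with 2.6+; for 2.5 you would need the backport mentioned below.

```python
# Several worker processes put (index, result) pairs on a shared Queue;
# the parent performs one blocking get() per expected result.
from multiprocessing import Process, Queue

def worker(chunk, out):
    # munch on one chunk of (index, char) pairs and report results
    for i, char in chunk:
        out.put((i, ord(char)))

def munch(data, n_procs=4):
    out = Queue()
    indexed = list(enumerate(data))
    # deal the data out round-robin, one chunk per process
    chunks = [indexed[i::n_procs] for i in range(n_procs)]
    procs = [Process(target=worker, args=(c, out)) for c in chunks]
    for p in procs:
        p.start()
    results = [None] * len(data)
    for _ in range(len(data)):        # blocking get() per expected result
        i, value = out.get()
        results[i] = value
    for p in procs:
        p.join()
    return results

if __name__ == '__main__':
    print(munch('abcdefg'))  # [97, 98, 99, 100, 101, 102, 103]
```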

quamrana
Sadly my institution uses Python 2.5 and does not have any plans of going to Python 2.6 as for now. So no multiprocessing goodness.
Rince
Except that multiprocessing is available as a backport of the 2.6 module.
quamrana
+4  A: 

As Jean-Paul said, Twisted is great for coordinating multiple processes. However, unless you specifically need Twisted and simply want a distributed processing pool, there are possibly better-suited tools out there.

One I can think of which hasn't been mentioned is celery. Celery is a distributed task queue - you set up a queue of tasks backed by a DB, Redis, or RabbitMQ (you can choose from a number of free software options), and write a number of compute tasks. These can be arbitrary scientific-computing-type tasks. Tasks can spawn subtasks (implementing the "joining" step you mention above). You then start as many workers as you need and compute away.

I'm a heavy user of Twisted and Celery, so in any case, both options are good.

rlotun
Can you provide some example code, pretty please?
Rince
Well, I'll use the example on the celery website. To mirror the example you have above, you'd first write a number of tasks. A task is essentially your dataWorker: `from celery.decorators import task` / `@task` / `def dataWorker(chara): return ord(chara)`. You can write as many tasks as you please - conceptually they're just functions that *do something*. Then, elsewhere - perhaps in your dataServer - you simply schedule the task: `result = dataWorker.delay(chara)`. You can think of the result as a deferred - you can either wait on it or check on it later.
rlotun
Ok, I forgot that code in comments don't show up well, but essentially check the celery website for a near analogue of what you're trying to do. Remember you have three components: 1) Tasks, which are run by workers 2) A Queue system to hold the tasks 3) A place to store results of tasks. Celery can work with Django seamlessly as well.
rlotun
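The snippet from rlotun's comment above, reflowed into a readable block. It uses the old `celery.decorators` API from the celery of that era, and a broker (RabbitMQ, Redis, or a DB) must be configured and workers started before the task will actually execute, so this is a sketch rather than something runnable on its own:

```python
from celery.decorators import task

@task
def dataWorker(chara):
    return ord(chara)

# elsewhere - perhaps in your dataServer - schedule the task;
# the returned AsyncResult behaves much like a deferred:
result = dataWorker.delay('a')
# result.get() blocks until a worker has computed the value
```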
+1  A: 

It seems to me that you are misunderstanding the fundamentals of how Twisted operates. I recommend you give Dave Peticolas' Twisted Intro a shot. It has been a great help to me, and I've been using Twisted for years!

HINT: Everything in Twisted relies on the reactor!

The Reactor Loop

jathanism