I need to perform time-consuming tasks in a web application. Because the tasks can be so heavy that they run for minutes, they have to run on multiple threads so the user won't have to look at a loading page for minutes.

So I thought a multithreaded queue would be a good solution. Each instance of an object that you add to the queue should run on its own thread.

I've got a basic idea where to start but I bet that there are much much better solutions already written or in your brains ;).

Here is how I think the queue should look:

[
 [
  obj_instance_1,[
                  (function_1, function_args_1, priority_1),
                  (function_2, function_args_2, priority_2),
                 ]
 ],
 [
  obj_instance_2,[
                  (function_n, function_args_n, priority_n),
                 ]
 ]
]

where [] are lists and () are tuples.

+2  A: 

You just need your elements to extend threading.Thread and use threading.Condition objects to implement a producer/consumer system.

I would maintain a thread pool with its own concurrency control and an add() method, allowing some other code to add threads into the pool.
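A minimal sketch of that idea, under some assumptions: the class name TaskPool and the methods add() and close() are hypothetical, and instead of having each element subclass threading.Thread, a fixed set of worker threads waits on a threading.Condition for (function, args) tasks. It is meant to illustrate the producer/consumer handshake, not to be a finished pool.

```python
import threading
from collections import deque


class TaskPool:
    """Hypothetical minimal pool: workers wait on a Condition for tasks."""

    def __init__(self, num_workers=2):
        self._tasks = deque()
        self._cond = threading.Condition()
        self._closed = False
        self._workers = [
            threading.Thread(target=self._worker) for _ in range(num_workers)
        ]
        for t in self._workers:
            t.start()

    def add(self, func, *args):
        """Producer side: enqueue a task and wake one waiting worker."""
        with self._cond:
            if self._closed:
                raise RuntimeError("pool is closed")
            self._tasks.append((func, args))
            self._cond.notify()

    def close(self):
        """Let workers drain the remaining tasks, then wait for them to exit."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()
        for t in self._workers:
            t.join()

    def _worker(self):
        """Consumer side: take tasks until the pool is closed and empty."""
        while True:
            with self._cond:
                # Sleep until there is work or the pool is shutting down.
                while not self._tasks and not self._closed:
                    self._cond.wait()
                if not self._tasks:
                    return  # closed and drained
                func, args = self._tasks.popleft()
            # Run outside the lock so other workers can pick up tasks.
            func(*args)


results = []
results_lock = threading.Lock()


def record_square(n):
    # Stand-in for a heavy task; the lock guards the shared list.
    with results_lock:
        results.append(n * n)


pool = TaskPool(num_workers=3)
for n in range(5):
    pool.add(record_square, n)
pool.close()
print(sorted(results))
```

Note that a task raising an exception would end its worker thread in this sketch; a real pool would want to catch and report errors.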

Here is the documentation for Python threading which pretty much follows the conventions of other thread implementations ... nothing scary.

Aiden Bell
+1  A: 

I don't know much about Python, but what you're describing sounds like a thread pool. This is from a quick Google search:

http://pypi.python.org/pypi/threadpool/

Whisk
+1 This project shows good use of the standard Queue module for tasks
Van Gale
+2  A: 

Kamaelia provides tools for abstracting concurrency to threads, processes, etc.

Mark
Why reinvent the wheel when Kamaelia already provides a tested framework for this?
Michael Dillon
+6  A: 

The Python standard library Queue module is already thread-safe and aware and should work for your requirements.

Here's a nice paper, "Task Queue Implementation Pattern", that discusses how to use Queue for task queues.
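As a rough illustration of the Queue approach (not taken from the paper), here is a small sketch in which worker threads pull (function, args) tuples from a queue. In Python 3 the module is named queue rather than Queue. The names worker and run_tasks, and the use of None as a stop sentinel, are assumptions made for this example.

```python
import queue
import threading


def worker(tasks, results):
    """Pull (func, args) tuples until a None sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        func, args = item
        try:
            outcome = func(*args)
        except Exception as exc:
            # Store the exception so one failing task does not kill the worker.
            outcome = exc
        results.put((args, outcome))
        tasks.task_done()


def run_tasks(jobs, num_workers=4):
    """Run every (func, args) job on a small set of threads; return results."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()         # block until every queued item is marked done
    for t in threads:
        t.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


if __name__ == "__main__":
    for args, outcome in run_tasks([(pow, (2, 3)), (pow, (3, 2))]):
        print(args, outcome)
```

Results come back in completion order, not submission order, which is why each one is paired with its arguments. If priorities matter, as in the question, queue.PriorityQueue could be swapped in for the task queue.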

Van Gale
+1, good answer. I will look further into Queue myself. Coming from a C background, you tend to just reimplement :P
Aiden Bell
A: 

I'd recommend you look at beanstalkd or gearman.

Let your web server be a web server, and scale your long-running jobs independently and more safely by moving them through a queue to an external worker.

Dustin
A: 

I would recommend using process pools from the multiprocessing library. This is a built-in library and abstracts most of the implementation you need anyway, especially since pools work on lists and your data is already in the form of a list. You can use the pool's map_async method and assign a callback to notify the user when the work has finished.
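A small sketch of what that might look like, assuming a stand-in task called slow_square and a hypothetical helper run_jobs (neither is from the question). One caveat worth knowing: the callback passed to map_async fires once, with the complete list of results, when the whole batch is done. For a notification per task, apply_async with its own callback is the usual alternative.

```python
import multiprocessing


def slow_square(n):
    """Stand-in for a heavy task; defined at module level so it can be pickled."""
    return n * n


def run_jobs(values, processes=2):
    """Map slow_square over values in a process pool and record the callback."""
    finished = []

    def on_done(results):
        # Runs once in the parent process, with the full list of results.
        finished.append(results)

    with multiprocessing.Pool(processes=processes) as pool:
        async_result = pool.map_async(slow_square, values, callback=on_done)
        # A web app would return to the user here and check back later;
        # this sketch simply waits for the batch to complete.
        results = async_result.get(timeout=60)
    return results, finished


if __name__ == "__main__":
    results, finished = run_jobs([1, 2, 3, 4])
    print(results)
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module.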

the dol