For a Python/Django/Celery-based deployment tool, we have the following setup:

  1. We currently use the default Celery setup. (One queue+exchange called "celery".)
  2. Each Task on the queue represents a deployment operation.
  3. Each task for an environment ends with a synchronisation phase that can take (very) long. (A rough sketch of this task shape follows the list.)
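
To make the current task shape concrete, here is a minimal sketch (the names `deploy`, `apply_release` and `sync_environment` are placeholders, not our actual code):

```python
from celery import shared_task


def apply_release(environment_id, release_id):
    """Placeholder for the actual deployment work (the fast part)."""


def sync_environment(environment_id):
    """Placeholder for the synchronisation phase (can take very long)."""


@shared_task
def deploy(environment_id, release_id):
    # One task per deployment operation; everything currently goes to the
    # single default "celery" queue.
    apply_release(environment_id, release_id)
    sync_environment(environment_id)
```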

The following specs need to be fulfilled:

  1. Concurrency: tasks for multiple environments should be carried out concurrently.
  2. Locking: There may be at most one running task per environment at any time (i.e. environments act as locks).
  3. Throughput optimization: When there are multiple tasks for a single environment, their sync phases may be combined. So when a task nears its end, it should check whether new tasks are waiting in the queue for its environment and, if so, skip its own sync phase (sketched after this list).
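
Put together, this is roughly the behaviour we are after; the two helpers marked as hypothetical are exactly the parts we do not know how to implement cleanly (reusing the placeholder `apply_release`/`sync_environment` functions from the sketch above):

```python
from contextlib import contextmanager

from celery import shared_task


@contextmanager
def environment_lock(environment_id):
    # Hypothetical per-environment mutex (spec 2): could be a cache/Redis
    # lock, a DB lock, or per-environment queues with one worker each.
    yield


def tasks_waiting_for(environment_id):
    # Hypothetical check (spec 3): are more deployment tasks already
    # queued for this environment?
    return False


@shared_task
def deploy(environment_id, release_id):
    with environment_lock(environment_id):
        apply_release(environment_id, release_id)
        # If newer tasks are already waiting, let the last of them sync.
        if not tasks_waiting_for(environment_id):
            sync_environment(environment_id)
```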

What is the preferred way to implement this?

Some thoughts:

  • I would say we have to set up multiple queues, one for each environment, and have N Celery workers each processing a single queue exclusively. (This would solve specs 1 and 2.)
    But how do we get multiple Celery workers to listen to different queues exclusively?
  • Is there a clean way of knowing whether there are more tasks waiting in the queue for an environment? (Both points are sketched below.)
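
To make these two points concrete, here is a rough sketch of what we imagine; the queue names (`env_a`, `env_b`), the app name and the `pending_deploys` helper are made up, and the message-count trick assumes the broker is RabbitMQ:

```python
from celery import Celery
from kombu import Queue

app = Celery('deploytool', broker='amqp://')

# One queue per environment instead of the single default "celery" queue.
# (Celery 4+ naming; older versions use the CELERY_QUEUES setting.)
app.conf.task_queues = [Queue('env_a'), Queue('env_b')]

# Each deployment is routed to its environment's queue at call time:
#   deploy.apply_async(args=[env_id, release_id], queue='env_a')
#
# and each worker consumes exactly one of these queues, e.g.:
#   celery -A deploytool worker -Q env_a --concurrency=1
#   celery -A deploytool worker -Q env_b --concurrency=1


def pending_deploys(queue_name):
    """Number of messages still waiting in an environment's queue.

    RabbitMQ only: a passive queue_declare reports the current message
    count without creating or modifying the queue.
    """
    with app.connection_or_acquire() as conn:
        return conn.default_channel.queue_declare(
            queue=queue_name, passive=True).message_count
```

But is this the idiomatic way to do it, or does Celery offer something cleaner for this kind of setup?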