Here's a simple way to do this.
Create a single, shared queue of work to do. One application fills this queue with work.
Create a worker application that gets one item from the queue and does the work.
This is the single-producer-multiple-consumer design. It works well and can swamp your machine with parallel processes.
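A minimal sketch of that design using the standard-library multiprocessing module; the work items (squaring numbers) and the worker count of four are placeholder choices for illustration:

```python
import multiprocessing

def worker(queue, results):
    # Each consumer loops: get one item, do the work, repeat
    # until it sees the None sentinel that means "no more work".
    while True:
        item = queue.get()
        if item is None:
            break
        results.put(item * item)

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    results = multiprocessing.Queue()
    consumers = [
        multiprocessing.Process(target=worker, args=(queue, results))
        for _ in range(4)
    ]
    for p in consumers:
        p.start()
    # The single producer fills the shared queue with work...
    for n in range(20):
        queue.put(n)
    # ...then sends one sentinel per consumer so each can shut down.
    for _ in consumers:
        queue.put(None)
    for p in consumers:
        p.join()
    total = sum(results.get() for _ in range(20))
    print(total)  # sum of squares 0..19, i.e. 2470
```

The sentinel-per-consumer trick is the usual way to drain this kind of queue cleanly: each worker consumes exactly one sentinel, so all of them exit.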
The built-in queue classes (http://docs.python.org/library/queue.html) only work within a single process, so you need to wrap the queue in some kind of multi-processing API. Personally, I like to create a small HTTP-based web server that manages the queue. Each worker application does a
GET to fetch the next piece of work.
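One possible shape for that small HTTP queue server, standard library only. The plain-text work format, the seed items, and letting the OS pick the port are all arbitrary choices for this sketch; the demo at the bottom plays the role of a worker doing three GETs:

```python
import queue
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

work = queue.Queue()
for item in ["task-1", "task-2", "task-3"]:  # seed the shared queue
    work.put(item)

class QueueHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Each worker does a GET to fetch its next piece of work.
        try:
            item = work.get_nowait()
        except queue.Empty:
            self.send_response(204)  # queue drained: nothing to hand out
            self.end_headers()
            return
        body = item.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo output quiet
        pass

if __name__ == "__main__":
    server = HTTPServer(("127.0.0.1", 0), QueueHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_address[1]
    # A worker's fetch loop, compressed into three GETs:
    for _ in range(3):
        print(urllib.request.urlopen(url).read().decode("utf-8"))
    server.shutdown()
```

A real version would use POST to report results and a richer payload (JSON, say), but the one-GET-per-work-item protocol is the whole idea.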
You can use tools like RabbitMQ to create a very nice shared queue.
http://nathanborror.com/posts/2009/may/20/working-django-and-rabbitmq/
You might be able to use HJB (http://hjb.python-hosting.com/) to work with JMS queues.
You'll need a small application to create and fill the queue with work.
Create as many copies of the application as you like. For example:
for i in 1 2 3 4 5 6 7 8 9 10
do
    python myapp.py &
done
This will run 10 concurrent copies of your application. All 10 are trying to get work from a single queue. They will use all available CPU resources and the OS will nicely schedule them for you.
Peer, node-to-node synchronization means you have n*(n-1)/2 communication paths among all nodes, which is O(n**2) growth.
The "two-adjacent-nodes" design means you still have 2*n communication paths, and work has to "somehow" trickle among the nodes. If the nodes are all initially seeded with work, then someone did a lot of planning to balance the workload. If you're going to do that much planning, why ask the nodes to synchronize at all?
If the queues are not carefully balanced to begin with, then every even node could be slow and every odd node fast. The odd nodes finish first, check for work from their two even neighbors, and find that those nodes are (a) not done and (b) don't have more work to hand over, either. What now? Half the nodes are working, half are idle, all due to poor planning in the initial distribution of work.
Master-slave means you have n communication paths. Further, the balancing is automatic since all idle nodes have equal access to work. There's no such thing as a biased initial distribution that leads to poor overall performance.
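The path counts above are easy to check; the sample sizes here are arbitrary:

```python
def peer_paths(n):
    return n * (n - 1) // 2   # every pair of nodes has its own path

def ring_paths(n):
    return 2 * n              # each node keeps a path to its two neighbors

def master_slave_paths(n):
    return n                  # one path per slave, all to the master

for n in (4, 10, 100):
    print(n, peer_paths(n), ring_paths(n), master_slave_paths(n))
# At n=100: 4950 peer paths vs. 200 ring paths vs. 100 master-slave paths.
```

The master-slave count grows linearly while the peer count grows quadratically, which is why the single-shared-queue design scales so much more gracefully.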