views: 317

answers: 4

Python seems to have many different packages available to assist with parallel processing on an SMP-based system or across a cluster. I'm interested in building a client-server system in which a server maintains a queue of jobs and clients (local or remote) connect and run jobs until the queue is empty. Of the packages listed above, which is recommended, and why?

Edit: In particular, I have written a simulator which takes in a few inputs and processes things for a while. I need to collect enough samples from the simulation to estimate a mean within a user-specified confidence interval. To speed things up, I want to be able to run simulations on many different systems, each of which reports back to the server at some interval with the samples it has collected. The server then calculates the confidence interval and determines whether the client process needs to continue. After enough samples have been gathered, the server terminates all client simulations, reconfigures the simulation based on past results, and repeats the process.

With this need for intercommunication between the client and server processes, I question whether batch scheduling is a viable solution. Sorry, I should have been clearer to begin with.
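
Roughly, this is the kind of server I have in mind, sketched with nothing but the standard library's multiprocessing.managers; the port, authkey, queue names, and the simple stopping test are placeholders for my real setup:

    import queue
    import threading
    from multiprocessing.managers import BaseManager

    job_q = queue.Queue()      # simulation configurations handed out to clients
    result_q = queue.Queue()   # batches of samples reported back by clients

    class QueueManager(BaseManager):
        pass

    QueueManager.register('get_job_q', callable=lambda: job_q)
    QueueManager.register('get_result_q', callable=lambda: result_q)

    manager = QueueManager(address=('', 50000), authkey=b'simkey')
    server = manager.get_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # Remote clients register the same names, connect() to ('server-host', 50000),
    # pull a configuration from get_job_q() and push sample batches to get_result_q().

    samples = []
    while len(samples) < 10000:          # placeholder for the real CI-based stopping rule
        samples.extend(result_q.get())   # block until the next batch of samples arrives
    # ...recompute the confidence interval, refill job_q, or tell the clients to stop.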

A: 

Given that you tagged your question "scientific-computing" and mention a cluster, some kind of MPI wrapper seems the obvious choice if the goal is to develop parallel applications, as one might guess from the title. Then again, the text of your question suggests you want to develop a batch scheduler, so I'm not sure which question you're asking.

janneb
There would be no interaction between client processes, but the queue/server would change based upon individual client process results. I just need to be able to manage a bunch of jobs dynamically across many workstations.
Bryan Ward
Oh. In that case, what most people do is write scripts that submit many jobs to an existing batch scheduler like Condor or SLURM, parse the output, and submit new jobs as necessary. That's a lot less effort than writing a custom batch scheduler.
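
For example, a driver around SLURM's command-line tools might look something like the sketch below; the job script name, the polling interval, and the way results are checked are assumptions, not a prescription:

    import subprocess
    import time

    def submit(script="simulate.sh"):
        """sbatch prints 'Submitted batch job <id>'; return the id."""
        out = subprocess.check_output(["sbatch", script]).decode()
        return out.strip().split()[-1]

    def still_queued(job_id):
        """squeue -h -j <id> lists the job only while it is pending or running."""
        out = subprocess.run(["squeue", "-h", "-j", job_id],
                             capture_output=True, text=True).stdout
        return bool(out.strip())

    job_ids = [submit() for _ in range(10)]
    while any(still_queued(j) for j in job_ids):
        time.sleep(60)
    # ...parse the jobs' output files here and, if the estimate is not yet
    # precise enough, submit another round of jobs.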
janneb
A: 

The simplest way to do this would probably be to output the intermediate samples to separate files (or a database) as they finish, and have a process occasionally poll those output files to see if they are sufficient or if more jobs need to be submitted.
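
A minimal sketch of that polling loop, assuming each finished job writes one float sample per line into results/*.txt and that a 0.05 half-width at 95% confidence is the target (both are assumptions, adjust to taste):

    import glob
    import math
    import time

    def read_samples(pattern="results/*.txt"):
        """Collect one float sample per line from every finished output file."""
        samples = []
        for path in glob.glob(pattern):
            with open(path) as f:
                samples.extend(float(line) for line in f if line.strip())
        return samples

    def ci_half_width(samples, z=1.96):
        """Approximate 95% confidence-interval half-width of the sample mean."""
        n = len(samples)
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / (n - 1)
        return z * math.sqrt(var / n)

    while True:
        samples = read_samples()
        if len(samples) >= 30 and ci_half_width(samples) <= 0.05:
            break              # precise enough; stop or reconfigure the clients here
        # otherwise leave the jobs running (or submit more) and check again later
        time.sleep(60)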

Noah
+1  A: 

There are also now two different Python wrappers around the map/reduce framework Hadoop:

http://code.google.com/p/happy/

http://wiki.github.com/klbostee/dumbo

Map/Reduce is a nice development pattern with lots of recipes for solving common classes of problems.

If you don't already have a cluster, Hadoop itself is nice because it has full job scheduling, automatic distribution of data across the cluster (i.e. HDFS), etc.
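
To give a feel for the pattern, here is a rough sketch in dumbo's style, where a job is just a pair of generator functions; the input format (one sample per line) and the single "samples" key are assumptions for illustration, and the exact dumbo invocation may differ:

    def mapper(key, value):
        # each input line is assumed to hold one simulation sample
        yield "samples", float(value)

    def reducer(key, values):
        values = list(values)
        yield key, (len(values), sum(values) / len(values))  # (count, mean)

    if __name__ == "__main__":
        import dumbo
        dumbo.run(mapper, reducer)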

alecf
+1  A: 

Have a go with ParallelPython. It seems easy to use, and should provide the jobs-and-queues interface that you want.
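
Something along these lines, for instance; run_batch is a stand-in for the real simulator, and the batch size and 0.05 half-width target are made up for illustration:

    import math
    import pp

    def run_batch(seed, n):
        """Stand-in simulation: run n trials and return their samples."""
        import random
        random.seed(seed)
        return [random.gauss(0.0, 1.0) for _ in range(n)]

    def half_width(samples):
        """Approximate 95% confidence-interval half-width of the sample mean."""
        n = len(samples)
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / (n - 1)
        return 1.96 * math.sqrt(var / n)

    # An empty ppservers tuple uses local cores only; list remote hosts running
    # ppserver.py to spread jobs across workstations.
    ppservers = ()  # e.g. ("node1:35000", "node2:35000")
    job_server = pp.Server(ppservers=ppservers)

    samples, seed = [], 0
    while len(samples) < 100 or half_width(samples) > 0.05:
        jobs = [job_server.submit(run_batch, (seed + i, 1000)) for i in range(4)]
        seed += 4
        for job in jobs:
            samples.extend(job())  # job() blocks until that result is ready
    print("%d samples, mean %.4f" % (len(samples), sum(samples) / len(samples)))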

Sam Doshi