I'm looking for an open-source distributed computing framework for .NET / Mono that is not simply task-based but also supports persistence of distributed tasks.

The project at hand is a complex system simulation which needs to be split into smaller, independent "subsimulations". These subsimulations will keep running for a long time and will periodically receive data from, and send data back to, the Master, where a View with aggregate results is updated and presented to the user.

So the work to be distributed (the subsimulations) is stateful and should remain in existence (online or offline) at the Workers for a long time, across multiple sessions. This will require local persistent storage (serialization) by the Worker, because the subsimulations are quite large and it would not be efficient to send them back and forth to the Master for every session.
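
To make the persistence requirement concrete, here is a minimal sketch in Python (chosen only for brevity; the actual target is .NET / Mono). The names `Subsimulation` and `WorkerStore`, the JSON file layout, and the placeholder update rule are all assumptions made for illustration, not part of any existing framework.

```python
import json
import os
import tempfile


class Subsimulation:
    """Hypothetical stateful unit of work; a real model would be far larger."""

    def __init__(self, sim_id, step_count=0, value=0.0):
        self.sim_id = sim_id
        self.step_count = step_count
        self.value = value

    def step(self):
        # Placeholder update rule standing in for the real simulation logic.
        self.step_count += 1
        self.value += 0.5

    def to_dict(self):
        return {"sim_id": self.sim_id,
                "step_count": self.step_count,
                "value": self.value}


class WorkerStore:
    """Keeps subsimulation state on the Worker's local disk between sessions."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, sim_id):
        return os.path.join(self.directory, f"{sim_id}.json")

    def save(self, sim):
        # Write to a temporary file and rename it, so a crash mid-write
        # cannot leave a truncated checkpoint behind.
        fd, tmp_path = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "w") as f:
            json.dump(sim.to_dict(), f)
        os.replace(tmp_path, self._path(sim.sim_id))

    def load_or_create(self, sim_id):
        # Resume from the last checkpoint if one exists, else start fresh.
        path = self._path(sim_id)
        if os.path.exists(path):
            with open(path) as f:
                return Subsimulation(**json.load(f))
        return Subsimulation(sim_id)
```

In this sketch a Worker would call `load_or_create` at the start of a session and `save` at checkpoints, so only small result messages (not the whole state) ever travel to the Master.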

The framework should be transparent with respect to the underlying network or cloud platform by allowing different implementations to be plugged in (e.g. local cluster, Internet, single machine, 3rd party cloud platforms).
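
One way to picture this transparency is a small transport interface that the Master and Workers code against, with one implementation per platform. The sketch below is in Python for brevity; `Transport`, `InProcessTransport`, and `Master` are hypothetical names, and the in-process queue is only the "single machine" case.

```python
from abc import ABC, abstractmethod
from collections import deque


class Transport(ABC):
    """Hypothetical interface hiding the network or cloud platform."""

    @abstractmethod
    def send(self, destination, message):
        """Deliver a message to the named endpoint."""

    @abstractmethod
    def receive(self, destination):
        """Return the next message for the endpoint, or None if empty."""


class InProcessTransport(Transport):
    """Single-machine implementation backed by in-memory queues."""

    def __init__(self):
        self._queues = {}

    def send(self, destination, message):
        self._queues.setdefault(destination, deque()).append(message)

    def receive(self, destination):
        queue = self._queues.get(destination)
        return queue.popleft() if queue else None


class Master:
    """Aggregates Worker results; depends only on the Transport interface."""

    def __init__(self, transport):
        self.transport = transport
        self.results = {}

    def poll(self):
        # Drain all pending messages and keep the latest value per subsimulation.
        message = self.transport.receive("master")
        while message is not None:
            self.results[message["sim_id"]] = message["value"]
            message = self.transport.receive("master")
```

A cluster or cloud implementation would subclass `Transport` in the same way, leaving `Master` and the Worker code unchanged.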

It would be nice if performance could be tuned at the model/simulation level depending on network latency (for instance, by adjusting the frequency and granularity of the data sent between Workers and Master).
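
As a rough illustration of latency-dependent tuning, a Worker could smooth its measured round-trip times and stretch its reporting interval accordingly. The Python sketch below assumes a hypothetical `ReportScheduler`; the default bounds, the `latency_factor` multiplier, and the smoothing weight are arbitrary illustrative values.

```python
class ReportScheduler:
    """Picks how often a Worker reports to the Master, based on latency."""

    def __init__(self, min_interval=1.0, max_interval=60.0,
                 latency_factor=20.0, smoothing=0.2):
        self.min_interval = min_interval      # seconds, lower bound
        self.max_interval = max_interval      # seconds, upper bound
        self.latency_factor = latency_factor  # interval = factor * latency
        self.smoothing = smoothing            # weight given to the newest sample
        self.avg_latency = None

    def observe(self, latency_seconds):
        # Exponential moving average, so one slow round trip
        # does not swing the interval too abruptly.
        if self.avg_latency is None:
            self.avg_latency = latency_seconds
        else:
            self.avg_latency = (self.smoothing * latency_seconds
                                + (1.0 - self.smoothing) * self.avg_latency)

    def interval(self):
        # Before any measurement, report as often as allowed.
        if self.avg_latency is None:
            return self.min_interval
        raw = self.avg_latency * self.latency_factor
        return min(self.max_interval, max(self.min_interval, raw))
```

The same smoothed latency could also drive granularity, for example by sending coarser aggregates when the interval grows.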

I looked at NGrid, but it seems unfinished and dated. I also looked at some of the other usual suspects (MPAPI, MPI.NET, Alchemi, etc.), but as far as I can tell these don't meet the requirements. If no such framework exists, I'm interested in tips on the design of such a framework.

A: 

Have a look at Gearman (http://gearman.org/). It meets most of the requirements you mentioned. The main problem with Gearman is that the server has to be hosted on a Linux machine (unless you use the Perl or Python implementation of the server). The nodes can run on any environment / platform.

Nicolas
Thanks, looks interesting. I have already started designing my own solution using WCF, though. I guess my needs may be too application-specific for a general framework.
Peladao
+1  A: 

Have you looked at the Microsoft DSS / CCR framework? It's a service-oriented (SOA), concurrent framework initially developed for Microsoft's robotics platform. We previously used it to create a traffic simulator. It's not open source per se, but it's not too expensive, and I believe it's free for academia.

You would have to write the logic to create parallel jobs, but in theory this should not be too difficult. The framework comes with a number of management tools.

Hadoop is another alternative, and probably what I'd recommend. The storage requirements you mentioned are easier to meet with this solution, using the Hadoop file system (HDFS).

http://wiki.apache.org/hadoop/HadoopStreaming
http://stackoverflow.com/questions/339344/is-there-a-net-equivalent-to-apache-hadoop

The SO thread above also mentions MySpace's Qizmt: http://code.google.com/p/qizmt/

steve
DSS/CCR seems interesting. It's free now, according to MS. I still have to look deeper into it.
Peladao