We aim to implement a distributed system on a cluster that will perform resource-intensive, image-based computing with heavy storage I/O. It has the following characteristics (a sketch of the resulting task model follows the list):

  1. There is a dedicated manager computer node and up to 100 compute nodes. The cluster must be easily expandable.
  2. It is built around a job-task concept. A job may have one to 100,000 tasks.
  3. A job, which is initiated by the user on the manager node, results in the creation of tasks on the compute nodes.
  4. Tasks create other tasks on the fly.
  5. Some tasks may run for minutes, while others may take many hours.
  6. The tasks run according to a dependency hierarchy, which may be updated on the fly.
  7. The job may be paused and resumed later.
  8. Each task requires specific resources in terms of CPU (cores), memory and local hard disk space. The manager should be aware of this when scheduling tasks.
  9. The tasks tell their progress and result back to the manager.
  10. The manager is aware of whether a task is alive or hung.
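
For reference, here is a minimal sketch (in Python, with hypothetical names; not tied to any particular framework) of the task model these requirements imply: a task carries its resource needs and dependencies, and the manager checks both before scheduling it on a node.

    from dataclasses import dataclass, field
    from typing import List, Set

    @dataclass
    class Resources:
        cores: int
        memory_mb: int
        disk_mb: int  # local disk space as a first-class resource unit

    @dataclass
    class Task:
        task_id: str
        requires: Resources
        depends_on: List[str] = field(default_factory=list)

    def can_schedule(task: Task, free: Resources, done: Set[str]) -> bool:
        """A node may take the task only if all of its dependencies have
        completed and the node has enough free cores, memory and disk."""
        return (all(dep in done for dep in task.depends_on)
                and free.cores >= task.requires.cores
                and free.memory_mb >= task.requires.memory_mb
                and free.disk_mb >= task.requires.disk_mb)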

We found Windows HPC Server 2008 (HPCS) R2 very close by concept to what we need. However, there are a few critical downsides:

  1. Creation of tasks gets exponentially slower as the number of tasks grows. Submitting more than a few thousand tasks takes an unbearable amount of time.
  2. A task cannot report its progress back to the manager; only the job can.
  3. There is no way to communicate with a task during its runtime, which makes it impossible to check whether the task is still running or needs restarting (a generic heartbeat pattern that would address this is sketched after this list).
  4. HPCS only knows nodes, CPU cores, and memory as resource units. We can't introduce resource units of our own (such as free disk space or custom hardware devices).
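
To illustrate downsides 2 and 3: what is missing is an ordinary heartbeat protocol between task and manager. A minimal sketch in Python (a generic pattern with hypothetical names and interval values; nothing here is an HPCS API):

    import time
    import threading

    HEARTBEAT_INTERVAL = 10  # seconds between reports (hypothetical tuning)
    HEARTBEAT_TIMEOUT = 30   # manager presumes a task hung after this long

    def heartbeat_loop(report, task_id, get_progress, stop: threading.Event):
        """Runs inside each task: periodically sends (task_id, progress,
        timestamp) to the manager over some transport `report`."""
        while not stop.is_set():
            report(task_id, get_progress(), time.time())
            stop.wait(HEARTBEAT_INTERVAL)

    def is_hung(last_seen: float) -> bool:
        """Manager side: a task is presumed hung if no heartbeat has
        arrived within the timeout."""
        return time.time() - last_seen > HEARTBEAT_TIMEOUT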

Here's my question: does anybody know of, or have experience with, a distributed computing framework that could help us? We are using Windows.

A: 

Have you looked at Beowulf? Lots of distributions to choose from, and lots of customization options. You ought to be able to find something to meet your needs...

Craig Trader
I need a Windows tool, because it has to run more or less existing code.
Pavel Radzivilovsky
Beowulf is not a true cluster or distributed computing framework since it exists only as a concept, not as a standardized collection of software, APIs, etc.
Joel Hoff
The same could be said for Linux, which is why I noted that there are specific Beowulf distributions (just like there are Linux distributions). I didn't recommend a specific distribution because (1) I'm not that familiar with them, and (2) the correct distribution to choose would depend upon the OP's needs, not mine.
Craig Trader
+3  A: 

I would take a look at the Condor high-throughput computing project. It supports Windows (as well as Linux and OS X) clients and servers, handles complex dependencies between tasks using DAGMan, and can suspend (and even move) tasks. I have experience with systems based on Condor that scale to thousands of machines across university campuses.
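
To give a sense of how DAGMan expresses dependencies: each task gets a Condor submit file, and a plain-text DAG file wires them together (a minimal sketch; a.sub through d.sub are hypothetical submit files). Condor then runs B and C only after A succeeds, and D after both B and C:

    # diamond.dag -- run with: condor_submit_dag diamond.dag
    JOB A a.sub
    JOB B b.sub
    JOB C c.sub
    JOB D d.sub
    PARENT A CHILD B C
    PARENT B C CHILD D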

Andrew Walker
A: 

I would recommend Beowulf because it behaves more like a single machine than like many separate workstations.

johnny26
Beowulf is not a true cluster or distributed computing framework since it exists only as a concept, not as a standardized collection of software, APIs, etc.
Joel Hoff
A: 

Give GridGain a try. It should make runtime addition of nodes very easy, and you can monitor/manage the cluster using JMX interfaces.

Nikolaus Gradwohl
It seems like GridGain talks a lot about Java and imposes hard constraints on how your tasks are (already) built.
Pavel Radzivilovsky
A: 

Have you examined Sun Grid Engine? It's been a long time since I used it, and I never used it to its full capabilities, but this is my understanding:

  1. There is a dedicated manager computer node and up to 100 compute nodes. The cluster must be easily expandable. yes
  2. It is built around a job-task concept. A job may have one to 100,000 tasks. not sure
  3. A job, which is initiated by the user on the manager node, results in the creation of tasks on the compute nodes. yes
  4. Tasks create other tasks on the fly. I think so?
  5. Some tasks may run for minutes, while others may take many hours. yes
  6. The tasks run according to a dependency hierarchy, which may be updated on the fly. not sure
  7. The job may be paused and resumed later. not sure
  8. Each task requires specific resources in terms of CPU (cores), memory and local hard disk space. The manager should be aware of this when scheduling tasks. pretty sure
  9. The tasks tell their progress and result back to the manager. pretty sure
  10. The manager is aware if the task is alive or hung. yes
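
To make some of those items concrete, here is what resource requests, dependencies, and hold/release look like in Grid Engine (a sketch; job names and scripts are hypothetical, and exact resource complexes such as h_vmem vary per site):

    # Submit a task requesting ~2 GB of virtual memory
    qsub -N prep -l h_vmem=2G prep.sh

    # Submit a task that starts only after "prep" has completed
    qsub -N render -hold_jid prep -l h_vmem=4G render.sh

    # Hold and release a queued job (closest analogue to pause/resume)
    qhold render
    qrls render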

Paul Nathan
Would Sun Grid Engine be able to run Windows software? It appears to run on Windows using Unix Services. Although it's not listed in the requirements, Pavel says in a comment that this is a must.
Vitor Py
@Vitor, @Paul: Grid Engine is ruled out by @Pavel's requirement to run on Windows. Yes, you can use GE on Windows with Unix Services as submit hosts (i.e. you can submit jobs to a cluster from a Windows machine), but the execution hosts (i.e. the grid where the jobs are executed) must be Linux/Unix/Solaris systems. Apart from that minor flaw, GE would be nearly ideal!
High Performance Mark
@High: Ugh. I had hoped that it would be Windows-able. Sigh.
Paul Nathan
A: 

If you don't mind hosting your project in a cloud, you might want to have a look at Windows Azure / AppFabric. AFAIK it allows you to distribute your jobs via workflows, and you can dynamically add more worker machines as the load increases.

Adrian Grigore
+2  A: 

Platform LSF will do everything you need. It runs on Windows. It is commercial, and can be purchased with support.

  1. There is a dedicated manager computer node and up to 100 compute nodes. The cluster must be easily expandable. yes
  2. It is built around a job-task concept. A job may have one to 100,000 tasks. yes
  3. A job, which is initiated by the user on the manager node, results in the creation of tasks on the compute nodes. yes
  4. Tasks create other tasks on the fly. yes
  5. Some tasks may run for minutes, while others may take many hours. yes
  6. The tasks run according to a dependency hierarchy, which may be updated on the fly. yes
  7. The job may be paused and resumed later. yes
  8. Each task requires specific resources in terms of CPU (cores), memory and local hard disk space. The manager should be aware of this when scheduling tasks. yes
  9. The tasks tell their progress and result back to the manager. yes
  10. The manager is aware if the task is alive or hung. yes
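
For reference, the LSF commands covering the trickier items: resource-aware scheduling, dependencies, and pause/resume (a sketch; job names and commands such as stage1_task are hypothetical):

    # Submit a task requesting 4 slots and ~4096 MB of memory
    bsub -J stage1 -n 4 -R "rusage[mem=4096]" stage1_task

    # Submit a task that runs only after stage1 finishes successfully
    bsub -J stage2 -w "done(stage1)" stage2_task

    # Suspend and later resume a job (pause/resume)
    bstop -J stage2
    bresume -J stage2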

semiuseless
A: 

You can definitely solve this sort of problem using DataSynapse GridServer.

  1. There is a dedicated manager computer node and up to 100 compute nodes. The cluster must be easily expandable. Yes, a Broker can easily handle 2000 Engines.
  2. It is built around a job-task concept. A job may have one to 100,000 tasks. Yes, I have queued in excess of 250,000 tasks without issue, although eventually you will run out of memory.
  3. A job, which is initiated by the user on the manager node, results in the creation of tasks on the compute nodes. yes
  4. Tasks create other tasks on the fly. It can be done, although I would not recommend this sort of model.
  5. Some tasks may run for minutes, while others may take many hours. yes
  6. The tasks run according to a dependency hierarchy, which may be updated on the fly. yes, but I would manage this outside of the grid computing infrastructure
  7. The job may be paused and resumed later. yes
  8. Each task requires specific resources in terms of CPU (cores), memory and local hard disk space. The manager should be aware of this when scheduling tasks. yes
  9. The tasks tell their progress and result back to the manager. yes

  10. The manager is aware if the task is alive or hung. yes

John Channing