I'm looking for any or all of the following:
- automatic detection of worker failure (the computer being switched off, for instance)
- detection of all running (Linux) PCs in a given IP address range (computer on)
- ... and automatic spawning of workers on them (ping + ssh?)
- load balancing so that workers do not slow down other processes (nice?)
- some form of message passing
... and I don't want to reinvent the wheel.
A C++ library, bash scripts, a standalone program ... all are welcome.
If you suggest a piece of software, please say which of the above features it provides.
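
To make the "ping + ssh" and "nice" ideas above concrete, here is a rough Python sketch of what I would otherwise end up writing by hand. It is only an illustration under assumptions: the function names are made up, the ping flags are the Linux iputils ones, passwordless ssh keys are assumed to be set up already, and `./worker` is a hypothetical worker binary. It does not cover failure detection or message passing.

```python
import ipaddress
import subprocess


def hosts_in_range(cidr):
    """List every usable host address in a CIDR block such as 192.168.1.0/24."""
    return [str(ip) for ip in ipaddress.ip_network(cidr, strict=False).hosts()]


def is_up(host, timeout_s=1):
    """Return True if the host answers a single ping.

    Assumes Linux iputils ping: -c is the packet count, -W the timeout in seconds.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def spawn_command(host, worker_cmd, niceness=19):
    """Build the ssh command that starts a worker at low priority on the host.

    BatchMode=yes makes ssh fail instead of prompting for a password.
    """
    return ["ssh", "-o", "BatchMode=yes", host, "nice", "-n", str(niceness)] + list(worker_cmd)


def discover(cidr, probe=is_up):
    """Return the hosts in the range for which probe(host) is true."""
    return [h for h in hosts_in_range(cidr) if probe(h)]
```

Usage would be something like looping over `discover("192.168.1.0/24")` and calling `subprocess.Popen(spawn_command(host, ["./worker"]))` for each host. Pinging hosts one at a time is slow on a large range, which is part of why I would rather use an existing tool.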