For a new project that I'm working on I need to set up a centralized computing environment with a primary server (master) and several workstations (slaves).

The master will receive several types of job orders with different intervals and expiration dates, for example:

NAME        INTERVAL        EXPIRATION        OPERATIONS
Job A       5 m             9 d               2592
Job B       15 m            30 d              2880
Job C       30 m            90 d              4320
Job D       10 m            50 d              7200
Job E       10 m            20 d              2880
Job F       10 m            10 d              1440

The slaves will request new jobs to process via an API and, once they are finished, they will send the relevant data (stored in SQLite) to the server and request a new job.

I see nothing difficult with the slaves part; however, I'm facing a small problem with the master server: what would be the best option to serve the jobs to the slaves? I've thought of the following options:

  1. Pre-calculate and add to the database the times when the jobs will be executed.
  2. Serve each job dynamically with some sort of logic (is the MOD operator a possible solution here?).
  3. Randomly serve jobs.
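As a rough illustration of option 2, the number of runs that are due can be derived from the clock alone, without pre-calculating a schedule. The sketch below uses integer division rather than MOD: a MOD check only says whether "now" falls exactly on an interval boundary, so a late poll could miss a run, whereas integer division counts every run that should have happened so far. The `Job` class, its field names, the `next_job` helper, and the "largest backlog first" rule are all assumptions made for this example, as is the choice that the first run becomes due one interval after the job order arrives.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    start: int        # epoch seconds when the job order arrived (assumed field)
    interval: int     # seconds between runs
    expiration: int   # seconds the job order stays active
    served: int = 0   # runs already handed out to workstations

    def total_runs(self) -> int:
        # Billable operations: expiration / interval.
        return self.expiration // self.interval

    def due_runs(self, now: int) -> int:
        # Runs that should have been started by `now`, capped at the total.
        elapsed = max(0, now - self.start)
        return min(elapsed // self.interval, self.total_runs())

    def pending(self, now: int) -> int:
        # Runs that are due but have not been handed out yet.
        return self.due_runs(now) - self.served


def next_job(jobs, now):
    """Hand out one run of the job with the largest backlog, or None."""
    candidates = [j for j in jobs if j.pending(now) > 0]
    if not candidates:
        return None
    job = max(candidates, key=lambda j: j.pending(now))
    job.served += 1
    return job
```

Because `served` never exceeds `total_runs()`, each client is served exactly the number of operations they are billed for, even if the workstations fall behind for a while.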

I should mention that the last option is not really an option: I'm charging by the number of operations (expiration / interval), and I don't think my clients would be happy paying for someone else's operations.
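To make the billing arithmetic concrete, here is a small check of the OPERATIONS column using operations = expiration / interval. The `job_orders` dictionary and the `billable_operations` function are names invented for this example; the intervals and expirations are the ones from the table above.

```python
from datetime import timedelta

# (interval, expiration) per job order, copied from the table above.
job_orders = {
    "Job A": (timedelta(minutes=5), timedelta(days=9)),
    "Job B": (timedelta(minutes=15), timedelta(days=30)),
    "Job C": (timedelta(minutes=30), timedelta(days=90)),
    "Job D": (timedelta(minutes=10), timedelta(days=50)),
    "Job E": (timedelta(minutes=10), timedelta(days=20)),
    "Job F": (timedelta(minutes=10), timedelta(days=10)),
}


def billable_operations(interval: timedelta, expiration: timedelta) -> int:
    # Floor-dividing two timedeltas yields a plain integer count.
    return expiration // interval


operations = {
    name: billable_operations(interval, expiration)
    for name, (interval, expiration) in job_orders.items()
}

for name, count in operations.items():
    print(f"{name}: {count}")
```

For Job A this gives 9 d = 12960 m, and 12960 / 5 = 2592, which matches the table.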

I would like to hear your thoughts on how I should do this.

+1  A: 

You will get better throughput by letting the slaves request a job when they have time to process it.

If the master distributes the jobs, some slaves could sit idle while others are overloaded.

You will also need to add some functionality to make a job available again if the slave does not return an answer within a certain period of time.
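A minimal sketch of that idea, assuming an in-memory queue on the master: a slave that requests a job gets a time-limited lease on it, and a job whose lease runs out goes back into the ready queue. The `LeaseQueue` class, its method names, and the `lease_seconds` parameter are made up for this example; a real master would presumably keep this state in its database rather than in memory.

```python
import time
from collections import deque


class LeaseQueue:
    def __init__(self, lease_seconds, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.clock = clock      # injectable so the timeout can be tested
        self.ready = deque()    # job ids waiting for a slave
        self.leased = {}        # job id -> lease deadline

    def add(self, job_id):
        self.ready.append(job_id)

    def request(self):
        """Called when a slave asks for work; returns a job id or None."""
        self._requeue_expired()
        if not self.ready:
            return None
        job_id = self.ready.popleft()
        self.leased[job_id] = self.clock() + self.lease_seconds
        return job_id

    def complete(self, job_id):
        """Called when a slave returns results; False if the lease is unknown."""
        return self.leased.pop(job_id, None) is not None

    def _requeue_expired(self):
        # Put jobs whose slave never answered back into the ready queue.
        now = self.clock()
        expired = [j for j, deadline in self.leased.items() if deadline <= now]
        for job_id in expired:
            del self.leased[job_id]
            self.ready.append(job_id)
```

One consequence of this design is that a slow slave may still deliver results after its lease has expired and the job has been reissued, so the master has to decide whether to accept or discard late answers.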

Shiraz Bhaiji