Hello from Spain,

I have an issue here. Suppose you have a periodic business task, for instance generating a balance each month. That task could run in a farm, so if the computer running the periodic task fails, the task must be passed to another computer.

So, how could I persist a periodic task and make it safe in a farm? I have thought of a persistent, shared queue, but I am quite stuck.

Any ideas?

Thanks in advance.

+1  A: 

Going from 1 to >1 is always a big step.

Option 1

You'll need (amongst other things):

  • Supervisor machines
  • Heart-beat protocol / failure detection
  • Election protocol for Job scheduler
  • Job Scheduler
  • Job reporting

etc.
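The heart-beat and election items above can be combined into one mechanism, a lease: whichever machine holds an unexpired lease acts as the job scheduler, and it keeps the lease alive by sending heart-beats. What follows is only a minimal sketch of that idea. `LeaseStore`, `heartbeat`, and the node names are hypothetical, and the in-memory object stands in for a shared store (a database row or a lock service) that would have to perform the check-and-update atomically.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: Optional[str] = None   # node currently acting as scheduler
    expires_at: float = 0.0        # time after which the lease is up for grabs


class LeaseStore:
    """In-memory stand-in for a shared store (assumption: a real farm would
    use a database row or lock service with an atomic compare-and-set)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl = ttl_seconds
        self.lease = Lease()

    def heartbeat(self, node: str, now: float) -> bool:
        """Acquire or renew the lease; return True if `node` is the scheduler."""
        expired = now >= self.lease.expires_at
        if self.lease.holder in (None, node) or expired:
            # Free, already ours, or the previous holder stopped heart-beating.
            self.lease = Lease(holder=node, expires_at=now + self.ttl)
            return True
        return False
```

Each machine would call `heartbeat` on a timer shorter than the TTL and run the periodic task only while the call returns `True`. If the current scheduler dies, its lease expires and the next machine to heart-beat takes over.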

Option 2

Just have multiple machines do the task and select the successful output from one of them (or use majority voting if you feel you need it).
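A rough sketch of how the results from Option 2 could be combined, assuming each machine reports either its output or `None` on failure. The function names and the sample values are made up for illustration.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def first_success(results: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the first non-failed output, or None if every machine failed."""
    for result in results:
        if result is not None:
            return result
    return None


def majority_result(results: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the output reported by a strict majority of all machines.

    Failed machines (None) count against the majority; returns None when
    no single output is reported by more than half of the machines.
    """
    successful = [r for r in results if r is not None]
    if not successful:
        return None
    value, count = Counter(successful).most_common(1)[0]
    return value if count > len(results) / 2 else None
```

With three machines, one crash is tolerated by either function. Majority voting additionally guards against a single machine producing a wrong result, at the cost of needing at least two machines to agree.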

jldupont
is my answer satisfactory?
jldupont
+1  A: 

There are a lot of factors missing from your question. How are you accessing the task? Is it a web service? A remote procedure call? Does it run on its own and then store the results in a shared folder?

If it's just web services, then the solution could be to query them in order: if one is unavailable, proceed to the next. RPC could probably be handled with the same procedure. Of course, this does not scale well and it's a little ad hoc, but it might just do the trick if you don't have time for anything else.
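The "query them in order" approach might look something like this. It is only a sketch: each worker is modeled as a plain callable that raises when its machine is unavailable, and `call_first_available` is a hypothetical helper, not part of any library. In practice each callable would wrap the actual web service or RPC call.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def call_first_available(workers: Iterable[Callable[[], T]]) -> T:
    """Try each worker in order and return the first successful result."""
    errors: List[Exception] = []
    for worker in workers:
        try:
            return worker()
        except Exception as exc:  # unavailable or failed: move on to the next
            errors.append(exc)
    raise RuntimeError(f"all workers failed: {errors}")
```

Note that this puts the failover logic in the caller, so the caller itself becomes the single point of failure.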

If you have the time AND money needed to really scale, you should take a look at control reconfiguration, which is the basis for the kind of fault tolerance you're seeking. Of course, that will imply a controller (supervisor machines, as @jldupont calls them) as well as a lot of mechanisms and effort to keep everything together.

It's worth it only if you really need it. It is a great investment in both time and money, so don't jump into it just because it's cool.

Jorge Córdoba
+1  A: 

I'm not sure what your technology stack is, but take a look at Quartz (or Quartz.NET if you use the .NET stack). Quartz is an enterprise job scheduler with robust fail-over/HA capabilities.
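Clustered schedulers of this kind typically coordinate through a shared database. The sketch below is not Quartz code. It only illustrates the general idea, assuming a hypothetical `job_runs` table whose primary key lets exactly one node claim each run of a periodic job. SQLite is used purely to keep the example self-contained.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the shared store and create the (hypothetical) job_runs table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_runs ("
        " job TEXT NOT NULL, period TEXT NOT NULL, node TEXT NOT NULL,"
        " PRIMARY KEY (job, period))"
    )
    return conn


def try_claim_run(conn: sqlite3.Connection, job: str, period: str, node: str) -> bool:
    """Return True if `node` won the right to run `job` for `period`."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO job_runs (job, period, node) VALUES (?, ?, ?)",
                (job, period, node),
            )
        return True
    except sqlite3.IntegrityError:
        # Another node already inserted the row for this job and period.
        return False
```

Every node in the farm would attempt the claim when the schedule fires, and only the winner runs the task. A real fail-over setup would also need a way to release or time out a claim when the winning node dies mid-run.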

Jason