We have the following situation:

Component X divides a request file into parts and sends each part over the network to an independent processing component Y. Y replies with a result to component Z, which collects the results for all parts of the file into a single batch result file.

Note: the request file contains N data records that need to be processed.

What is the best practice for this situation? Is there a protocol for it? Is there a library that can help, or any applicable design patterns?

Thanks in advance.

+1  A: 

A message queue or service bus such as RabbitMQ might help. With it, you can connect all the distributed components and dispatch work and collect results reliably.

Whilst the service bus won't solve all your "problems", it would probably address the distributed and reliable communication parts.
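To make the message flow concrete, here is a minimal sketch of the split, process, and collect steps. It is only an illustration under stated assumptions: Python's in-process `queue.Queue` stands in for the broker (it is not RabbitMQ's API), requests and results travel on two separate queues, and the names `split_records`, `worker`, `collect`, `run_job`, and the `"job-1"` id are hypothetical. Each message carries its part index and the total part count so the collector can tell when the batch is complete and restore the original order.

```python
import queue
import threading


def split_records(records, part_size):
    """Component X: cut the record list into consecutive parts."""
    return [records[i:i + part_size] for i in range(0, len(records), part_size)]


def worker(requests, results, process):
    """Component Y: take parts off the request queue and publish results."""
    while True:
        msg = requests.get()
        if msg is None:  # sentinel: no more work for this worker
            break
        job_id, index, total, part = msg
        results.put((job_id, index, total, [process(r) for r in part]))


def collect(results, total):
    """Component Z: gather every part of one job, then restore record order."""
    parts = {}
    while len(parts) < total:
        _job_id, index, _total, output = results.get()
        parts[index] = output
    return [row for i in range(total) for row in parts[i]]


def run_job(records, part_size=3, n_workers=2, process=str.upper):
    """Wire X, Y and Z together for a single job (assumed id 'job-1')."""
    parts = split_records(records, part_size)
    if not parts:
        return []
    requests, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(requests, results, process))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for index, part in enumerate(parts):
        requests.put(("job-1", index, len(parts), part))
    for _ in threads:
        requests.put(None)  # one sentinel per worker
    batch = collect(results, len(parts))
    for t in threads:
        t.join()
    return batch
```

With a real broker the two `queue.Queue` objects would become named queues on the bus, and the workers would be separate processes on other machines; the message shape can stay the same.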

jldupont
Will the results be pushed onto the same queue as the requests?
Moro
@Moro: You decide; you have fine-grained control over the queues.
jldupont
+1  A: 

We use the command pattern to queue up units of work to process very large files.

The commands themselves are stored in a database (command entries are things like "process File X, lines 1-100", "process File X, lines 101-200", etc.). Any number of servers from a server farm can pick up one command, indicate they are working on it, and write back their result. A controller looks for abandoned work (picked up but no result written within X minutes) and can reset the work to be eligible for pickup again.
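A rough sketch of that command table follows. It is an assumption-laden illustration, not our production code: it uses SQLite from Python's standard library, and the table layout, the status values (`pending`, `working`, `done`), and the function names `open_store`, `enqueue`, `claim`, `complete`, and `reset_abandoned` are all hypothetical. The claim step re-checks the status in the `UPDATE` so that two servers racing for the same row cannot both win it.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Create the (hypothetical) command table if it does not exist."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS commands (
               id INTEGER PRIMARY KEY,
               file_name TEXT, first_line INTEGER, last_line INTEGER,
               status TEXT NOT NULL DEFAULT 'pending',
               claimed_by TEXT, claimed_at REAL, result TEXT)"""
    )
    return db


def enqueue(db, file_name, total_lines, chunk=100):
    """Store one command per line range, e.g. lines 1-100, 101-200, ..."""
    for first in range(1, total_lines + 1, chunk):
        last = min(first + chunk - 1, total_lines)
        db.execute(
            "INSERT INTO commands (file_name, first_line, last_line) VALUES (?, ?, ?)",
            (file_name, first, last),
        )
    db.commit()


def claim(db, server, now=None):
    """Pick up the oldest pending command and mark it as being worked on."""
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT id, file_name, first_line, last_line FROM commands "
        "WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    cur = db.execute(
        "UPDATE commands SET status = 'working', claimed_by = ?, claimed_at = ? "
        "WHERE id = ? AND status = 'pending'",
        (server, now, row[0]),
    )
    db.commit()
    return row if cur.rowcount == 1 else None  # None: another server got it


def complete(db, command_id, result):
    """Write the result back and mark the command as done."""
    db.execute(
        "UPDATE commands SET status = 'done', result = ? WHERE id = ?",
        (result, command_id),
    )
    db.commit()


def reset_abandoned(db, timeout_seconds, now=None):
    """Controller: make claimed-but-unfinished commands eligible again."""
    now = time.time() if now is None else now
    cur = db.execute(
        "UPDATE commands SET status = 'pending', claimed_by = NULL, claimed_at = NULL "
        "WHERE status = 'working' AND claimed_at < ?",
        (now - timeout_seconds,),
    )
    db.commit()
    return cur.rowcount
```

Each server would loop on `claim`, process the line range, and call `complete`; the controller would call `reset_abandoned` on a timer with the chosen timeout.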

Eric J.
How does each server get its part of the file's data content? Is the file shared between the servers?
Moro
Our current implementation reads the lines of the file into a temporary table. However, if I had time to redo the implementation, I would probably use a shared location. We run on Amazon AWS, so that shared location would probably be S3 for us.
Eric J.