Hi there,

I have a generic check that needs to be run on ca. 1000 objects. The check takes about 3 seconds. We have a server with 4 processors (and we also have other multi-processor servers in our network) so we would like to create an exe / dll to do the checking and return the results to the "master".

Does anyone know of a framework for this, or how would one go about it in C#?

Specifically:

  • What's the best way to transfer data between the master and the worker process?
  • How would the master ensure that 4 processes are always running at any one time, starting a new one as soon as a worker process finishes?
  • How do I register that a worker has finished and append its results to a list?

Hope it's clear enough but happy to clarify.
A.

Some clarifications
* There is actually no interprocess communication outside the call and return of the process e.g.

     ResultObject = WorkerProcess(HeresYourDataSonDoSomethingWithIt);
* Initially one machine is a must, but thinking further down the line we will probably have some cases where we have 6000 objects to check and we would like to farm it out to multiple servers, so we would like to make the right design choice from the beginning, or at least not develop a solution for one server that would have to be completely rewritten for multiple. Thanks!

+1  A: 

Take a look at the built-in ThreadPool. You can queue a "work item" for the thread pool. .NET will manage the threads, start new threads as required and make sure the "idle" threads get new work. Also, work should be equally distributed among the CPUs by the thread pool.
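
A minimal sketch of the idea, not a definitive implementation: each object is queued as a work item and a CountdownEvent (available from .NET 4) tells the caller when every item has finished. RunCheck and CheckAll are hypothetical names, and the short Thread.Sleep only stands in for the real 3-second check.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ThreadPoolCheckDemo
{
    // Hypothetical stand-in for the real check (assumption: one int per object).
    public static string RunCheck(int item)
    {
        Thread.Sleep(10);
        return "checked " + item;
    }

    // Queues one work item per object and blocks until all have completed.
    public static List<string> CheckAll(IList<int> items)
    {
        var results = new List<string>();
        if (items.Count == 0) return results;

        using (var done = new CountdownEvent(items.Count))
        {
            foreach (int item in items)
            {
                int captured = item; // private copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        string result = RunCheck(captured);
                        lock (results) { results.Add(result); } // List<T> is not thread-safe
                    }
                    finally
                    {
                        done.Signal(); // count the item as finished even if the check throws
                    }
                });
            }
            done.Wait();
        }
        return results;
    }

    public static void Main()
    {
        var items = new List<int>();
        for (int i = 0; i < 20; i++) items.Add(i);
        List<string> results = CheckAll(items);
        Console.WriteLine(results.Count + " results collected");
    }
}
```

The results arrive in completion order, not input order, so the real result object would need to carry the identity of the object it belongs to.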

Thorsten Dittmar
The OP mentioned that other servers might participate; threading on its own won't suffice over the network.
Adam
+3  A: 

Hi, the best way to transfer data between processes in C# is using .NET remoting. As the machines are on the same network you can use binary serialisation, with IPC channels between processes on the same machine and TCP channels between machines, which should be very fast. You don't need 4 processes on the server machine, just multiple threads. If you are using .NET 4.0, check out the parallel extensions; they can greatly simplify your code, especially if you are not familiar with multi-threading and the pitfalls that come with it.
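
As a rough illustration of the parallel extensions route, the sketch below uses Parallel.ForEach with MaxDegreeOfParallelism to cap the number of checks running at once (4 here, matching the 4 processors in the question). RunCheck and CheckAll are hypothetical names, and the pass/fail rule is invented purely so the example has something to return.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelCheckDemo
{
    // Hypothetical stand-in for the real check: even numbers "pass".
    public static bool RunCheck(int item)
    {
        Thread.Sleep(10);
        return item % 2 == 0;
    }

    // Runs the check on every item, with at most maxWorkers checks in flight.
    public static IDictionary<int, bool> CheckAll(IEnumerable<int> items, int maxWorkers)
    {
        var results = new ConcurrentDictionary<int, bool>(); // safe for concurrent writes
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxWorkers };

        Parallel.ForEach(items, options, item =>
        {
            results[item] = RunCheck(item);
        });

        return results; // ForEach returns only after every item is done
    }

    public static void Main()
    {
        IDictionary<int, bool> results = CheckAll(Enumerable.Range(0, 20), 4);
        int passed = results.Values.Count(ok => ok);
        Console.WriteLine(passed + " of " + results.Count + " objects passed");
    }
}
```

Keying the results by the input item avoids any dependence on the order in which checks complete.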

Also, this is probably overkill for your needs but you could look at DryadLINQ from the Microsoft Research Labs.

Phill
WCF supports cross-process comms; .NET remoting is the old way.
Adam
+2  A: 

If you must use separate processes you might want to consider defining all the tasks and putting them in a queue. MSMQ and SQL Server are two options.

Then let each process pull a task from the queue until the queue is empty.

With four processors, you could simply spin up four polling processes and give each affinity to a particular CPU.
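
The pull-from-a-queue pattern can be sketched in a single process. Note the substitution: this uses an in-memory ConcurrentQueue and four tasks in place of MSMQ or SQL Server and four separate processes, so it shows only the shape of the polling loop, not the cross-process plumbing. RunCheck and Drain are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class QueueWorkersDemo
{
    // Hypothetical stand-in for the real check.
    public static string RunCheck(int item)
    {
        Thread.Sleep(10);
        return "checked " + item;
    }

    // Starts workerCount workers that each pull items until the queue is empty.
    public static List<string> Drain(IEnumerable<int> items, int workerCount)
    {
        var queue = new ConcurrentQueue<int>(items); // all tasks defined up front
        var results = new ConcurrentBag<string>();
        var workers = new Task[workerCount];

        for (int w = 0; w < workerCount; w++)
        {
            workers[w] = Task.Factory.StartNew(() =>
            {
                int item;
                // A worker that finishes one item immediately takes the next,
                // so workerCount checks stay in flight until the queue runs dry.
                while (queue.TryDequeue(out item))
                {
                    results.Add(RunCheck(item));
                }
            }, TaskCreationOptions.LongRunning);
        }

        Task.WaitAll(workers);
        return results.ToList();
    }

    public static void Main()
    {
        List<string> results = Drain(Enumerable.Range(0, 20), 4);
        Console.WriteLine(results.Count + " results collected");
    }
}
```

Because the workers only ever talk to the queue, replacing the in-memory queue with a durable one is what would let the same loop run in processes on other servers.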

Mark Seemann
I would advise not giving affinity - let the OS handle that.
Adam
@Adam: I agree, but the OP asked about how to ensure that all four processors are in use...
Mark Seemann
True. If the work is heavy enough, however, and the server isn't running anything else, this will occur naturally. If the OS is running something else and affinity is forced, it will potentially starve other processes - this is why I said "advise" :)
Adam
+1 for MSMQ either way; it is a fully supported framework for this sort of situation, though it has message size limitations.
Adam
Not sure if we "must" use separate processes, but we have an identical job to be done on a large number of objects. Having all 4 processors in use is not important. The server will be dedicated to the task.
Andrew White
In that case, .NET 4 contains much improved support for parallelism...
Mark Seemann
+1  A: 

Your bigger concern here should be reliability. If you use worker processes instead of threads, especially when those processes run on other machines, the odds of failure increase significantly. That kind of failure can be very difficult to deal with: it is hard to recover from a process dying on an unhandled exception or somebody tripping over a power cord.

Avoid committing to an architecture that significantly complicates the code but might never actually be used. This will be much easier to do when you use threads. That can still scale in the long run, since many-core CPUs are the future.

WCF is otherwise the correct technology to use.

Hans Passant