views:

160

answers:

6

While multithreading is faster in some cases, sometimes we just want to spawn multiple worker processes to do work. This has the benefit of not crashing the main app if one of the workers crashes, and the user doesn't need to worry much about locking and synchronization.

COM+'s Application Pooling seems like a good way to achieve this on Windows. The downside is that we need to write a COM+ wrapper for the worker process.

However, when I search for Application Pooling on Google, it seems like most of its usages are related to IIS. Don't other applications (such as scientific/graphics) find it useful to spawn multiple worker processes?

So there are several questions:

  • Why isn't COM+ more popular in areas other than IIS? If I write a non-IIS application and want to use process management on Windows, should I go with COM+ or are there better alternatives out there?

  • What would be the cross-platform way to do it? Are there libraries out there that give me a "process pool" (worker processes will intelligently pick up work, can be managed, etc.)?

+3  A: 

You might want to investigate how the Apache web server manages process pools. From version 2.0 it runs natively on Windows, and one of the multi-processing models it supports is process pools. Part of Apache is also APR (Apache Portable Runtime), which handles platform-specific issues.

zvrba
+2  A: 

No one can answer why something is not popular; maybe nobody else is looking for what you are looking for. After .NET came into the picture, people shifted from COM to the managed environment. Before .NET, COM, ATL and related technologies were quite painful to implement; they would crash and were also quite difficult to debug.

That is the reason the managed environment came into existence.

However, from .NET 4 onwards, the parallel libraries give the user much more power for parallel programming, and you can also spawn and control other processes.

For multiplatform, you can look at zvrba's answer.

Akash Kava
Thanks for letting me know about .NET 4's Parallel Extensions! It's obviously a very edgy thing and not many people have written howtos, etc. about it.
kizzx2
@Kizzx2, it's very new; I guess it was released at the beginning of 2010, but Microsoft's MSDN documentation provides good insight that is easy to understand and implement.
Akash Kava
+2  A: 

Yes, other applications--especially science applications--find it useful to spawn multiple processes. Since few super-computers run Microsoft Windows, scientists generally avoid using anything that ties them to a Microsoft platform. Nothing related to COM will help scientists leverage their enormous existing code base written in Fortran.

People who choose to run IIS have generally already drunk the Microsoft Koolaid, so they have fewer inhibitions to tying themselves to Microsoft's proprietary platforms, which is why COM-specific terminology will get lots of hits related to IIS.

One of the open standards for doing what you want is the Message Passing Interface. Several implementations exist and some of them run on supercomputers using Fortran. Some of them run on cheaper computers using sexier languages.

See http://en.wikipedia.org/wiki/Message_Passing_Interface

bbadour
I looked into MPI once. I was thinking that it would be the ultimate way to run parallel programs. For my scope, though, I just want to run a "process pool" on a local computer. All the MPI implementations I have looked at required starting a process with some sort of `mpirun` command and then basing the whole application on top of the MPI platform. I guess that's too far off for my use case here.
kizzx2
The other cross platform methods are fork or exec using sockets, pipes, RPC, a message queue or some combination for interprocess communication. The same calculus regarding vendor lock-in applies.
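
To make the exec-plus-pipes option concrete, here is a minimal Python sketch, not a definitive implementation. It assumes a line-oriented protocol invented for this example (one integer per line in, its square per line out); `WORKER_SOURCE` and `square_via_worker` are hypothetical names. `subprocess.Popen` is used because it works on both Windows and POSIX systems.

```python
import subprocess
import sys

# Hypothetical worker program: reads one integer per line from stdin
# and writes its square to stdout until stdin is closed.
WORKER_SOURCE = """
import sys
for line in sys.stdin:
    print(int(line) ** 2, flush=True)
"""


def square_via_worker(numbers):
    """Send each number to a long-lived child process over a pipe and read the reply."""
    worker = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    results = []
    try:
        for n in numbers:
            worker.stdin.write(f"{n}\n")
            worker.stdin.flush()  # make sure the worker sees the request now
            results.append(int(worker.stdout.readline()))
    finally:
        worker.stdin.close()  # end-of-input tells the worker to exit
        worker.wait()
        worker.stdout.close()
    return results


if __name__ == "__main__":
    print(square_via_worker([1, 2, 3]))
```

Sockets or a message queue would follow the same pattern: the parent and the worker agree on a small protocol and never share memory.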
bbadour
MPI tends to be pretty poor as a solution for multithreading. It tends to be much better at managing large-scale parallelism, not general concurrency.
Gian
Indeed, threads are much better for handling multithreading. Multiple processes are better at the things enumerated in the original question. COM+ is no doubt at least nearly as heavy-weight as MPI.
bbadour
+2  A: 

There hasn't been a mob rushing through the doors of COM application pooling primarily because of two factors:

  1. COM is a pain in the ass to deal with compared to just about anything else
  2. Threading can be a headache, but it's a lot easier and more convenient to manage than inter-process communication

COM application pooling was essentially created for IIS. It has one very specific benefit over normal multithreading: the multiple processes are fully isolated from each other. This is important for data security and for app stability when dealing with third party plugins of questionable stability.

Scientific computing generally doesn't need strong data security isolation between operations, and I would venture to guess that scientific computing doesn't rely much on third party plugins of questionable stability. When doing big math operations, you're either using a sexy numerics library that had better be rock solid to be taken seriously, or you're using your own code, in which case crashes should be fixed and repeat offenders should be spanked.

Oh, and all crashes except stack overflow can be trapped and dealt with within a multithreaded app, especially if it's your own code.

In short, COM app pooling is overkill for just about anything other than IIS.

dthorpe
1) When COM is a pain in the ass compared to "anything else," what is that anything else? 2) I guess in High Performance Computing, multi-process is the only way to go -- when your architecture scales across multiple machines or even networks.
kizzx2
3) Do you mean trapped by a `try, catch` block? I think multi-process is more stable because even if I catch an error in a thread, I often don't know how to deal with it. In a multi-process environment I'd just kill myself and let other people pick up the rest. In a multi-threaded environment, I can't just say "I don't know how to deal with this -- start over!" because starting over for myself means starting over for everybody. Yes, it can be done with a well-designed fault-handling mechanism, but then we start to re-invent what process separation in the OS originally offered.
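
The "kill myself and let the others carry on" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the worker script, the job names and the `run_jobs` helper are all hypothetical, and the crash is simulated with `os._exit(1)`. The point is that the supervisor only ever sees an exit code, so a dead worker cannot corrupt its state.

```python
import subprocess
import sys

# Hypothetical worker: dies abruptly on the job "bad", otherwise prints a result.
WORKER_SOURCE = """
import os, sys
job = sys.argv[1]
if job == "bad":
    os._exit(1)  # simulate a hard crash with no cleanup
print(job.upper())
"""


def run_jobs(jobs):
    """Run each job in its own process; a crashed worker yields None, the rest continue."""
    results = {}
    for job in jobs:
        proc = subprocess.run(
            [sys.executable, "-c", WORKER_SOURCE, job],
            capture_output=True,
            text=True,
        )
        # A non-zero exit code means the worker died; the supervisor is unaffected.
        results[job] = proc.stdout.strip() if proc.returncode == 0 else None
    return results


if __name__ == "__main__":
    print(run_jobs(["alpha", "bad", "beta"]))
```

A real supervisor would probably retry or reassign the failed job instead of recording `None`; that policy is left out here.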
kizzx2
So in direct response to the original question "What are the common strategies?", your answer would be "the common strategy is that people don't do it -- they use threading instead"?
kizzx2
+2  A: 

I can't offer any answers to the COM aspect of your question, but it's worth noting there's another world (besides HPC MPI) where multi-processing (rather than the more common multi-threading approach) is apparently alive, well and thriving: Python.

Why? Python's GIL ("global interpreter lock") cripples most attempts to multithread Python code so badly that multiprocessing is the generally recommended approach to parallelising Python on SMP. The standard library includes process pools; there are various other options too.
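
As a rough sketch of the standard-library pool, assuming a toy CPU-bound task: `multiprocessing.Pool` is the real API, while `count_primes` and `count_primes_in_pool` are names made up for this example. The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module (Windows, for instance).

```python
import math
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound toy task: count the primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


def count_primes_in_pool(limits, workers=4):
    """Distribute the limits over a pool of worker processes; results keep input order."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    print(count_primes_in_pool([10, 100, 1000]))
```

Each worker is a separate interpreter with its own GIL, so the tasks can run on different cores at the same time.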

Python certainly ought to satisfy any multi-platform requirement!

timday
Excellent! I admit my original question about the COM part was a little subjective. Python is an excellent choice and the fact that you can call Python from native code (C) seems to make this less intrusive than having to adopt a full library (APR) (again, subjective).
kizzx2
A: 

Google's web browser Chrome uses a multi-process architecture. It is open source, so you can check out its code and see how it manages processes.

tuesday