views: 38
answers: 3

Let's say I have to generate a bunch of result files, and I want to do it as fast as possible. Each result file is generated independently of the others; in fact, one could say that each result file is agnostic of every other result file. The resources used to generate each result file are also unique to it. How can I dynamically decide the optimal number of threads to run simultaneously in order to minimize the overall run time? Is my only option to write my own thread manager that watches performance counters and adjusts accordingly, or do there exist solid classes that already accomplish this?

+1  A: 

I'd go for a thread pool and possibly async file operations. Writing your own thread manager will very likely perform worse than what the default scheduler does for you.

Here's a nice article showing some of the problems with doing it yourself. Your code would also have to account for things like Hyper-Threading, which gives you virtual CPUs only, not real cores, so the load is not always what you'd expect when watching performance counters.
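
The question doesn't name a language, so as an illustration only, here's a minimal Python sketch of the thread-pool approach (the function names and text payloads are made up for the example). The point is to hand the tasks to a pool and let it choose the worker count, rather than tuning thread numbers by hand:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def generate_result_file(path, payload):
    """Generate one result file; each task is independent of the others."""
    with open(path, "w") as f:
        f.write(payload)
    return path

def generate_all(out_dir, payloads):
    # Let the pool pick a sensible default worker count instead of
    # hand-tuning thread counts against performance counters.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(generate_result_file,
                        os.path.join(out_dir, f"result_{i}.txt"), data)
            for i, data in enumerate(payloads)
        ]
        # result() re-raises any exception from the worker thread.
        return [f.result() for f in futures]
```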

Lucero
+1  A: 

Will doing this in a multithreaded manner really do more than cause context-switching overhead? Unless you have more than one disk you're writing to, you're only going to write one file at a time no matter how many threads you throw at it.
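
Rather than guessing, you can check this point on your own hardware. A rough Python timing sketch (file count and size are arbitrary; on a single spinning disk the threaded run often won't be much faster):

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

def write_file(path, data):
    with open(path, "wb") as f:
        f.write(data)

def write_sequential(paths, data):
    for p in paths:
        write_file(p, data)

def write_threaded(paths, data, workers=8):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces all writes to complete before the pool exits.
        list(pool.map(lambda p: write_file(p, data), paths))

def compare(out_dir, n_files=20, size=1 << 20):
    """Return (sequential_seconds, threaded_seconds) for n_files writes."""
    data = os.urandom(size)
    seq = [os.path.join(out_dir, f"seq_{i}") for i in range(n_files)]
    par = [os.path.join(out_dir, f"par_{i}") for i in range(n_files)]
    start = time.perf_counter()
    write_sequential(seq, data)
    t_seq = time.perf_counter() - start
    start = time.perf_counter()
    write_threaded(par, data)
    t_par = time.perf_counter() - start
    return t_seq, t_par
```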

glowcoder
+4  A: 

Without further details I would assume that this task is I/O bound and not CPU bound, so you'll probably only add overhead by launching multiple threads. I would recommend using async I/O and thereby let the thread pool handle the details. Admittedly, that may not turn out to be the optimal solution, but it would still be my first attempt, as chances are it will be good enough.
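
As an illustration of the "async I/O, let the pool handle it" idea, here's a hedged Python sketch (file names and payloads are invented for the example). The blocking writes are handed to the event loop's default thread pool via `asyncio.to_thread`, so the loop stays free to schedule the remaining writes:

```python
import asyncio
import os

def _write(path, data):
    with open(path, "w") as f:
        f.write(data)

async def write_result(path, data):
    # open()/write() are blocking calls, so run them on the loop's
    # default thread pool instead of blocking the event loop.
    await asyncio.to_thread(_write, path, data)
    return path

async def generate_results(out_dir, payloads):
    tasks = [
        write_result(os.path.join(out_dir, f"result_{i}.txt"), data)
        for i, data in enumerate(payloads)
    ]
    # gather() runs all writes concurrently and collects the paths.
    return await asyncio.gather(*tasks)
```

From synchronous code this would be driven with `asyncio.run(generate_results(out_dir, payloads))`.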

Brian Rasmussen