I'm firing off tasks using an ExecutorService, dispatching tasks that need to be grouped by task-specific criteria:

Task[type=a]
Task[type=b]
Task[type=a]
...

Periodically I want to output the average length of time that each task took (grouped by type) along with statistical information such as mean/median and standard deviation.

This needs to be pretty fast, of course, and ideally should not cause the various threads to synchronize when they report statistics. What's a good architecture for doing this?

+4  A: 

ThreadPoolExecutor provides beforeExecute and afterExecute methods that you can override. You could use those to record your statistics in a single ConcurrentHashMap, held as a member variable of your ThreadPoolExecutor subclass, keyed on some unique identifier for each task and storing the type, start time, and end time.

Calculate the statistics from the ConcurrentHashMap when you are ready to look at them.
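A minimal sketch of that idea, under some assumptions: the names TimingExecutor, TypedTask, and meanNanosByType are made up for illustration, the Runnable instance itself serves as the unique key, and tasks are handed over with execute() so the hooks see the original object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical marker for tasks that know their own type; not a JDK interface. */
interface TypedTask extends Runnable {
    String type();

    static TypedTask of(String type, Runnable body) {
        return new TypedTask() {
            @Override public String type() { return type; }
            @Override public void run() { body.run(); }
        };
    }
}

class TimingExecutor extends ThreadPoolExecutor {
    /** Type, start time, and end time of one task run. */
    static final class Timing {
        final String type;
        final long startNanos;
        long endNanos;
        volatile boolean done; // written after endNanos, so readers that see done also see endNanos

        Timing(String type, long startNanos) {
            this.type = type;
            this.startNanos = startNanos;
        }
    }

    // Keyed by task identity. Entries are never removed in this sketch,
    // so a long-running service would need to prune or aggregate them.
    private final ConcurrentHashMap<Runnable, Timing> timings = new ConcurrentHashMap<>();

    TimingExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        String type = (r instanceof TypedTask) ? ((TypedTask) r).type() : "unknown";
        timings.put(r, new Timing(type, System.nanoTime()));
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        Timing timing = timings.get(r);
        if (timing != null) {
            timing.endNanos = System.nanoTime();
            timing.done = true;
        }
        super.afterExecute(r, t);
    }

    /** Mean duration in nanoseconds per type, computed on demand from finished runs only. */
    Map<String, Double> meanNanosByType() {
        Map<String, long[]> acc = new HashMap<>(); // type -> {sum, count}
        for (Timing timing : timings.values()) {
            if (!timing.done) {
                continue; // still running
            }
            long[] a = acc.computeIfAbsent(timing.type, k -> new long[2]);
            a[0] += timing.endNanos - timing.startNanos;
            a[1]++;
        }
        Map<String, Double> means = new HashMap<>();
        acc.forEach((type, a) -> means.put(type, (double) a[0] / a[1]));
        return means;
    }
}
```

The worker threads only ever write to their own entry, so the hot path is one map put and one map get per task. Median and standard deviation could be computed in the same reporting pass, since every individual duration is retained.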

Adam Jaskiewicz
+2  A: 

Subclass ThreadPoolExecutor and track the execution events.

It's worth noting that the methods are invoked by the worker thread that executes the task, so you need to ensure thread safety for the execution tracking code.

Also, if you hand tasks over with submit(), the Runnables you receive will not be your own Runnables but FutureTasks wrapping them. Only tasks passed to execute() arrive unwrapped.
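One way around the wrapping, sketched here as an alternative rather than something the answer prescribes, is to override ThreadPoolExecutor's protected newTaskFor methods so that the wrapper remembers what it wraps. TrackedFuture, UnwrappingExecutor, and the originals queue are hypothetical names used only for this illustration.

```java
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RunnableFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical FutureTask subclass that keeps a reference to the task it wraps. */
final class TrackedFuture<V> extends FutureTask<V> {
    final Object original;

    TrackedFuture(Callable<V> callable) {
        super(callable);
        this.original = callable;
    }

    TrackedFuture(Runnable runnable, V result) {
        super(runnable, result);
        this.original = runnable;
    }
}

class UnwrappingExecutor extends ThreadPoolExecutor {
    // Records the original task objects seen by beforeExecute, for demonstration only.
    final Queue<Object> originals = new ConcurrentLinkedQueue<>();

    UnwrappingExecutor(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    // submit(Callable) calls this to build the wrapper it will execute.
    @Override
    protected <T> RunnableFuture<T> newTaskFor(Callable<T> callable) {
        return new TrackedFuture<T>(callable);
    }

    // submit(Runnable) and submit(Runnable, T) call this one.
    @Override
    protected <T> RunnableFuture<T> newTaskFor(Runnable runnable, T value) {
        return new TrackedFuture<T>(runnable, value);
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        // Submitted tasks arrive as TrackedFuture; tasks passed to execute() arrive as-is.
        Object task = (r instanceof TrackedFuture) ? ((TrackedFuture<?>) r).original : r;
        originals.add(task);
    }
}
```

With the original object in hand, the hook can read a type from it (for example through an interface the tasks implement) without keeping a separate Map from Future to Callable.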

Robert Munteanu
Your last point is very important. I had the same question for Callables submitted to a ThreadPoolExecutor. Unfortunately the Runnable that comes into beforeExecute is a FutureTask wrapping my Callable (and it looks like the same thing would be true of a submitted Runnable). As a result, there is no easy way to access your original Runnable/Callable. Disappointing... :( Looks like I'll be overriding submit() as well to keep a Map from Future to Callable. I hope it doesn't slow things down too much.
Matt Passell
+1  A: 

I believe the two other answers are correct, but maybe a bit too complicated (although my answer, while simple, is probably not quite as performant as theirs).

Why not just use Atomic variables to keep track of your stats, such as the number of tasks run and the total execution time (divide the total time by the count and you get the average execution time)? Pass these variables into your Runnable for each task. Atomic variables do not take a lock; they are updated with compare-and-swap, so unless your tasks are extremely short-lived I do not think the overhead of updating them will impact you.
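A rough sketch of the atomic-counter approach, assuming one set of counters per task type. TypeStats, StatsRegistry, and timed are hypothetical names, and durations are recorded in microseconds so that the sum of squares used for the standard deviation has room before a long overflows.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Running totals for one task type. */
final class TypeStats {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong sumMicros = new AtomicLong();
    private final AtomicLong sumSquaresMicros = new AtomicLong();

    void record(long micros) {
        count.incrementAndGet();
        sumMicros.addAndGet(micros);
        sumSquaresMicros.addAndGet(micros * micros);
    }

    long count() {
        return count.get();
    }

    double mean() {
        long n = count.get();
        return n == 0 ? 0.0 : (double) sumMicros.get() / n;
    }

    /** Population standard deviation from the running sums. */
    double stdDev() {
        // The three counters are read separately, so a report taken while
        // tasks are still finishing is only approximately consistent.
        long n = count.get();
        if (n == 0) {
            return 0.0;
        }
        double mean = (double) sumMicros.get() / n;
        double variance = (double) sumSquaresMicros.get() / n - mean * mean;
        return Math.sqrt(Math.max(0.0, variance));
    }
}

final class StatsRegistry {
    private final ConcurrentHashMap<String, TypeStats> byType = new ConcurrentHashMap<>();

    TypeStats forType(String type) {
        return byType.computeIfAbsent(type, k -> new TypeStats());
    }

    /** Wraps a task so that its duration is recorded under the given type. */
    Runnable timed(String type, Runnable body) {
        TypeStats stats = forType(type);
        return () -> {
            long start = System.nanoTime();
            try {
                body.run();
            } finally {
                stats.record(TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - start));
            }
        };
    }
}
```

A task would then be dispatched as executor.submit(registry.timed("a", task)). Running sums give the mean and standard deviation, but not the median, which needs the individual samples or an approximating sketch. Under heavy contention, java.util.concurrent.atomic.LongAdder may scale better than AtomicLong for the counters.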

Gandalf