Hello,

I am working on a simple server that exposes webservices to clients. Some of the requests may take a long time to complete, and are logically broken into multiple steps. For such requests, it is required to report progress during execution. In addition, a new request may be initiated before a previous one completes, and it is required that both execute concurrently (barring some system-specific limitations).

I was thinking of having the server return a TaskId to its clients, and having the clients track the progress of the requests using the TaskId. I think this is a good approach, and I am left with the issue of how tasks are managed.

Never having used the TPL, I was thinking it would be a good way to approach this problem. Indeed, it allows me to run multiple tasks concurrently without having to manually manage threads. I can even create multi-step tasks relatively easily using ContinueWith.

I can't come up with a good way of tracking a task's progress, though. I realize that when my requests consist of a single "step", then the step has to cooperatively report its state. This is something I would prefer to avoid at this point. However, when a request consists of multiple steps, I would like to know which step is currently executing and report progress accordingly. The only way I could come up with is extremely tiresome:

Task<int> firstTask = new Task<int>( () => { DoFirstStep(); return 3; } );
firstTask.
ContinueWith<int>( task => { UpdateProgress("50%"); return task.Result; } ).
ContinueWith<string>( task => { DoSecondStep(task.Result); return "blah"; } ).
ContinueWith<string>( task => { UpdateProgress("100%"); return task.Result; } );

And even this is not perfect since I would like the Task to store its own progress, instead of having UpdateProgress update some known location. Plus it has the obvious downside of having to change a lot of places when adding a new step (since now the progress is 33%, 66%, 100% instead of 50%, 100%).

Does anyone have a good solution?

Thanks!

A: 

I don't think the solution you are looking for will involve the Task API, or at least not directly. It doesn't support the notion of percentage complete, and the Task/ContinueWith functions would need to participate in that logic because the data is only available at that level. Only the final invocation of ContinueWith is in any position to know the percentage complete, and even then, computing it algorithmically will be a guess at best, because it doesn't know whether one task is going to take much longer than another. I suggest you create your own API to do this, possibly leveraging the Task API to do the actual work.
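To make the "own API" idea concrete, here is a minimal sketch of what such a wrapper could look like. The names TrackedWork and ProgressRegistry are hypothetical (they are not part of the TPL), and the sketch assumes the steps are plain Action delegates that do not pass results to each other and are all weighted equally. The percentage is derived from the number of steps, so adding a step does not require touching the others.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: holds the progress of one multi-step request.
public sealed class TrackedWork
{
    private int _completedSteps;

    public TrackedWork(int totalSteps) { TotalSteps = totalSteps; }

    public int TotalSteps { get; private set; }

    // The underlying TPL task, so callers can still wait on it or inspect faults.
    public Task Completion { get; internal set; }

    public int CompletedSteps
    {
        // CompareExchange(x, 0, 0) is used here as a thread-safe read.
        get { return Interlocked.CompareExchange(ref _completedSteps, 0, 0); }
    }

    // Assumes every step counts equally; real steps may differ a lot in duration.
    public int PercentComplete
    {
        get { return TotalSteps == 0 ? 100 : CompletedSteps * 100 / TotalSteps; }
    }

    internal void StepDone() { Interlocked.Increment(ref _completedSteps); }
}

// Hypothetical registry: maps the id handed to the client to its TrackedWork.
public static class ProgressRegistry
{
    private static readonly ConcurrentDictionary<Guid, TrackedWork> _work =
        new ConcurrentDictionary<Guid, TrackedWork>();

    public static Guid Start(params Action[] steps)
    {
        var work = new TrackedWork(steps.Length);

        // All steps run sequentially inside one task, which knows the full list.
        work.Completion = Task.Factory.StartNew(() =>
        {
            foreach (var step in steps)
            {
                step();
                work.StepDone();
            }
        });

        var id = Guid.NewGuid();
        _work[id] = work;
        return id;
    }

    public static bool TryGet(Guid id, out TrackedWork work)
    {
        return _work.TryGetValue(id, out work);
    }
}
```

A request handler could call ProgressRegistry.Start(DoFirstStep, DoSecondStep), return the Guid to the client, and answer later progress queries with TryGet and PercentComplete. Finished entries are never removed in this sketch, so a real server would also need some cleanup policy.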

Kirk Woll
Yes, I realized after posting the question that when using ContinueWith(...) there is actually no one big task that contains all other tasks, but rather a chain of tasks, so that any single task cannot know what's ahead of it. Thanks for your help!
circular designer
A: 

This isn't a scenario that the Task Parallel Library fully supports.

You might consider an approach where you feed progress updates to a queue and read them on another Task:

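using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static void Main(string[] args)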
{
    Example();
}

static BlockingCollection<Tuple<int, int, string>> _progressMessages = 
    new BlockingCollection<Tuple<int, int, string>>();

public static void Example()
{
    List<Task<int>> tasks = new List<Task<int>>();

    for (int i = 0; i < 10; i++)
        tasks.Add(Task.Factory.StartNew((object state) =>
            {
                int id = (int)state;
                DoFirstStep(id);
                _progressMessages.Add(new Tuple<int, int, string>(
                    id, 1, "10.0%"));
                DoSecondStep(id);
                _progressMessages.Add(new Tuple<int, int, string>(
                    id, 2, "50.0%"));

                // ...

                return 1;
            },
            (object)i
            ));

    Task logger = Task.Factory.StartNew(() =>
        {
            foreach (var m in _progressMessages.GetConsumingEnumerable())
                Console.WriteLine("Task {0}: Step {1}, progress {2}.",
                m.Item1, m.Item2, m.Item3);
        });


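    // Wait for the workers first, then mark the queue as complete so the
    // logger's GetConsumingEnumerable() loop can end. Without CompleteAdding()
    // the logger never finishes and waiting on it would block forever.
    Task.WaitAll(tasks.ToArray());
    _progressMessages.CompleteAdding();
    logger.Wait();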
    Console.ReadLine();
}

private static void DoFirstStep(int id)
{
    Console.WriteLine("{0}: First step", id);
}

private static void DoSecondStep(int id)
{
    Console.WriteLine("{0}: Second step", id);
}

This sample doesn't show cancellation or error handling, nor does it account for your requirement that your tasks may be long running. Long-running tasks place special requirements on the scheduler. More discussion of this can be found at http://parallelpatterns.codeplex.com/; download the book draft and look at Chapter 3.

This is simply an approach for using the Task Parallel Library in a scenario like this. The TPL may well not be the best approach here.
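On the cancellation and long-running points mentioned above, the following is a rough sketch rather than a complete treatment. It assumes cooperative cancellation checked between steps and uses TaskCreationOptions.LongRunning as a hint to the scheduler; LongRunningExample and its parameters are names made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper showing one way to start a long-running, cancellable task.
public static class LongRunningExample
{
    // Runs the given steps in order and returns how many steps completed.
    public static Task<int> Start(int id, Action<int>[] steps, CancellationToken token)
    {
        return Task.Factory.StartNew(() =>
        {
            int done = 0;
            foreach (var step in steps)
            {
                // Cooperative cancellation: only checked between steps.
                token.ThrowIfCancellationRequested();
                step(id);
                done++;
            }
            return done;
        },
        token,                            // lets the task end up in the Canceled state
        TaskCreationOptions.LongRunning,  // hint that this work may run for a long time
        TaskScheduler.Default);
    }

    // Waits for the task and reports cancellation or failure instead of throwing.
    public static string Describe(Task<int> task)
    {
        try
        {
            task.Wait();
            return "completed " + task.Result + " step(s)";
        }
        catch (AggregateException)
        {
            // Wait() wraps both cancellation and step exceptions in AggregateException.
            return task.IsCanceled ? "canceled" : "faulted";
        }
    }
}
```

The caller keeps the CancellationTokenSource and calls Cancel() on it when the client abandons the request. A step that is already running is not interrupted; it would have to check the token itself.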

If your web services are running inside ASP.NET (or a similar web application server), then you should also consider the likely impact of using threads from the thread pool to execute tasks rather than to service web requests:

http://stackoverflow.com/questions/2753273/how-does-task-parallel-library-scale-on-a-terminal-server-or-in-a-web-application

Ade Miller
Thanks for the detailed response! I agree that TPL may not be the best approach here. I was hoping to get the chance to use it, but it may have to wait for another day...
circular designer