If I need to generate a fairly large dataset using LINQ, and it may take a while (say, a few seconds), and I would like to give the user feedback on the percentage done, is there an easy or preferred way to do this?

For example, say I have list A with 1000 cars and list B with 1000 trucks, and I want to select all possible ordered (car, truck) pairs where car.color == truck.color, like this:

var pairs = from car in A 
            from truck in B 
            where car.color==truck.color 
            select new {car, truck};

Now, at some point this will be evaluated as a set of nested foreach loops. I would like to be able to report the percentage complete as it iterates, and ideally update a progress bar or something similar.

EDIT: Just after my query, I store the result in a member variable as a list like this (which forces the query to execute):

mPairs = pairs.ToList();

I do this because I am executing the query in a background worker thread: I do not want the UI thread to freeze while the LINQ expression is evaluated on demand (this is in Silverlight, BTW). That is why I would like to report progress. The UX is basically this:

  1. A user drags an item onto the workspace
  2. The engine then kicks off on a background thread to determine the (many) connection possibilities to all of the other items on the workspace.
  3. While the engine is calculating the UI does not allow new connections AND reports progress to indicate when the new item will be "connectable" to the other items (all the possible connection paths not already in use have been determined via LINQ).
  4. When the engine completes the calculation (query), the item is connectable in the UI and the possible connection paths are stored in a local variable for future use (e.g. when the user clicks to connect the item all the possible paths will be highlighted based upon what was calculated when it was added)

(a similar process must happen on deletion of an item)

+2  A: 

EDIT: This doesn't currently work because query expressions don't allow braces. Editing...

You could always add a "no-op" select or where clause which showed progress:

public class ProgressCounter
{
    private readonly int total;
    private int count;
    private int lastPercentage;

    public ProgressCounter(int total)
    {
        this.total = total;
    }

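    // Returns bool (always true) so the call can be used as a "where" clause
    // that never filters anything out.
    public bool Update()
    {
        count++;
        int currentPercentage = (count * 100) / total;
        // Only report when the whole-number percentage actually changes.
        if (currentPercentage != lastPercentage)
        {
            Console.WriteLine("Done {0}%", currentPercentage);
            lastPercentage = currentPercentage;
        }
        return true;
    }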
}

...

var progressCounter = new ProgressCounter(A.Count * B.Count);

var pairs = from car in A
            from truck in B
            where progressCounter.Update()
            where car.color==truck.color
            select new {car, truck};

Note the use of side-effects, which is always nasty. I hope you'd use a join if this were really the query, btw :)
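For reference, a minimal sketch of what the join form of this query might look like. The Car and Truck classes, the Color field, and the MatchByColor method are hypothetical stand-ins for whatever A and B actually contain. Note that with a join the amount of work is no longer A.Count * B.Count, so a progress counter would need a different total.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element types; the question does not show the real ones.
public class Car { public string Color; }
public class Truck { public string Color; }

public static class JoinExample
{
    // Pairs every car with every truck of the same color. LINQ to Objects
    // builds a lookup of the trucks by color, so it does not have to test
    // every (car, truck) combination.
    public static List<Tuple<Car, Truck>> MatchByColor(
        IEnumerable<Car> cars, IEnumerable<Truck> trucks)
    {
        var pairs = from car in cars
                    join truck in trucks on car.Color equals truck.Color
                    select Tuple.Create(car, truck);
        return pairs.ToList();
    }
}
```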

We've been thinking of adding a sort of operator like this to MoreLINQ - called Pipe, Apply, Via, or something like that.

Jon Skeet
+1  A: 

Most of LINQ uses lazy evaluation, so the query is not actually executed until you foreach over the result. Every time you 'pull' a result from pairs, a piece of the query is evaluated.

That means you can simply display progress in the foreach loop that iterates over the result. The downside is that you don't know in advance how big the result set is going to be, and counting the result set would also iterate over the results and execute the query.
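A minimal sketch of that idea, assuming a hypothetical helper named ToListWithCount: it replaces the ToList() call and reports a running count of results as each one is pulled from the lazy query. Because the total is unknown, it reports a count rather than a percentage, and it only fires when a result is produced, so long stretches without matches report nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyProgressExample
{
    // Materializes a lazy sequence one item at a time, invoking onItem with
    // the running result count after each item is pulled from the query.
    public static List<T> ToListWithCount<T>(IEnumerable<T> query, Action<int> onItem)
    {
        var results = new List<T>();
        foreach (var item in query) // each iteration evaluates one more piece of the query
        {
            results.Add(item);
            onItem(results.Count);
        }
        return results;
    }
}
```

With the query from the question, this could be called as `mPairs = LazyProgressExample.ToListWithCount(pairs, n => Console.WriteLine("{0} pairs so far", n));`.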

Mendelt
A: 

Well, mine was similar to Jon's, although I'm sure his approach would be much more concise. You can hack something together along the same lines:

var pairs = from car in A
            from truck in B
            let myProgress = UpdateProgress(...)
            where car.color == truck.color
            select new { car, truck };

private int UpdateProgress(...)
{
    Console.WriteLine("Updating Progress...");
    return -1;
}

As mentioned, though, the query won't be executed until it is iterated over. This also has the added disadvantage of creating a new range variable inside the query.

Quintin Robinson
+1  A: 

Something I used that worked well was an adapter for the DataContext that returned a count of the number of items it's yielded.

public class ProgressArgs : EventArgs
{
    public ProgressArgs(int count)
    {
        this.Count = count;
    }

    public int Count { get; private set; }
}

public class ProgressContext<T> : IEnumerable<T>
{
    private IEnumerable<T> source;

    public ProgressContext(IEnumerable<T> source)
    {
        this.source = source;
    }

    public event EventHandler<ProgressArgs> UpdateProgress;

    protected virtual void OnUpdateProgress(int count)
    {
        EventHandler<ProgressArgs> handler = this.UpdateProgress;
        if (handler != null)
            handler(this, new ProgressArgs(count));
    }

    public IEnumerator<T> GetEnumerator()
    {
        int count = 0;
        foreach (var item in source)
        {
            // The yield holds execution until the next iteration,
            // so trigger the update event first.
            OnUpdateProgress(++count);
            yield return item;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

Usage

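// Hypothetical helper (not in the class above): constructors cannot infer
// generic type arguments, and the element type here is anonymous, so a
// generic method is needed to create the context.
static ProgressContext<T> WithProgress<T>(IEnumerable<T> source)
{
    return new ProgressContext<T>(source);
}

...

var context = WithProgress(
    from car in A
    from truck in B
    select new { car, truck }
);
context.UpdateProgress += (sender, e) =>
{
    // Do your update here; e.Count is the number of items yielded so far
};

var query = from item in context
            where item.car.color == item.truck.color
            select item;

// This will trigger the updates
query.ToArray();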

The only issue is you can't easily do a percentage unless you know the total count. To do a total count often requires processing the entire list, which can be costly. If you do know the total count beforehand then you can work out a percentage in the UpdateProgress event handler.
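When the total is known up front (for example, the Count of a list), the percentage calculation can be sketched roughly as follows. This is a simplified variant that uses a callback instead of the event above; the WithPercentProgress name and its parameters are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PercentProgressExtensions
{
    // Passes items through unchanged and calls report(percent) whenever the
    // whole-number percentage changes. The caller must supply the total.
    public static IEnumerable<T> WithPercentProgress<T>(
        this IEnumerable<T> source, int total, Action<int> report)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (report == null) throw new ArgumentNullException("report");
        if (total <= 0) throw new ArgumentOutOfRangeException("total");
        return Iterate(source, total, report);
    }

    private static IEnumerable<T> Iterate<T>(
        IEnumerable<T> source, int total, Action<int> report)
    {
        long count = 0;      // long avoids overflow in count * 100
        int lastPercent = 0; // 0% itself is never reported
        foreach (var item in source)
        {
            count++;
            int percent = (int)(count * 100 / total);
            if (percent != lastPercent)
            {
                lastPercent = percent;
                report(percent);
            }
            yield return item;
        }
    }
}
```

Wrapping only the outermost list, as in `from car in A.WithPercentProgress(A.Count, p => ...) from truck in B ...`, gives a coarse indicator while keeping the number of updates small.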

Cameron MacFarland
I really like this approach. My LINQ expression consists of several nested from's - I could put this around the outer-most list to at least get a gross progress indicator (which is probably all I need anyway as I do not want to incur the overhead of updating too frequently.)
caryden