ansaurus

Question

Can I use LINQ to retrieve only "on change" values?

Answer 1

+2 A:

You could use the overload of the `Where` extension method that also passes each element's index to the predicate.

var all = ds.Tables[0].AsEnumerable();
var weatherStuff = all.Where( (w,i) => i == 0 || w.Field<string>("Observation") != all.ElementAt(i-1).Field<string>("Observation") );
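A self-contained sketch of the same indexed-`Where` idea, assuming a plain list of strings rather than a DataTable (the class name, method name and sample values are made up for illustration). Copying the sequence into a `List<string>` first keeps the look-back at index i-1 a constant-time operation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class IndexedWhereDemo
{
    // Keeps the first element and every element that differs from its predecessor.
    public static List<string> OnChange(IEnumerable<string> source)
    {
        // Materialize once so list[i - 1] is a constant-time index access.
        List<string> list = source.ToList();
        return list.Where((value, i) => i == 0 || value != list[i - 1]).ToList();
    }

    static void Main()
    {
        var observations = new[] { "Sunny", "Sunny", "Cloudy", "Cloudy", "Rain", "Sunny" };
        // Prints: Sunny, Cloudy, Rain, Sunny
        Console.WriteLine(string.Join(", ", OnChange(observations)));
    }
}
```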

tvanfosson 2010-02-03 19:29:44

Ah - good answer, I hadn't thought of that. One caveat though is that if your `IEnumerable` doesn't actually have indexed access like `List<T>`, then I think the performance will be O(N²).

Aaronaught 2010-02-03 19:32:22

Noted -- I'm not sure what the EnumerableRowCollection's underlying storage mechanism is. I'd suspect that it's array-based, though.

tvanfosson 2010-02-03 19:52:21

Thanks for the response. Wouldn't element 0 NOT make the list if it and element 1 were different? It would seem that it would be skipped over as written. Not complaining, mind you, just trying to understand. I've still got a lot to learn about LINQ. It isn't a "natural" thing for me at this point. Your idea didn't seem to work for me as written - not sure why but I'm still messing with it at this point. Thanks again.

itsmatt 2010-02-05 11:42:14

I read your question as only "changed" values -- those different from the previous one. If you want to include element 0, then the condition is just different. I'll update.

tvanfosson 2010-02-05 11:47:43

Answer 2

A:

This is one of those instances where the iterative solution is actually better than the set-based solution in terms of both readability and performance. All you really want Linq to do is filter and pre-sort the list if necessary to prepare it for the loop.

It is possible to write a query in SQL Server (or various other databases) using windowing functions (ROW_NUMBER), if that's where your data is coming from, but very difficult to do in pure Linq without making a much bigger mess.

If you're just trying to clean the code up, an extension method might help:

public static IEnumerable<T> Changed<T>(this IEnumerable<T> items,
    Func<T, T, bool> equalityFunc)
{
    if (equalityFunc == null)
    {
        throw new ArgumentNullException("equalityFunc");
    }
    T last = default(T);
    bool first = true;
    foreach (T current in items)
    {
        if (first || !equalityFunc(current, last))
        {
            yield return current;
        }
        last = current;
        first = false;
    }
}

Then you can call this with:

var changed = rows.Changed((r1, r2) =>
    r1.Field<string>("Observation") == r2.Field<string>("Observation"));
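To see what the method yields, here is a small self-contained sketch that carries its own copy of the extension method (declared as `Changed<T>`) and runs it on a made-up list of strings instead of DataRow objects. The class names and sample data are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ChangedExtensions
{
    // Yields the first item, then every item for which
    // equalityFunc(current, previous) returns false.
    public static IEnumerable<T> Changed<T>(this IEnumerable<T> items,
        Func<T, T, bool> equalityFunc)
    {
        if (equalityFunc == null)
        {
            throw new ArgumentNullException("equalityFunc");
        }
        T last = default(T);
        bool first = true;
        foreach (T current in items)
        {
            if (first || !equalityFunc(current, last))
            {
                yield return current;
            }
            last = current;
            first = false;
        }
    }
}

static class ChangedDemo
{
    static void Main()
    {
        var observations = new List<string> { "Sunny", "Sunny", "Cloudy", "Cloudy", "Rain", "Sunny" };
        var changed = observations.Changed((a, b) => a == b);
        // Prints: Sunny, Cloudy, Rain, Sunny
        Console.WriteLine(string.Join(", ", changed));
    }
}
```

Because this is an iterator method, the source is walked exactly once and nothing runs until the result is enumerated.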

Aaronaught 2010-02-03 19:31:21

Here we go again. How was this answer wrong/misleading/unhelpful?

Aaronaught 2010-02-03 19:36:15

Thanks for the idea here. I'm going to try it out and see how your idea works. I do agree with your initial statement at least about the readability part (I can't speak intelligently on the performance aspects at this point). LINQ to me is still one of those things that feels a bit odd and awkward to write. I suspect that is mostly due to my lack of experience with it.

itsmatt 2010-02-05 11:51:13

Answer 3

A:

I think what you are trying to accomplish is not possible using the query "syntactic sugar" alone. However, it should be possible using the overload of the Select extension method that passes the index of the item you are evaluating. You could then use the index to compare the current item with the previous one (index - 1).
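A rough sketch of that idea, assuming an in-memory list of strings (the names and data are hypothetical): the indexed `Select` overload pairs each item with a flag saying whether it differs from the item at index - 1, and a `Where` then keeps only the flagged items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class IndexedSelectDemo
{
    public static List<string> OnChange(IList<string> items)
    {
        return items
            // Pair each item with whether it starts a new run of values.
            .Select((item, index) => new
            {
                Item = item,
                IsChange = index == 0 || item != items[index - 1]
            })
            .Where(x => x.IsChange)
            .Select(x => x.Item)
            .ToList();
    }

    static void Main()
    {
        var observations = new List<string> { "Fog", "Fog", "Clear", "Fog", "Fog" };
        // Prints: Fog, Clear, Fog
        Console.WriteLine(string.Join(", ", OnChange(observations)));
    }
}
```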

Carlos Loth 2010-02-03 19:32:29

Answer 4

+2 A:

Here is one more general thought that may be interesting. It's more complicated than what @tvanfosson posted, but in a way, I think it's more elegant :-). The operation you want to do is to group your observations using the first field, but you want to start a new group each time the value changes. Then you want to select the first element of each group.

This sounds almost like LINQ's group by, but it is a bit different, so you can't really use the standard group by. However, you can write your own version (that's the wonder of LINQ!). You can either write your own extension method (e.g. GroupByMoving), or you can write an extension method that changes the type from IEnumerable to an interface of your own and then define GroupBy for that interface. The resulting query will look like this:

var weatherStuff = 
  from row in ds.Tables[0].AsEnumerable().AsMoving()
  group row by row.Field<string>("Observation") into g
  select g.First();

The only thing that remains is to define AsMoving and implement GroupBy. This is a bit of work, but it is a generally useful thing that can be used to solve other problems too, so it may be worth doing :-). The summary of my post is that the great thing about LINQ is that you can customize how the operators behave to get quite elegant code.

I haven't tested it, but the implementation should look like this:

// Needs: using System.Collections; using System.Collections.Generic;
// Interface & simple implementation so that we can change GroupBy
interface IMoving<T> : IEnumerable<T> { }
class WrappedMoving<T> : IMoving<T> {
  public IEnumerable<T> Wrapped { get; set; }
  public IEnumerator<T> GetEnumerator() { 
    return Wrapped.GetEnumerator(); 
  }
  // Explicit implementation of the non-generic IEnumerable.GetEnumerator
  IEnumerator IEnumerable.GetEnumerator() { 
    return ((IEnumerable)Wrapped).GetEnumerator(); 
  }
}

// Important bits:
static class MovingExtensions { 
  public static IMoving<T> AsMoving<T>(this IEnumerable<T> e) {
    return new WrappedMoving<T> { Wrapped = e };
  }

  // This is (an ugly & imperative) implementation of the 
  // group by as described earlier (you can probably implement it
  // more nicely using other LINQ methods)
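  public static IEnumerable<IEnumerable<T>> GroupBy<T, K>(this IMoving<T> source, 
       Func<T, K> keySelector) {
    List<T> elementsSoFar = new List<T>();
    // Generic keys cannot be compared with !=, so use the default comparer
    EqualityComparer<K> comparer = EqualityComparer<K>.Default;
    using (IEnumerator<T> en = source.GetEnumerator()) {
      if (en.MoveNext()) {
        K lastKey = keySelector(en.Current);
        do { 
          K newKey = keySelector(en.Current);
          if (!comparer.Equals(newKey, lastKey)) { 
            // Key changed: emit the finished group and start a new one
            yield return elementsSoFar;
            elementsSoFar = new List<T>();
            lastKey = newKey;
          }
          elementsSoFar.Add(en.Current);
        } while (en.MoveNext());
        yield return elementsSoFar;
      }
    }
  }
}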

Tomas Petricek 2010-02-04 00:50:39

Thanks, Tomas. That is an interesting approach, though it certainly is longer (and seemingly more complex) than the way it is currently implemented. I haven't tried it out yet, but will and appreciate your taking the time to post your idea.

itsmatt 2010-02-05 11:46:34
