views:

161

answers:

7

I've got a list of timestamps (in ticks), and from this list I'd like to create another one that represents the delta time between entries.

Let's just say, for example, that my master timetable looks like this:

  1. 10
  2. 20
  3. 30
  4. 50
  5. 60
  6. 70

What I want back is this:

  1. 10
  2. 10
  3. 20
  4. 10
  5. 10

What I'm trying to accomplish here is detect that #3 in the output table is an outlier by calculating the standard deviation. I've not taken statistics before, but I think if I look for the prevalent value in the output list and throw out anything outside of 1 sigma that this will work adequately for me.

I'd love to be able to create the output list with a single LINQ query, but I haven't figured it out yet. Currently I'm just brute forcing it with a loop.

A: 

LINQ is not really designed for what you're trying to do here, because it usually evaluates value by value, much like an extremely efficient combination of for-loops. You'd have to know your current index, something you don't, without some kind of workaround.

Joachim VR
You certainly do know your current index, it's just bad if you're working with a collection without an index accessor (e.g. IEnumerable<T>).
Iain Galloway
+6  A: 

If you are running .NET 4.0, this should work fine:

var deltas = list.Zip(list.Skip(1), (current, next) => next - current);

Apart from the multiple enumerators, this is quite efficient; it should work well on any kind of sequence.

Here's an alternative for .NET 3.5:

var deltas = list.Skip(1)
                 .Select((next, index) => next - list[index]);

Obviously, this idea will only be efficient when the list's indexer is employed. Modifying it to use ElementAt may not be a good idea: quadratic run-time will occur for non IList<T> sequences. In this case, writing a custom iterator is a good solution.

EDIT: If you don't like the Zip + Skip(1) idea, writing an extension such as this (untested) maybe useful in these sorts of circumstances:

public class CurrentNext<T>
{
    public T Current { get; private set; }
    public T Next { get; private set; }

    public CurrentNext(T current, T next)
    {
        Current = current;
        Next = next;
    }
}

...

public static IEnumerable<CurrentNext<T>> ToCurrentNextEnumerable<T>(this IEnumerable<T> source)
{
    if (source == null)
        throw new ArgumentException("source");

    using (var source = enumerable.GetEnumerator())
    {
        if (!enumerator.MoveNext())
            yield break;

        T current = enumerator.Current;

        while (enumerator.MoveNext())
        {
            yield return new CurrentNext<T>(current, enumerator.Current);
            current = enumerator.Current;
        }
    }
}

Which you could then use as:

var deltas = list.ToCurrentNextEnumerable()
                 .Select(c=> c.Next - c.Current);
Ani
+1. Brilliant!! Of course this is .NET 4.0. Don't forget you will miss the last element in the list. You will need to prepend 0 to the `list.Skip(1)` list.
Steven
@Steven: Could you explain that? I don't see how. Count(deltas) should be Count(sourceEnumerable) - 1, no?
Ani
I'm using .NET 3.5. :( but thanks for the 3.5 equivalent! :)
Dave
To be honest, in 3.5 I'd write my own extension method rather than use the index accessor inside the LINQ. If you really want, though, you can get the source code for the Zip extension and just use that in 3.5 - http://blogs.msdn.com/b/ericwhite/archive/2009/07/05/comparing-two-open-xml-documents-using-the-zip-extension-method.aspx
Iain Galloway
@Ani: The `Enumerable.Zip` extension method will return a sequence that has the same number of elements as the shortest supplied sequence (see MSDN documentation). Because you do `Skip(1)`, the second sequence will be one shorter than the first. This means that you will miss one element.
Steven
@Steven - The OP has 6 elements in the source list and 5 in the output. Typically, the differences between consecutive entities will be one fewer than the original entities. A five-story building is only going to have four flights of stairs.
Jeffrey L Whitledge
@Ani, @Jeffrey: You are right, the sequence has to be one shorter.
Steven
+3  A: 

This should do the trick:

static IEnumerable<int> GetDeltas(IEnumerable<int> collection)
{
    int? previous = null;

    foreach (int value in collection)
    { 
        if (previous != null)
        {
            yield return value - (int)previous;
        }
        previous = value;
    }
}

Now you can call your collection like this:

var masterTimetable = GetMasterTimeTable();

var deltas = GetDeltas(masterTimetable);

It's not really LINQ, but will effectively do the trick.

Steven
Note that this will also give the difference between the 1st item and 0, which is not exactly what the OP asked for.
Justin
@Steven thanks! but don't I also need to ignore / Skip() the first entry in the output IEnumerable<> that your code generates?
Dave
@Dave: I now see that I don't understand how you want to calculate the deltas. Still using an iterator method will be very effective. Especially when getting the results as stream directly from the database or another source such as file system. A solution with `list.Zip(list.Skip(1))` will call the database twice.
Steven
@Steven I used Ani's approach with success. I liked yours as well, but the issue that I *think* it has is that you check the delta between the first element and previous (which is 0), but I want the first delta value to be value[1] - value[0].
Dave
@Dave: Now I see. I fixed the bug :-)
Steven
A: 

Not that I recommend this, but totally abusing LINQ the following would work:

var vals = new[] {10, 20, 30, 50, 60, 70};

int previous = 0;
var newvals = vals.Select(i =>
                            {
                                int dif = i - previous;
                                previous = i;
                                return dif;
                            });
foreach (var newval in newvals)
{
    Console.WriteLine(newval);
}
Davy8
+2  A: 

You can use Ani's answer:-

var deltas = list.Zip(list.Skip(1), (current, next) => next - current);

With a super-simple implementation of the Zip extension method:-

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
  this IEnumerable<TFirst> first,
  IEnumerable<TSecond> second,
  Func<TFirst, TSecond, TResult> func)
{
  var ie1 = first.GetEnumerator();
  var ie2 = second.GetEnumerator();

  while (ie1.MoveNext() && ie2.MoveNext())
    yield return func(ie1.Current, ie2.Current);
}

That'll work with 3.5.

Iain Galloway
+1  A: 

It looks like there are sufficient answers to get you going already, but I asked a similar question back in the spring:

How to zip one ienumerable with itself

In the responses to my question, I learned about "Pairwise" and "Pairwise"

As I recall, explicitly implementing your own "Pairwise" enumerator does mean that you iterate through you list exactly once whereas implementing "Pairwise" in terms of .Zip + .Skip(1) means that you will ultimately iterate over your list twice.

In my post I also include several examples of geometry (operating on lists of points) processing code such as Length/Distance, Area, Centroid.

wageoghe
A: 

One liner for you:

int[] i = new int[] { 10, 20, 30, 50, 60, 70 };
IEnumerable<int> x = Enumerable.Range(1, i.Count()-1).Select(W => i[W] - i[W - 1]);
Vivek