views:

166

answers:

5

I've got an IEnumerable<T> containing a list of data elements with consistent intervals in one of the properties:

List<Interval> list = new List<Interval>
            { 
                new Interval{ TIME_KEY = 600},
                new Interval{ TIME_KEY = 605},
                new Interval{ TIME_KEY = 615},
                new Interval{ TIME_KEY = 620},
                new Interval{ TIME_KEY = 630}
            };

How can I query this list (using Linq, preferably), to get a List that looks like this:

 List<Interval> list = new List<Interval>
                { 
                    new Interval{ TIME_KEY = 610},
                    new Interval{ TIME_KEY = 625}
                };

?

EDIT: I will probably know what the interval distance is supposed to be, but if there's a way to determine it by examing the data, that would be a huge bonus!

EDIT: changed to numeric values

+3  A: 

Have a look at this question for an extension method which selects consecutive values. From there, you could do something like:

// I'd probably rename SelectBetween to SelectConsecutive
list.SelectConsecutive((x, y) => new { Original = x, Interval = y - x})
    .Where(pair => pair.Interval != 5)
    .Select(pair => new Interval(pair.Original + 5));

(Somewhat pseudocode, but I hope you see where I'm going.)

However, that would only generate one element when it's missing... if you went from 0 to 20, it wouldn't generate 5, 10, 15.

To put some meat on Henk's second suggestion:

var missing = Enumerable.Range(0, expectedElementCount)
                        .Select(x => new Interval(baseInterval + 5 * x)
                        .Except(list);
Jon Skeet
Isn't SelectConsecutive much like scan from Haskell? If so can't it implemented with Zip and Skip: `xs.Zip(xs.Skip(1), f)`?
Martinho Fernandes
In your second code snippet, I'm having a hard time figuring out what "base" is supposed to represent (it's also a reserved keyword). Can you please clarify a litte?
Chris McCall
I guess `base` is the first element in the list, aka `var @base = list.First()`.
Martinho Fernandes
@Martinho: Yes, the difference being that SelectConsecutive will only scan the list once. I don't like going through it twice which Zip/Skip does.
Jon Skeet
@Chris: Yes, base was the "base" interval from which everything else is built. Renamed it to avoid the keyword clash :)
Jon Skeet
This has to be the first time I don't like your answer, maybe it's my feeble mind but I think it can be done more readably/simply. It's like watching your favorite superhero get beaten by a drunken mugger, a sad day..
Jimmy Hoffa
@Jimmy: I think the second approach is reasonably simple, isn't it? I must admit I don't quite understand your answer :(
Jon Skeet
@Jon Skeet: Aye, my answer may not be the easiest to understand either, it's just what came to mind immediately after realizing Henk already mentioned doing a non-linq foreach which I would think the simplest/most readable way of identifying the missing interval elements. As for the second part of your answer, I really don't understand either parts of your answer
Jimmy Hoffa
@Jimmy: I believe the second part of my answer is broadly like yours - generate the complete list, then remove existing ones. The first part is meant to spot missing elements just by comparing consecutive elements.
Jon Skeet
+1  A: 
var newList = 
     Enumerable.Range(0, 6)
               .Select(r=> new Interval() {TIME_KEY = ((r*5)+600) })
               .Except(list )
James Curran
+1  A: 

This would work if the interval is known, if you have access to the Zip method (comes with .NET 4):

list.Zip(list.Skip(1), (x,y) => new { x, delta = y - x })
    .SelectMany(a => Enumerable.Range(1, a.delta/interval - 1)
                               .Select(i => a.x + i*interval));

Note that this iterates the list twice so in case the source is a lazy enumerable, you need to buffer it first. That construction with Zip and Skip is a quick and dirty way of projecting consecutive elements together. Reactive Extensions' System.Interactive library has a Scan method for that and Jon showed a possible implementation in another answer. Neither of those iterates the list twice, so they would be a much better choice.

If the interval is to be determined you can get the minimum delta:

var deltas = list.Zip(list.Skip(1), (x,y) => y - x );
var interval = deltas.Min();
list.Zip(deltas, (x, delta) => new { x, delta })
    .SelectMany(a => Enumerable.Range(1, a.delta/interval - 1)
                               .Select(i => a.x + i*interval));

There are some assumptions I made though:

  • all differences between the elements are multiples of the interval;
  • the input is sorted.

How it works:

  1. First it builds a sequence of pairs with each element but the last and the interval to the element that follows it;
  2. Then for each of those pairs, it produces all the missing values within the delta: within each delta there are exactly a.delta/interval - 1 values, and each of these is a certain number of intervals away from the element store in the pair, hence a.x + i*interval.
  3. SelectMany takes care of flattening all those sequences of missing values together into one.
Martinho Fernandes
Your solution is good, but it will only work if the list is sorted (as in the example).
Gabe
@Gabe: nice point. Will mention that.
Martinho Fernandes
A: 

Try this:

private static IEnumerable<Interval> CalculateMissingIntervals(IEnumerable<Interval> list, int step) {
  return list.Zip(list.Skip(1), 
    (i1, i2) => IntervalRange(i1.TIME_KEY + step, i2.TIME_KEY, step)).
    SelectMany(x => x);
}
private static IEnumerable<Interval> IntervalRange(int start, int end, int step) {
  for (var i = start; i < end; i += step) {
    yield return new Interval { TIME_KEY = i };
  }
}

The initial list is assumed to be sorted.

Jordão
A: 
IEnumerable<Interval> missingIntervals =
    Enumerable.Range(list.Min(listMember => listMember.TIME_KEY), list.Max(listMember => listMember.TIME_KEY))
              .Where(enumMember => enumMember % intervalDistance == 0)
              .Select(enumMember => new Interval { TIME_KEY = enumMember}
              .Except(list);
Jimmy Hoffa