views:

862

answers:

7

If I have an IEnumerable like:

string[] items = new string[] { "a", "b", "c", "d" };

I would like to loop thru all the pairs of consecutive items (sliding window of size 2). Which would be

("a","b"), ("b", "c"), ("c", "d")

My solution was is this

    public static IEnumerable<Pair<T, T>> Pairs(IEnumerable<T> enumerable) {
        IEnumerator<T> e = enumerable.GetEnumerator(); e.MoveNext();
        T current = e.Current;
        while ( e.MoveNext() ) {
            T next = e.Current;
            yield return new Pair<T, T>(current, next);
            current = next;
        }
    }

 // used like this :
 foreach (Pair<String,String> pair in IterTools<String>.Pairs(items)) {
    System.Out.PrintLine("{0}, {1}", pair.First, pair.Second)
 }

When I wrote this code, I wondered if there are already functions in the .NET framework that do the same thing and do it not just for pairs but for any size tuples. IMHO there should be a nice way to do this kind of sliding window operations.

I use C# 2.0 and I can imagine that with C# 3.0 (w/ LINQ) there are more (and nicer) ways to do this, but I'm primarily interested in C# 2.0 solutions. Though, I will also appreciate C# 3.0 solutions.

+1  A: 

C# 3.0 solution (sorry:)

public static IEnumerable<IEnumerable<T>> Tuples<T>(this IEnumerable<T> sequence, int nTuple)
{
    if(nTuple <= 0) throw new ArgumentOutOfRangeException("nTuple");

    for(int i = 0; i <= sequence.Count() - nTuple; i++)
        yield return sequence.Skip(i).Take(nTuple);
}

This isn't the most performant in the world, but it's sure pleasant to look at.

Really, the only thing making this a C# 3.0 solution is the .Skip.Take construct, so if you just change that to adding the elements in that range to a list instead, it should be golden for 2.0. That said, it's still not performant.

mquander
Not the most performant? This is an O(n*n) implementation! For a small 10 item list, the entire list was traversed 20 times
configurator
That's true, but it's also two lines of (real) code and is an obviously simple implementation. It's up to the OP to decide whether he needs a fast solution -- maybe he only needs this operation on lists of a couple dozen items.
mquander
+3  A: 

Expanding on the previous answer to avoid of O(n2) approach by explicitly using the passed iterator:

public static IEnumerable<IEnumerable<T>> Tuples<T>(this IEnumerable<T> input, int groupCount) {
  if (null == input) throw new ArgumentException("input");
  if (groupCount < 1) throw new ArgumentException("groupCount");

  var e = input.GetEnumerator();

  bool done = false;
  while (!done) {
    var l = new List<T>();
    for (var n = 0; n < groupCount; ++n) {
      if (!e.MoveNext()) {
        if (n != 0) {
          yield return l;
        }
        yield break;
      }
      l.Add(e.Current);
    }
    yield return l;
  }
}

For C# 2, where iterators are allowed, drop the "this" from the input parameter and call as a static method.

Richard
A: 

Alternate Pairs implementation, using last pair to store previous value:

static IEnumerable<Pair<T, T>> Pairs( IEnumerable<T> collection ) {
  Pair<T, T> pair = null;
  foreach( T item in collection ) {
    if( pair == null )
      pair = Pair.Create( default( T ), item );
    else
      yield return pair = Pair.Create( pair.Second, item );
  }
}

Simple Window implementation (only safe for private use, if caller does not save returned arrays; see note):

static IEnumerable<T[]> Window( IEnumerable<T> collection, int windowSize ) {
  if( windowSize < 1 )
    yield break;

  int index = 0;
  T[] window = new T[windowSize];
  foreach( var item in collection ) {
    bool initializing = index < windowSize;

    // Shift initialized window to accomodate new item.
    if( !initializing )
      Array.Copy( window, 1, window, 0, windowSize - 1 );

    // Add current item to window.
    int itemIndex = initializing ? index : windowSize - 1;
    window[itemIndex] = item;

    index++;
    bool initialized = index >= windowSize;
    if( initialized )
      //NOTE: For public API, should return array copy to prevent 
      // modifcation by user, or use a different type for the window.
      yield return window;
  }
}

Example use:

for( int i = 0; i <= items.Length; ++i ) {
  Console.WriteLine( "Window size {0}:", i );
  foreach( string[] window in IterTools<string>.Window( items, i ) )
    Console.WriteLine( string.Join( ", ", window ) );
  Console.WriteLine( );
}
Emperor XLII
+7  A: 

Rather than require a tuple (pair) type, why not just accept a selector:

public static IEnumerable<TResult> Pairwise<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TSource, TResult> resultSelector)
{
    TSource previous = default(TSource);

    using (var it = source.GetEnumerator())
    {
        if (it.MoveNext())
            previous = it.Current;

        while (it.MoveNext())
            yield return resultSelector(previous, previous = it.Current);
    }
}

Which allows you to skip the intermediate object if you want:

string[] items = new string[] { "a", "b", "c", "d" };
var pairs = items.Pairwise((x, y) => string.Format("{0},{1}", x, y));

foreach(var pair in pairs)
    Console.WriteLine(pair);

Or you can use an anonymous type:

var pairs = items.Pairwise((x, y) => new { First = x, Second = y });
dahlbyk
A: 

The F# Seq module defines the pairwise function over IEnumerable<T>, but this function is not in the .NET framework.

If it were already in the .NET framework, instead of returning pairs it would probably accept a selector function due to the lack of support for tuples in languages like C# and VB.

var pairs = ns.Pairwise( (a, b) => new { First = a, Second = b };

I don't think any of the answers here really improve on your simple iterator implementation, which seemed the most natural to me (and the poster dahlbyk by the looks of things!) too.

Pete Montgomery
A: 

Something like this:

public static IEnumerable<TResult> Pairwise<T, TResult>(this IEnumerable<T> enumerable, Func<T, T, TResult> selector)
{
    var previous = enumerable.First();
    foreach (var item in enumerable.Skip(1))
    {
        yield return selector(previous, item);
        previous = item;
    }
}
Quiz
+1  A: 

In .NET 4 this becomes even easier:-

        var input = new[] { "a", "b", "c", "d", "e", "f" };
        var result = input.Zip(input.Skip(1), (a, b) => Tuple.Create(a, b));
Hightechrider