Suppose I have 2 enumerations that I know have the same number of elements and each element "corresponds" with the identically placed element in the other enumeration. Is there a way to process these 2 enumerations simultaneously so that I have access to the corresponding elements of each enumeration at the same time?

Using a theoretical LINQ syntax, what I have in mind is something like:

from x in seq1, y in seq2
select new {x.foo, y.bar}
A: 

Update:

Eric Lippert recently posted on this: http://blogs.msdn.com/ericlippert/archive/2009/05/07/zip-me-up.aspx

It's especially interesting because he's posted the source for the new extension method coming in .NET 4:

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>
    (this IEnumerable<TFirst> first, 
    IEnumerable<TSecond> second, 
    Func<TFirst, TSecond, TResult> resultSelector) 
{
    if (first == null) throw new ArgumentNullException("first");
    if (second == null) throw new ArgumentNullException("second");
    if (resultSelector == null) throw new ArgumentNullException("resultSelector");
    return ZipIterator(first, second, resultSelector);
}

private static IEnumerable<TResult> ZipIterator<TFirst, TSecond, TResult>
    (IEnumerable<TFirst> first, 
    IEnumerable<TSecond> second, 
    Func<TFirst, TSecond, TResult> resultSelector) 
{
    using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
            while (e1.MoveNext() && e2.MoveNext())
                yield return resultSelector(e1.Current, e2.Current);
}
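
As a rough sketch of how the operator above could be applied to the question's scenario: this assumes .NET 4 or later, where Zip is available from System.Linq, and a C# 9+ top-level program. The arrays names and ages are made-up sample data, not anything from the question.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: element i of one array corresponds to element i of the other.
string[] names = { "Ann", "Bob", "Cid" };
int[] ages = { 31, 42, 53 };

// Zip pairs elements by position and stops at the end of the shorter sequence.
var pairs = names.Zip(ages, (name, age) => new { Name = name, Age = age }).ToList();

foreach (var p in pairs)
    Console.WriteLine(p.Name + " is " + p.Age);
```

Because the loop in ZipIterator stops as soon as either MoveNext() returns false, sequences of unequal length are silently truncated rather than reported as an error.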


Original answer:

Are you referring to a join?

from x in seq1
join y in seq2
on x.foo equals y.foo
select new {x, y}

There is also PLINQ, which executes LINQ queries in parallel (across multiple threads).


Edit:

Ah - thanks for clarifying the question, though I really don't think my answer deserved a vote down.

It sounds like what you want is something like:

from x in seq1
join y in seq2
on x.Index equals y.Index
select new {x.Foo, y.Bar}

Unfortunately you can't do that in LINQ query syntax. LINQ extends IEnumerable<T>, whose enumerator only really exposes Current and MoveNext(), so there is no index property to join on.

Obviously you can do this easily in C# with a nested for-loop and an if block, but you can't in LINQ query syntax, I'm afraid.

The only way to mimic this in LINQ query syntax is to add the index artificially:

int counter = 0;
var indexed1 = (
    from x in seq1
    select new { item = x, index = counter++ } ).ToList();
// Note: .ToList() forces execution now; this won't work if left lazy,
// because both queries share (and reset) the same counter.

counter = 0;
var indexed2 = (
    from x in seq2
    select new { item = x, index = counter++ } ).ToList();

var result = 
    from x in indexed1 
    join y in indexed2
    on x.index equals y.index
    select new { x.item.Foo, y.item.Bar };
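
A possible variation on the same idea, sketched under some assumptions: the method-syntax Select overload that takes (element, index) supplies the position directly, so no shared counter or ToList() is needed. The sample arrays stand in for seq1 and seq2 and are invented for illustration, and the code is written as a C# 9+ top-level program.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for seq1 and seq2.
string[] seq1 = { "a", "b", "c" };
int[] seq2 = { 1, 2, 3 };

// The two-argument Select overload passes each element's zero-based position.
var indexed1 = seq1.Select((item, index) => new { item, index });
var indexed2 = seq2.Select((item, index) => new { item, index });

// Join on the position instead of on a field of the elements.
var result =
    (from x in indexed1
     join y in indexed2 on x.index equals y.index
     select new { Foo = x.item, Bar = y.item }).ToList();
```

The join still buffers the second sequence in a lookup before producing results, so this is more roundabout than walking both sequences in step.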
Keith
Close, but what if there's no matching field?
Joel Coehoorn
I think the question is about two sequences where there's no identifier to join them on; rather, their position in the sequences joins them. I.e., the first element of each go together, etc. I actually don't know of a way to do this, short of a custom extension method taking an Action<T1,T2> and enumerating through both sequences.
Jonathan
no... x.foo and y.foo are NOT equal. They just correspond to each other... I will change my question to use "foo" and "bar" to make it clearer
JoelFan
+3  A: 

The function you are looking for is called "Zip". It works like a zipper. It'll be in .NET 4.0 iirc. In the meantime you may want to look at the BclExtras library. (Man, I'm a real advocate for this lib, lol).

IEnumerable<Tuple<TSeq1, TSeq2>> tuples = from t in seq1.Zip(seq2)
                                          select t;

If you just want to get it done, you'll have to get both sequences' enumerators and run them "in parallel" using a traditional loop.
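
A minimal sketch of that traditional loop, assuming nothing beyond the standard IEnumerator<T> interface. The sample sequences and the seen list are invented for illustration, and the code is written as a C# 9+ top-level program.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data standing in for seq1 and seq2.
IEnumerable<string> seq1 = new[] { "a", "b", "c" };
IEnumerable<int> seq2 = new[] { 1, 2, 3 };
var seen = new List<string>();

using (IEnumerator<string> e1 = seq1.GetEnumerator())
using (IEnumerator<int> e2 = seq2.GetEnumerator())
{
    // Advance both enumerators together; stop when either one runs out.
    while (e1.MoveNext() && e2.MoveNext())
    {
        seen.Add(e1.Current + e2.Current);
    }
}

Console.WriteLine(string.Join(", ", seen));
```

The using blocks make sure both enumerators are disposed even if the loop body throws.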

Jabe
This seems to be the answer to entirely too many situations I'm hitting right now -- "wait for .Net 4.0"....
Jonathan
A: 

There's a "Zip" method being added in 4.0 that addresses this (like a zipper, zipping up adjacent elements). Until then, the most readable (albeit not the most performant) way would probably be something like this, unless lazy evaluation is really crucial:

var indexedA = seqA.ToArray();
var indexedB = seqB.ToArray();

for(int i = 0; i < indexedA.Length && i < indexedB.Length; i++)
{
    var thisA = indexedA[i];
    var thisB = indexedB[i];
    // whatever
}
mquander
If you want to convert it to something more performant, you could call GetEnumerator on both sequences at the beginning and loop through with MoveNext and Current.
JoelFan
That's quite true.
mquander
+3  A: 

Since Neil Williams deleted his answer, I'll go ahead and post a link to an implementation by Jon Skeet.

To paraphrase the relevant portion:

public static IEnumerable<KeyValuePair<TFirst,TSecond>> Zip<TFirst,TSecond>
    (this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence)
{
    using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
    {
        foreach (TFirst first in source)
        {
            if (!secondIter.MoveNext())
            {
                throw new ArgumentException
                    ("First sequence longer than second");
            }
            yield return new KeyValuePair<TFirst, TSecond>(first, secondIter.Current);
        }
        if (secondIter.MoveNext())
        {
            throw new ArgumentException
                ("Second sequence longer than first");
        }
    }        
}

Note that the KeyValuePair<> is my addition, and that I'm normally not a fan of using it this way. Instead, I would define a generic Pair or Tuple type. However, they are not included in the current version of the framework and I didn't want to clutter this sample with extra class definitions.
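
For comparison, a sketch of the same length-checking approach returning tuples instead. It assumes .NET 4 or later, where System.Tuple is part of the framework, and a C# 9+ top-level program; ZipStrict is a hypothetical name chosen here, not something from the linked implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// ZipStrict is a hypothetical helper: it pairs elements by position and
// throws if the two sequences turn out to have different lengths.
IEnumerable<Tuple<TFirst, TSecond>> ZipStrict<TFirst, TSecond>(
    IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence)
{
    using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
    {
        foreach (TFirst first in source)
        {
            if (!secondIter.MoveNext())
                throw new ArgumentException("First sequence longer than second");
            yield return Tuple.Create(first, secondIter.Current);
        }
        if (secondIter.MoveNext())
            throw new ArgumentException("Second sequence longer than first");
    }
}

var pairs = ZipStrict(new[] { "a", "b" }, new[] { 1, 2 }).ToList();
Console.WriteLine(pairs[0].Item1 + " -> " + pairs[0].Item2);
```

Since this is an iterator, the length check only fires while the result is being enumerated, not at the point where ZipStrict is called.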

Joel Coehoorn
Community Wiki because Neil Williams really deserved credit for posting the link, even if he did delete his answer.
Joel Coehoorn