ansaurus

Question

Answer 1

+2 A:

If you're really using arrays, the best way is probably just to use the conventional for loop with indices. Not as nice, granted, but as far as I know .NET doesn't offer a better way of doing this.

You could also encapsulate your code into a method called zip – this is a common higher-order list function. However, C# lacking a suitable Tuple type, this is quite crufty. You'd end up returning an IEnumerable<KeyValuePair<T1, T2>> which isn't very nice.
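As a rough sketch of the zip idea described above, assuming a plain static helper (the class name ZipSketch is made up for illustration) that stops as soon as either sequence runs out:

```csharp
using System;
using System.Collections.Generic;

public static class ZipSketch
{
    // Hypothetical helper: pairs up elements of two sequences and
    // stops when either sequence is exhausted.
    public static IEnumerable<KeyValuePair<T1, T2>> Zip<T1, T2>(
        IEnumerable<T1> first, IEnumerable<T2> second)
    {
        using (IEnumerator<T1> e1 = first.GetEnumerator())
        using (IEnumerator<T2> e2 = second.GetEnumerator())
        {
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return new KeyValuePair<T1, T2>(e1.Current, e2.Current);
            }
        }
    }
}
```

Callers then read pair.Key and pair.Value, which is exactly the awkward naming the paragraph above complains about.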

By the way, are you really using IEnumerable instead of IEnumerable<T> or why do you cast the Current value?

Konrad Rudolph 2009-01-30 18:55:39

Answer 2

+11 A:

If there are the same number of column names as there are elements in each row, could you not use a for loop?

var currentValues = currentRow.Split(separatorChar);

for(var i=0;i<columnList.Length;i++){
   // use i to index both (or all) arrays and build your map
}
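
A fuller version of that loop might look like the sketch below. It assumes columnList is a string[] and that the goal is a Dictionary<string, string>; the RowMapper and BuildMap names are invented for illustration. It also checks that both arrays have the same length before indexing them.

```csharp
using System;
using System.Collections.Generic;

public static class RowMapper
{
    // Hypothetical helper; parameter names mirror the snippet above.
    public static Dictionary<string, string> BuildMap(
        string[] columnList, string currentRow, char separatorChar)
    {
        string[] currentValues = currentRow.Split(separatorChar);

        // Fail early rather than index past the end of the shorter array.
        if (currentValues.Length != columnList.Length)
        {
            throw new ArgumentException(
                "Row has " + currentValues.Length + " values but there are "
                + columnList.Length + " columns");
        }

        var valueMap = new Dictionary<string, string>();
        for (int i = 0; i < columnList.Length; i++)
        {
            // Add throws if a column name is repeated.
            valueMap.Add(columnList[i], currentValues[i]);
        }
        return valueMap;
    }
}
```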

inkedmn 2009-01-30 18:56:00

I would recommend checking that the two arrays' lengths are the same before entering this loop.

James McMahon 2009-01-30 19:12:50

I second nemo's comment.

epochwolf 2009-01-30 19:21:15

While I didn't include that check in the actual code sample, I did say that at the beginning of my response ;) Point taken, though!

inkedmn 2009-01-30 19:24:18

Answer 3

+2 A:

Instead of creating two separate arrays you could make a two-dimensional array, or a dictionary (which would be better). But really, if it works I wouldn't try to change it.

Jared 2009-01-30 18:56:24

Answer 4

A:

You could create a templated enumerator a la Pairenumerable. Personally I think that's overkill.

Brian 2009-01-30 19:00:33

Answer 5

+12 A:

You've got a non-obvious pseudo-bug in your initial code - IEnumerator<T> extends IDisposable so you should dispose it. This can be very important with iterator blocks! Not a problem for arrays, but would be with other IEnumerable<T> implementations.
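
To illustrate why disposal matters for iterator blocks, here is a small sketch (all names are invented for the demonstration). The finally clause of an iterator that has been started but not run to completion only executes when the enumerator is disposed:

```csharp
using System;
using System.Collections.Generic;

public static class DisposeDemo
{
    public static bool CleanedUp;

    // Iterator block whose cleanup lives in a finally clause.
    public static IEnumerable<int> Numbers()
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            CleanedUp = true;
        }
    }

    // Reads one element and abandons the enumerator without disposing it.
    public static bool ReadOneWithoutDispose()
    {
        CleanedUp = false;
        IEnumerator<int> iter = Numbers().GetEnumerator();
        iter.MoveNext();   // iterator is now suspended inside the try block
        return CleanedUp;  // still false: the finally clause has not run
    }

    // Reads one element inside a using statement.
    public static bool ReadOneWithDispose()
    {
        CleanedUp = false;
        using (IEnumerator<int> iter = Numbers().GetEnumerator())
        {
            iter.MoveNext();
        }
        return CleanedUp;  // true: Dispose ran the finally clause
    }
}
```

In real code the finally clause would typically close a file or a connection, which is why skipping the dispose can leak resources.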

I'd do it like this:

public static IEnumerable<TResult> PairUp<TFirst,TSecond,TResult>
    (this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence,
     Func<TFirst,TSecond,TResult> projection)
{
    using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
    {
        foreach (TFirst first in source)
        {
            if (!secondIter.MoveNext())
            {
                throw new ArgumentException
                    ("First sequence longer than second");
            }
            yield return projection(first, secondIter.Current);
        }
        if (secondIter.MoveNext())
        {
            throw new ArgumentException
                ("Second sequence longer than first");
        }
    }        
}

Then you can reuse this whenever you have the need:

foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar),
             (column, value) => new { column, value }))
{
    // Do something
}

Alternatively you could create a generic Pair type, and get rid of the projection parameter in the PairUp method.

EDIT:

With the Pair type, the calling code would look like this:

foreach (var pair in columnList.PairUp(currentRow.Split(separatorChar)))
{
    // column = pair.First, value = pair.Second
}
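
The Pair type and the matching PairUp overload are not shown in this answer. One possible shape, offered purely as a sketch (the member names First and Second come from the comment above; everything else is an assumption), is:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable pair type.
public struct Pair<TFirst, TSecond>
{
    private readonly TFirst first;
    private readonly TSecond second;

    public Pair(TFirst first, TSecond second)
    {
        this.first = first;
        this.second = second;
    }

    public TFirst First { get { return first; } }
    public TSecond Second { get { return second; } }
}

public static class PairUpExtensions
{
    // Same contract as the projection-based version: both sequences
    // must have the same length, otherwise enumeration throws.
    public static IEnumerable<Pair<TFirst, TSecond>> PairUp<TFirst, TSecond>(
        this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence)
    {
        using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
        {
            foreach (TFirst first in source)
            {
                if (!secondIter.MoveNext())
                {
                    throw new ArgumentException("First sequence longer than second");
                }
                yield return new Pair<TFirst, TSecond>(first, secondIter.Current);
            }
            if (secondIter.MoveNext())
            {
                throw new ArgumentException("Second sequence longer than first");
            }
        }
    }
}
```

Because this is an iterator block, the length mismatch is only reported while the sequence is being enumerated, not when PairUp is called.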

That looks about as simple as you can get. Yes, you need to put the utility method somewhere, as reusable code. Hardly a problem in my view. Now for multiple arrays...

If the arrays are of different types, we have a problem. You can't express an arbitrary number of type parameters in a generic method/type declaration - you could write versions of PairUp for as many type parameters as you wanted, just like there are Action and Func delegates for up to 4 delegate parameters - but you can't make it arbitrary.

If the values will all be of the same type, however - and if you're happy to stick to arrays - it's easy. (Non-arrays are okay too, but you can't do the length checking ahead of time.) You could do this:

public static IEnumerable<T[]> Zip<T>(params T[][] sources)
{
    // (Insert error checking code here for null or empty sources parameter)

    int length = sources[0].Length;
    if (!sources.All(array => array.Length == length))
    {
        throw new ArgumentException("Arrays must all be of the same length");
    }

    for (int i=0; i < length; i++)
    {
        // Could do this bit with LINQ if you wanted
        T[] result = new T[sources.Length];
        for (int j=0; j < result.Length; j++)
        {
             result[j] = sources[j][i];
        }
        yield return result;
    }
}

Then the calling code would be:

foreach (var array in Zip(columns, row, whatevers))
{
    // column = array[0]
    // value = array[1]
    // whatever = array[2]
}

This involves a certain amount of copying, of course - you're creating an array each time. You could change that by introducing another type like this:

public struct Snapshot<T>
{
    readonly T[][] sources;
    readonly int index;

    public Snapshot(T[][] sources, int index)
    {
        this.sources = sources;
        this.index = index;
    }

    public T this[int element]
    {
        get { return sources[element][index]; }
    }
}

This would probably be regarded as overkill by most though ;)
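
For completeness, a Zip variant built on the Snapshot idea might look like the following sketch. The ZipSnapshots name and the error messages are assumptions; the struct is repeated so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;

public struct Snapshot<T>
{
    readonly T[][] sources;
    readonly int index;

    public Snapshot(T[][] sources, int index)
    {
        this.sources = sources;
        this.index = index;
    }

    // Element 0 reads from the first array, element 1 from the second, ...
    public T this[int element]
    {
        get { return sources[element][index]; }
    }
}

public static class SnapshotZip
{
    // Hypothetical variant of Zip that yields a lightweight view
    // instead of allocating a new array per row.
    public static IEnumerable<Snapshot<T>> ZipSnapshots<T>(params T[][] sources)
    {
        // As this is an iterator block, these checks run on the first MoveNext.
        if (sources == null || sources.Length == 0)
        {
            throw new ArgumentException("At least one array is required");
        }
        int length = sources[0].Length;
        foreach (T[] array in sources)
        {
            if (array.Length != length)
            {
                throw new ArgumentException("Arrays must all be of the same length");
            }
        }
        for (int i = 0; i < length; i++)
        {
            yield return new Snapshot<T>(sources, i);
        }
    }
}
```

Each Snapshot only holds a reference to the source arrays plus an index, so nothing is copied per row; the trade-off is that later changes to the arrays show through.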

I could keep coming up with all kinds of ideas, to be honest... but the basics are:

With a little bit of reusable work, you can make the calling code nicer
For arbitrary combinations of types you'll have to do each number of parameters (2, 3, 4...) separately due to the way generics work
If you're happy to use the same type for each part, you can do better

Jon Skeet 2009-01-30 19:01:29

Downvoters - reasons please!

Jon Skeet 2009-01-30 19:04:36

Wow, massive overkill and it still doesn't work with more than two arrays.

Hippiehunter 2009-01-30 19:05:53

You're right that it doesn't work with more than two arrays - but it's elegant and highly reusable for two arrays. I think it may be in .NET 4.0 in fact - it was an operator MS had intended to include in LINQ...

Jon Skeet 2009-01-30 19:07:17

(Or for any two sequences, I should say - not just arrays. And it streams the data, so it can work for potentially infinite data sources.)

Jon Skeet 2009-01-30 19:07:48

It's so elegant that it missed his rather specific use case.

Hippiehunter 2009-01-30 19:08:15

+1 cause it answers OP question perfectly. Plus, I never would have come up with that in a million years

Jared 2009-01-30 19:09:09

@Hippiehunter: In what way does it miss the original specific use? Yes, it doesn't cope with the case of more than two arrays, but other than that it's just what he needs, IMO. I'll address the "more than two arrays" in another edit in 20 minutes. (Putting kids to bed...)

Jon Skeet 2009-01-30 19:18:03

+1: very cool. It looks overkill, but the cost of writing code like this pays for itself if the method is used in several places.

Juliet 2009-01-30 19:19:25

I still think you've crafted the "Incredible Machine"

Hippiehunter 2009-01-30 19:20:16

It looks overly complicated when a single for loop would probably work.

epochwolf 2009-01-30 19:22:44

@epochwolf: If you're only going to use it once, then sure a for loop would be better. That could be said of everything in LINQ though. The point is the reusability. With the Pair type the calling code would be even simpler. Will add to my answer in a bit.

Jon Skeet 2009-01-30 19:29:52

Right, now with lots of added stuff :)

Jon Skeet 2009-01-30 19:48:08

This answer is supercool.

Ed Schwehm 2009-01-30 21:18:13

I admire this for comprehensiveness and extensibility, so +1. But I gave the answer to the simple, retrospectively obvious solution :).

JSBangs 2009-01-30 21:48:19

Answer 6

+1 A:

Using an IEnumerator for both would be nice:

var currentValues = currentRow.Split(separatorChar);
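// The casts are needed for arrays: string[].GetEnumerator() returns the non-generic IEnumerator
using (IEnumerator<string> valueEnum = ((IEnumerable<string>)currentValues).GetEnumerator(),
                           columnEnum = ((IEnumerable<string>)columnList).GetEnumerator()) {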
    while (valueEnum.MoveNext() && columnEnum.MoveNext())
        valueMap.Add(columnEnum.Current, valueEnum.Current);
}

Or create an extension method:

public static IEnumerable<TResult> Zip<T1, T2, TResult>(this IEnumerable<T1> source, IEnumerable<T2> other, Func<T1, T2, TResult> selector) {
    using (IEnumerator<T1> sourceEnum = source.GetEnumerator()) {
        using (IEnumerator<T2> otherEnum = other.GetEnumerator()) {
            while (sourceEnum.MoveNext() && otherEnum.MoveNext())
                yield return selector(sourceEnum.Current, otherEnum.Current);
        }
    }
}

Usage

var currentValues = currentRow.Split(separatorChar);
foreach (var valueColumnPair in currentValues.Zip(columnList, (a, b) => new { Value = a, Column = b })) {
    valueMap.Add(valueColumnPair.Column, valueColumnPair.Value);
}

chaowman 2009-01-30 19:32:24

Answer 7

+1 A:

In a functional language you would usually find a "zip" function, which will hopefully be part of C# 4.0. Bart de Smet provides a funny implementation of zip based on existing LINQ functions:

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
  this IEnumerable<TFirst> first, 
  IEnumerable<TSecond> second, 
  Func<TFirst, TSecond, TResult> func)
{
  return first.Select((x, i) => new { X = x, I = i })
    .Join(second.Select((x, i) => new { X = x, I = i }), 
    o => o.I, 
    i => i.I, 
    (o, i) => func(o.X, i.X));
}

Then you can do:

  int[] s1 = new [] { 1, 2, 3 };
  int[] s2 = new[] { 4, 5, 6 };
  var result = s1.Zip(s2, (i1, i2) => new {Value1 = i1, Value2 = i2});
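
For readers on .NET Framework 4.0 or later, System.Linq ships an Enumerable.Zip operator with the same three-argument shape, so no custom extension is needed. The sketch below applies it to the column/row scenario from the question; the BuiltInZipDemo and MapRow names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInZipDemo
{
    // Hypothetical helper: maps column names to the values of one row.
    public static Dictionary<string, string> MapRow(
        string[] columns, string row, char separator)
    {
        return columns
            .Zip(row.Split(separator), (column, value) => new { column, value })
            .ToDictionary(p => p.column, p => p.value);
    }
}
```

Note that the built-in Zip stops silently at the end of the shorter sequence, so a length check is still needed if a mismatch should be treated as an error.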

flq 2009-01-30 19:48:39


How to iterate over two arrays at once?
