All of the examples for SelectMany I see are flattening arrays of arrays and so on. I have a different angle on this question.

I have an array of a type, and I want to extract that type's contents into a stream. Here's my example code:

public class MyClass
{
    class Foo
    {
        public int X, Y;
    }

    static IEnumerable<int> Flatten(Foo foo)
    {
        yield return foo.X;
        yield return foo.Y;
    }

    public static void RunSnippet()
    {
        var foos = new List<Foo>()
            {
                new Foo() { X = 1, Y = 2 },
                new Foo() { X = 2, Y = 4 },
                new Foo() { X = 3, Y = 6 },
            };

        var query = foos.SelectMany(x => Flatten(x));
        foreach (var x in query)
        {
            Console.WriteLine(x);
        }
    }
}

This outputs what I'd like: 1, 2, 2, 4, 3, 6.

Can I eliminate the yields? I know that the compiler-generated plumbing behind an iterator block is nontrivial, and it probably has a significant cost. Is it possible to do it all in LINQ?

I feel like I'm very close to the answer and am just missing the magic keyword to search on. :)

UPDATE:

As mentioned in the answer below, it works to use something like this:

foos.SelectMany(x => new[] { x.X, x.Y });

However, I was hoping to find a way to do this without generating n/2 temporary arrays. I'm running this against a large selection set.

+1  A: 

You could do this:

var query = foos.SelectMany(x => new[] { x.X, x.Y });
Mark Byers
Oops - I forgot to mention that I don't want to do it this way. Trying to avoid the temporary array creation. Will update my question.
Scott Bilas
A: 

This kind of reverses IEnumerable<T> (values are pushed to a consumer instead of being pulled from an enumerator), and is more comparable to what we did with PushLINQ. It is a lot simpler than implementing an iterator block on the fly (through IL), while retaining blinding performance thanks to DynamicMethod. The use of object is in case your data is non-orthogonal and you need multiple types through the same API:

using System;
using System.Reflection;
using System.Reflection.Emit;

// the type we want to iterate efficiently without hard code
class Foo
{
    public int X, Y;
}
// what we want to do with each item of data
class DemoPusher : IPusher<int>
{
    public void Push(int value)
    {
        Console.WriteLine(value);
    }
}
// interface for the above implementation
interface IPusher<T>
{
    void Push(T value);
}
static class Program
{
    // see it working
    static void Main()
    {
        Foo foo = new Foo { X = 1, Y = 2 };
        var target = new DemoPusher();
        var pushMethod = CreatePusher<int>(typeof(Foo));
        pushMethod(foo, target);       
    }
    // here be dragons
    static Action<object, IPusher<T>> CreatePusher<T>(Type source)
    {
        DynamicMethod method = new DynamicMethod("pusher",
            typeof(void), new[] { typeof(object), typeof(IPusher<T>) }, source);
        var il = method.GetILGenerator();
        var loc = il.DeclareLocal(source);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Castclass, source);
        il.Emit(OpCodes.Stloc, loc);
        MethodInfo push = typeof(IPusher<T>).GetMethod("Push");
        foreach (var field in source.GetFields(BindingFlags.Instance
            | BindingFlags.Public | BindingFlags.NonPublic))
        {
            if (field.FieldType != typeof(T)) continue;
            il.Emit(OpCodes.Ldarg_1);
            il.Emit(OpCodes.Ldloc, loc);
            il.Emit(OpCodes.Ldfld, field);
            il.EmitCall(OpCodes.Callvirt, push, null);
        }
        il.Emit(OpCodes.Ret);
        return (Action<object, IPusher<T>>)
            method.CreateDelegate(typeof(Action<object, IPusher<T>>));
    }

}
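
For comparison, the same push idea can be sketched without any IL emission if you are willing to write the field accesses by hand. This is only a minimal sketch, not part of the answer above: PushSketch and PushAll are hypothetical names, and a plain Action<int> callback stands in for IPusher<int>.

```csharp
using System;
using System.Collections.Generic;

class Foo
{
    public int X, Y;
}

static class PushSketch
{
    // Hypothetical helper: pushes X then Y of every Foo to the callback.
    // No iterator block and no per-item temporary array is involved.
    public static void PushAll(IEnumerable<Foo> foos, Action<int> push)
    {
        foreach (var foo in foos)
        {
            push(foo.X);
            push(foo.Y);
        }
    }

    static void Main()
    {
        var foos = new List<Foo>
        {
            new Foo { X = 1, Y = 2 },
            new Foo { X = 2, Y = 4 },
            new Foo { X = 3, Y = 6 },
        };

        // Prints 1, 2, 2, 4, 3, 6 on separate lines.
        PushAll(foos, value => Console.WriteLine(value));
    }
}
```

The trade-off is that the hand-written version has to be repeated per type, which is what the DynamicMethod approach avoids.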
Marc Gravell
+2  A: 

If you're worried about the cost of the compiler trickery involved with yield and/or the cost of SelectMany, you could try to minimize the impact of both by not calling Flatten on each Foo, and instead flattening the whole sequence of foos in a single iterator:

public class MyClass
{
    class Foo
    {
        public int X, Y;
    }

    static IEnumerable<int> Flatten(IEnumerable<Foo> foos)
    {
        foreach (var foo in foos)
        {
            yield return foo.X;
            yield return foo.Y;
        }
    }

    public static void RunSnippet()
    {
        var foos = new List<Foo>()
        {
            new Foo() { X = 1, Y = 2 },
            new Foo() { X = 2, Y = 4 },
            new Foo() { X = 3, Y = 6 },
        };

        var query = Flatten(foos);

        foreach (var x in query)
        {
            Console.WriteLine(x);
        }
    }
}

I ran a small test app and the second implementation was faster. On my machine, flattening 100,000 Foos took 36 ms with the per-item SelectMany version and 13 ms with the single-iterator version. As always, YMMV.
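
A comparison like this could be reproduced with something along the following lines. This is a rough sketch and not the original test app: FlattenTiming, FlattenOne, FlattenAll and TimeMs are hypothetical names, and a single Stopwatch pass with no warm-up is assumed, so the numbers will be noisy.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Foo
{
    public int X, Y;
}

static class FlattenTiming
{
    // Per-item iterator, as in the question.
    public static IEnumerable<int> FlattenOne(Foo foo)
    {
        yield return foo.X;
        yield return foo.Y;
    }

    // Single iterator over the whole sequence, as in this answer.
    public static IEnumerable<int> FlattenAll(IEnumerable<Foo> foos)
    {
        foreach (var foo in foos)
        {
            yield return foo.X;
            yield return foo.Y;
        }
    }

    // Hypothetical helper: enumerates the sequence once and returns elapsed milliseconds.
    public static long TimeMs(IEnumerable<int> sequence)
    {
        var stopwatch = Stopwatch.StartNew();
        long sum = 0;
        foreach (var value in sequence)
        {
            sum += value; // consume every element so the query actually runs
        }
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }

    static void Main()
    {
        var foos = Enumerable.Range(0, 100000)
            .Select(i => new Foo { X = i, Y = i * 2 })
            .ToList();

        Console.WriteLine("SelectMany per item: {0} ms",
            TimeMs(foos.SelectMany(f => FlattenOne(f))));
        Console.WriteLine("Single iterator:     {0} ms",
            TimeMs(FlattenAll(foos)));
    }
}
```

For anything beyond a quick sanity check, run each variant several times in a release build and discard the first run, since JIT compilation is included in the first measurement.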

Lette
+1  A: 

Well, if you want to avoid temporary array creation and still want short, nice code using LINQ, you can use Aggregate:

var query = foos.Aggregate(
    new List<int>(), 
    (acc, x) => { acc.Add(x.X); acc.Add(x.Y); return acc; }
    );
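
Note that Aggregate runs eagerly, so this builds the whole result list up front rather than streaming values. If the source count is known, a capacity hint avoids repeated growth of the list. The following is a minimal sketch under that assumption: AggregateSketch and FlattenToList are hypothetical names, and foos is assumed to be a List<Foo> so that Count is available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Foo
{
    public int X, Y;
}

static class AggregateSketch
{
    // Hypothetical helper: same Aggregate call as above, but the accumulator
    // is pre-sized to two ints per Foo so the list never has to grow.
    public static List<int> FlattenToList(List<Foo> foos)
    {
        return foos.Aggregate(
            new List<int>(foos.Count * 2),
            (acc, x) => { acc.Add(x.X); acc.Add(x.Y); return acc; });
    }

    static void Main()
    {
        var foos = new List<Foo>
        {
            new Foo { X = 1, Y = 2 },
            new Foo { X = 2, Y = 4 },
            new Foo { X = 3, Y = 6 },
        };

        // Prints 1, 2, 2, 4, 3, 6 on separate lines.
        foreach (var value in FlattenToList(foos))
        {
            Console.WriteLine(value);
        }
    }
}
```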
Matajon
Yes! This is the kind of thing I was looking for.
Scott Bilas