views:

139

answers:

5

I am trying to split a collection into multiple collections while maintaining a sort I have on the collection. I have tried using the following extension method, but it breaks them incorrectly. Basically, if I was to look at the items in the collection, the order should be the same when compared to the broken up collections joined. Here is the code I am using that doesn't work:

public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
        {
            int i = 0;
            var splits = from name in list
                         group name by i++ % parts into part
                         select part.AsEnumerable();
            return splits;
        }
  • int parts = number of sub enumerables
A: 

As I understand you want to break enumerable on several parts with equal size and without breaking the order of your elements. It looks like the only choice is to get the length of your input enumerable first, so you would need at least two iterations through the enumerable.

Andrew Bezzub
@Kirk Woll: it depends on what is needed. If "parts" is the number of elements in one enumerable than it is easy to solve. But code above makes me think that "parts" is number of desired sub-enumerables, not elements in the enumerables.
Andrew Bezzub
"parts" is the number of sub-enumerables
Brian David Berman
A: 
    double partLength = list.Count() / (double)parts;

    int i = 0;
    var splits = from name in list
                 group name by Math.Floor((double)(i++ / partLength)) into part
                 select part;
kevev22
In your example, "splits" becomes an IEnumerable<IGrouping<double, T>> but it needs to be an IEnumerable<IEnumerable<T>>
Brian David Berman
Correct me if I am wrong, but isn't IGrouping<double, T> an IEnumerable<T>?
kevev22
A: 

Jon Skeet's MoreLINQ library might do the trick for you:

http://code.google.com/p/morelinq/source/browse/trunk/MoreLinq/Batch.cs

var items = list.Batch(parts);  // gives you IEnumerable<IEnumerable<T>>
var items = list.Batch(parts, seq => seq.ToList()); // gives you IEnumerable<List<T>>
// etc...

Another example:

public class Program
{
    static void Main(string[] args)
    {
        var list = new List<int>();
        for (int i = 1; i < 10000; i++)
        {
            list.Add(i);
        }

        var batched = list.Batch(681);

        // will print 15. The 15th element has 465 items...
        Console.WriteLine(batched.Count().ToString());  
        Console.WriteLine(batched.ElementAt(14).Count().ToString());
        Console.WriteLine();
        Console.WriteLine("Press enter to exit.");
        Console.ReadLine();
    }
}

When I scanned the contents of the batches, the ordering was preserved.

code4life
In these examples, it seems to want "parts" to be the amount of items in each collection. In MY example, I want "parts" to be the number of sub collections to create.
Brian David Berman
+3  A: 

I had to make use of this to compare a list of objects to one another in groups of 4... it will keep the objects in the order that the original possessed. Could be expanded to do something other than 'List'

/// <summary>
/// Partition a list of elements into a smaller group of elements
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="list"></param>
/// <param name="totalPartitions"></param>
/// <returns></returns>
public static List<T>[] Partition<T>(List<T> list, int totalPartitions)
{
    if (list == null)
        throw new ArgumentNullException("list");

    if (totalPartitions < 1)
        throw new ArgumentOutOfRangeException("totalPartitions");

    List<T>[] partitions = new List<T>[totalPartitions];

    int maxSize = (int)Math.Ceiling(list.Count / (double)totalPartitions);
    int k = 0;

    for (int i = 0; i < partitions.Length; i++)
    {
        partitions[i] = new List<T>();
        for (int j = k; j < k + maxSize; j++)
        {
            if (j >= list.Count)
                break;
            partitions[i].Add(list[j]);
        }
        k += maxSize;
    }

    return partitions;
}
SomeMiscGuy
Thanks. Worked great.
Brian David Berman
A: 
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
    {
        int nGroups = (int)Math.Ceiling(list.Count() / (double)parts);

        var groups = Enumerable.Range(0, nGroups);

        return groups.Select(g => list.Skip(g * parts).Take(parts));
    }
Grozz