Is there a nice way to split a collection into 'n' parts with LINQ ? Not necessarily even of course
EDIT: Okay, it looks like I misread the question. I read it as "pieces of length n" rather than "n pieces". Doh! Considering deleting answer...
(Original answer)
I don't believe there's a built-in way of partitioning, although I intend to write one in my set of additions to LINQ to Objects. Marc Gravell has an implementation here although I would probably modify it to return a read-only view:
public static IEnumerable<IEnumerable<T>> Partition<T>
(this IEnumerable<T> source, int size)
{
T[] array = null;
int count = 0;
foreach (T item in source)
{
if (array == null)
{
array = new T[size];
}
array[count] = item;
count++;
if (count == size)
{
yield return new ReadOnlyCollection<T>(array);
array = null;
count = 0;
}
}
if (array != null)
{
Array.Resize(ref array, count);
yield return new ReadOnlyCollection<T>(array);
}
}
Splitting a collection into n pieces is easy enough; there isn't an in-built LINQ extension method, but you can add your own. The only trick is - you need to know the length up front. So perhaps (by default) consider an extension method on something such as ICollection<T>
(which has the Count
). Then it is something like my previous answer:
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
static class Program { // formatted for vertical space
static void Main() {
var items = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int grp = 0;
foreach (var group in items.Split(3)) {
Console.WriteLine("Group " + grp++);
foreach (var item in group) {
Console.WriteLine(item);
}
}
}
// fallback option
public static IEnumerable<IEnumerable<T>> Split<T>(
this IEnumerable<T> items, int count) {
return Split<T>(new List<T>(items), count);
}
// perferred option
public static IEnumerable<IEnumerable<T>> Split<T>(
this ICollection<T> items, int count) {
if (count <= 0) throw new ArgumentOutOfRangeException("count");
int size = items.Count / count;
if ((items.Count % count) != 0) size++;
T[] array = new T[size];
int index = 0;
foreach (T item in items) {
array[index++] = item;
if (index == size) {
yield return new ReadOnlyCollection<T>(array);
index = 0;
}
}
if (array != null) {
Array.Resize(ref array, index);
yield return new ReadOnlyCollection<T>(array);
}
}
}
OK, so this is purely using LINQ, haven't fully tested it but it looks ok and it compiles. Similar to Marc Gravell's but implemented differently.
public static IEnumerable<IEnumerable<T>> Partition<T>
(this IEnumerable<T> query, int partitions)
{
// make a single query to get all records
var all = query.ToList();
// work out how many to take each time
var take = items.Count / partitions;
if ((items.Count % take) != 0) take++;
for(var skip = 0; skip < all.Count; skip += take)
{
yield return all.Skip(skip).Take(take);
}
}
Example usage:
var query = from entity in Context.Entities
select entity;
return query.Partition(10);
If order in these parts is not very important you can try this:
int[] array = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int n = 3;
var result =
array.Select((value, index) => new { Value = value, Index = index }).GroupBy(i => i.Index % n, i => i.Value);
// or
var result2 =
from i in array.Select((value, index) => new { Value = value, Index = index })
group i.Value by i.Index % n into g
select g;
However these can't be cast to IEnumerable<IEnumerable<int>> by some reason...
A pure linq and the simplest solution is as under.
static class LinqExtensions
{
public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> list, int parts)
{
int i = 0;
var splits = from name in list
group name by i++ % parts into part
select part.AsEnumerable();
return splits;
}
}
int[] items = new int[] { 0,1,2,3,4,5,6,7,8,9, 10 };
int itemIndex = 0;
int groupSize = 2;
int nextGroup = groupSize;
var seqItems = from aItem in items
group aItem by
(itemIndex++ < nextGroup)
?
nextGroup / groupSize
:
(nextGroup += groupSize) / groupSize
into itemGroup
select itemGroup.AsEnumerable();
This is my code, nice and short.
<Extension()> Public Function Chunk(Of T)(ByVal this As IList(Of T), ByVal size As Integer) As List(Of List(Of T))
Dim result As New List(Of List(Of T))
For i = 0 To CInt(Math.Ceiling(this.Count / size)) - 1
result.Add(New List(Of T)(this.GetRange(i * size, Math.Min(size, this.Count - (i * size)))))
Next
Return result
End Function
hi,
i use this:
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> instance, int partitionSize)
{
return instance
.Select((value, index) => new { Index = index, Value = value })
.GroupBy(i => i.Index / partitionSize)
.Select(i => i.Select(i2 => i2.Value))
;
}
greets! Elmer
I have been using the Partition function I posted earlier quite often. The only bad thing about it was that is wasn't completely streaming. This is not a problem if you work with few elements in your sequence. I needed a new solution when i started working with 100.000+ elements in my sequence.
The following solution is a lot more complex (and more code!), but it is very efficient.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Collections;
namespace LuvDaSun.Linq
{
public static class EnumerableExtensions
{
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerable, int partitionSize)
{
/*
return enumerable
.Select((item, index) => new { Item = item, Index = index, })
.GroupBy(item => item.Index / partitionSize)
.Select(group => group.Select(item => item.Item) )
;
*/
return new PartitioningEnumerable<T>(enumerable, partitionSize);
}
}
class PartitioningEnumerable<T> : IEnumerable<IEnumerable<T>>
{
IEnumerable<T> _enumerable;
int _partitionSize;
public PartitioningEnumerable(IEnumerable<T> enumerable, int partitionSize)
{
_enumerable = enumerable;
_partitionSize = partitionSize;
}
public IEnumerator<IEnumerable<T>> GetEnumerator()
{
return new PartitioningEnumerator<T>(_enumerable.GetEnumerator(), _partitionSize);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class PartitioningEnumerator<T> : IEnumerator<IEnumerable<T>>
{
IEnumerator<T> _enumerator;
int _partitionSize;
public PartitioningEnumerator(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public void Dispose()
{
_enumerator.Dispose();
}
IEnumerable<T> _current;
public IEnumerable<T> Current
{
get { return _current; }
}
object IEnumerator.Current
{
get { return _current; }
}
public void Reset()
{
_current = null;
_enumerator.Reset();
}
public bool MoveNext()
{
bool result;
if (_enumerator.MoveNext())
{
_current = new PartitionEnumerable<T>(_enumerator, _partitionSize);
result = true;
}
else
{
_current = null;
result = false;
}
return result;
}
}
class PartitionEnumerable<T> : IEnumerable<T>
{
IEnumerator<T> _enumerator;
int _partitionSize;
public PartitionEnumerable(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public IEnumerator<T> GetEnumerator()
{
return new PartitionEnumerator<T>(_enumerator, _partitionSize);
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
class PartitionEnumerator<T> : IEnumerator<T>
{
IEnumerator<T> _enumerator;
int _partitionSize;
int _count;
public PartitionEnumerator(IEnumerator<T> enumerator, int partitionSize)
{
_enumerator = enumerator;
_partitionSize = partitionSize;
}
public void Dispose()
{
}
public T Current
{
get { return _enumerator.Current; }
}
object IEnumerator.Current
{
get { return _enumerator.Current; }
}
public void Reset()
{
if (_count > 0) throw new InvalidOperationException();
}
public bool MoveNext()
{
bool result;
if (_count < _partitionSize)
{
if (_count > 0)
{
result = _enumerator.MoveNext();
}
else
{
result = true;
}
_count++;
}
else
{
result = false;
}
return result;
}
}
}
Enjoy!
Ok, I'll throw my hat in the ring. The advantages of my algorithm:
- No expensive division or modulus operators
- All operations are O(1)
- Works for IEnumerable<> source (no Count property needed)
- Simple
The code:
public static IEnumerable<IEnumerable<T>>
Section<T>(this IEnumerable<T> source, int length)
{
if (length <= 0)
throw new ArgumentOutOfRangeException("length");
var section = new LinkedList<T>();
foreach (var item in source)
{
section.AddLast(item);
if (section.Count == length)
{
yield return section.AsReadOnly();
section = new LinkedList<T>();
}
}
if (section.Count > 0)
yield return section.AsReadOnly();
}
static IEnumerable<T> AsReadOnly<T>(this IEnumerable<T> source)
{
foreach (var item in source)
yield return item;
}