Hi,

I have a list of numbers that contains duplicates:

Enumerable.Range(1,3).Select(o => Enumerable.Repeat(o, 3)).SelectMany(o => o)
// {1,1,1,2,2,2,3,3,3}

I group them and get the number of occurrences of each:

Enumerable.Range(1,3).Select(o => Enumerable.Repeat(o, 3)).SelectMany(o => o)
    .GroupBy(o => o).Select(o => new { Qty = o.Count(), Num = o.Key })

Qty   Num
3     1
3     2
3     3

What I really need is to limit the quantity per group to some number. If the limit is 2, the result for the above grouping would be:

Qty   Num
2     1
1     1
2     2
1     2
2     3
1     3

So, if Qty = 10 and the limit is 4, the result is 3 rows (4, 4, 2). In the real data the Qty of each number is not equal, as it is in the example. The Qty limit is the same for the whole list (it doesn't differ by number).

Thanks

+1  A: 

A similar question came up recently asking how to do this in SQL, and there's no really elegant solution. Unless this is LINQ to SQL or Entity Framework (i.e. being translated into a SQL query), I'd suggest that you not try to solve this problem with LINQ and write an iterative solution instead; it's going to be more efficient and easier to maintain.

That said, if you absolutely must use a set-based ("Linq") method, this is one way you could do it:

var grouped =
    from n in nums
    group n by n into g
    select new { Num = g.Key, Qty = g.Count() };

int maxPerGroup = 2;
var portioned =
    from x in grouped
    // Range over this group's own Qty; ranging to grouped.Max(g => g.Qty) emits extra rows when quantities differ.
    from i in Enumerable.Range(1, x.Qty)
    where (x.Qty % maxPerGroup) == (i % maxPerGroup)
    let tempQty = (x.Qty / maxPerGroup) == (i / maxPerGroup) ? 
        (x.Qty % maxPerGroup) : maxPerGroup
    select new
    {
        Num = x.Num,
        Qty = (tempQty > 0) ? tempQty : maxPerGroup
    };

Compare with the simpler and faster iterative version:

// Body of an iterator method; MyGroup is any simple type holding (Num, Qty).
foreach (var g in grouped)
{
    int remaining = g.Qty;
    while (remaining > 0)
    {
        int allotted = Math.Min(remaining, maxPerGroup);
        yield return new MyGroup(g.Num, allotted);
        remaining -= allotted;
    }
}
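
For reference, the loop can be wrapped in a complete iterator method. This is only a sketch under a few assumptions: `Portioner`, `Split` and `SplitIterator` are hypothetical names, C# 7 value tuples stand in for `MyGroup`, and the grouping step is folded into the method so it takes the raw numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Portioner
{
    // Hypothetical helper: emits (Num, Qty) rows where no row's Qty exceeds maxPerGroup.
    public static IEnumerable<(int Num, int Qty)> Split(IEnumerable<int> nums, int maxPerGroup)
    {
        // Validate eagerly; a non-positive limit would otherwise loop forever.
        if (maxPerGroup <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxPerGroup));
        return SplitIterator(nums, maxPerGroup);
    }

    private static IEnumerable<(int Num, int Qty)> SplitIterator(IEnumerable<int> nums, int maxPerGroup)
    {
        foreach (var g in nums.GroupBy(n => n))
        {
            int remaining = g.Count();
            while (remaining > 0)
            {
                // Take a full portion, or whatever is left for the last row.
                int allotted = Math.Min(remaining, maxPerGroup);
                yield return (g.Key, allotted);
                remaining -= allotted;
            }
        }
    }
}
```

With the question's input {1,1,1,2,2,2,3,3,3} and a limit of 2, `Portioner.Split(nums, 2)` yields (1, 2), (1, 1), (2, 2), (2, 1), (3, 2), (3, 1).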
Aaronaught
You're right about the LINQ method being too complex. Thanks.
JKJKJK
A: 

Aaronaught's excellent answer doesn't cover the possibility of getting the best of both worlds: using an extension method to wrap an iterative solution so that it can be called from a LINQ chain.

Untested:

public static IEnumerable<IEnumerable<U>> SplitByMax<T, U>(
  this IEnumerable<T> source,
  int max,
  Func<T, int> maxSelector,
  Func<T, int, U> resultSelector
)
{
  foreach(T x in source)
  {
    int number = maxSelector(x);
    List<U> result = new List<U>();
    do
    {
      int allotted = Math.Min(number, max); 
      result.Add(resultSelector(x, allotted));
      number -= allotted;
    } while (number > 0 && max > 0);

    yield return result;
  }
}

Called by:

var query = grouped.SplitByMax(
  10,
  o => o.Qty,
  (o, i) => new {Num = o.Num, Qty = i}
)
.SelectMany(split => split);
David B
+1  A: 

Some of the other answers are making the LINQ query far more complex than it needs to be. Using a foreach loop is certainly faster and more efficient, but the LINQ alternative is still fairly straightforward.

var input = Enumerable.Range(1, 3).SelectMany(x => Enumerable.Repeat(x, 10));
int limit = 4;

var query =
    input.GroupBy(x => x)
         .SelectMany(g => g.Select((x, i) => new { Val = x, Grp = i / limit }))
         .GroupBy(x => x, x => x.Val)
         .Select(g => new { Qty = g.Count(), Num = g.Key.Val });
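
The key step in the query is the integer division `i / limit`, which maps each item's index within its group to a chunk number. A minimal sketch of just that step for a single group, assuming a positive `limit` (`ChunkIndexDemo` and `ChunkSizes` are hypothetical names used only for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkIndexDemo
{
    // Hypothetical helper: sizes of the chunks that `count` items fall into
    // when at most `limit` items are allowed per chunk.
    public static List<int> ChunkSizes(int count, int limit)
    {
        return Enumerable.Range(0, count)
                         // Integer division: indices 0..limit-1 map to chunk 0, the next `limit` to chunk 1, and so on.
                         .GroupBy(i => i / limit)
                         .Select(chunk => chunk.Count())
                         .ToList();
    }
}
```

For 10 items and a limit of 4, the indices 0 to 9 divide to 0,0,0,0,1,1,1,1,2,2, so `ChunkSizes(10, 4)` returns the sizes 4, 4, 2.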
LukeH