views:

36

answers:

1

Let's say you have two collections of integers:

IEnumerable<int> col1=new List<int> {2,3,3,5,7,11,11,11,13};
IEnumerable<int> col2=new List<int> {5,7,7,7,11,19};

Now I would like to create a third collection, col3, such that for every distinct element that appears in either col1 or col2, col3 will contain that element with at least as many occurances as the maximum number of occurrences of the element in either col1 or col2, but no more. Let me show the end result, then elaborate further:

   IEnumerable<int> col3=...;

The contents of col3 should be:

{2,3,3,5,7,7,7,11,11,11,13,19}

Another way to describe the contents of col3 as the result of this "pseudo-union" operation is that it should contain enough elements of each value, but no more, such that either of the two original collections could be formed individually (i.e., one at a time, from the entire domain of numbers in col3) by extracting elements from col3.

In case there is still confusion by what I mean when I say, "one at a time", imagine that col1 and col2 are instead collections of different types of marbles, with duplicates. I want to form col3 such that I have the minimum amount of the different types of marbles, so that I can remove enough marbles from col3 to form col1, then put the marbles back into col3, then remove enough marbles to form col2.

I would love it if the answer used LINQ to come up with a single expression that solves the problem, because the two methods I've thought of so far involve:

  1. Merging dictionaries which contain histograms (i.e., number of occurences of each different integer).
  2. Constructing a dictionary from col1 and then doing a "mark and insert" type operation while processing col2, resulting in a final dictionary with just enough count for each integer to satisfy my criteria.

Update: The input collections are guaranteed to be sorted (i.e., monotonic and non-decreasing) and the resulting collection should be in the same order.

+4  A: 

Like this:

var groups1 = col1.ToLookup(e => e);
var groups2 = col2.ToLookup(e => e);

var col3 = col1.Union(col2)
               .SelectMany(e => Enumerable.Repeat(e, 
                   Math.Max(
                            groups1[e].Count(), 
                            groups2[e].Count()
                   )
                ));
SLaks
Wow that was fast! Let me test this solution...
Michael Goldshteyn
You need four closing parens at the end of that expr...
Michael Goldshteyn
It seems to solve the problem. I wonder if there's any way to do this without the two groups intermediate vars.
Michael Goldshteyn
@Michael: Not without writing a loop. (perhaps in a custom extension method)
SLaks
Well I suppose you could replace groups1/groups2 with their formation code (i.e., col1.ToLookup(e => e)), but this would be inefficient since they would be recreated at each iteration of SelectMany
Michael Goldshteyn
@Michael: You could write `col1.Count(c => c == e)`, which would be slightly less inefficient.
SLaks
Yes, but it would also amount to a linear search in each collection for each count vs. hopefully a logarithmic (or constant time, if they're hashes) lookup in the groups1 dictionaries.
Michael Goldshteyn
I realize that. That's why I said _slightly_.
SLaks
... and I meant lookups, not dictionaries in my comment ;)
Michael Goldshteyn