views:

86

answers:

1

I'm writing a Digital Fountain system in C#. Part of this system creates me sets of integers, I need to find the combinations of sets which create can leave me with a set of just one item. What's the fastest way to do this?

Set A: 1,2,3,4,5,6
Set B: 1,2,3,4,6
Set C: 1,2,3
Set D: 5,6

Solutions:
A - B => 5
A - (C + D) => 4

I don't need to find all combinations, just enough to find me as many unique numbers as possible. This may be possible to exploit to create a more efficient algorithm.

An important point that I forgot to mention: I do not know, beforehand, how many sets there are, instead I add them one by one, and must determine each time if I have found every number I require. So the algorithm must be something which can be run in stages as new sets are added.

Nb. Solutions in C# get bonus marks ;)

+1  A: 

i think some nice solutions can be gained by some sort of modification of using greedy set cover (http://en.wikipedia.org/wiki/Set_cover_problem) algorithm.

[pseudocode] so:

1. sort sets by size descending
2.
foreach set in sets do:
  uncovered = set.size
  while uncovered > 1
    current_set = the biggest set that covers no more than (uncovered - 1) and was not used before to cover set
    uncovered = uncovered - covered_by_set(set)
    collect current_set to some array
  end
end

edit:

  • you can ommit foreach loop for last set
  • this will bring you no more than one solution for each of sets (to fix this you can change problem directly into set cover problem and use greedy set cover), for example if you array [1,3,4], you need to find solution of SCV problem for all subsets of it that have size = 2: [1,3], [1,4], [3,4]. it will make problem much more complex
  • another way that you may consider are evolution algorithms (representation here will be very simple, treat specified number as bit, fitness function should grow closer to 1), but this still don't solve problem of adding new set after calculations (maybe when you have best population from last problem, then after adding new set just add new place in chromosome)
dfens
That's a nice looking solution. But how well will it work with the constraint of constantly adding new sets (which I forgot to mention previously, sorry)?
Martin
this will be harder, you can do one step only for new set for new set (checking sets with smaller sizes) or do it for added set and each greater set (this will work slow if you will be adding sets often)
dfens
Just a note, I'm not ignoring this answer. I'll upvote sometime soon, I'm busy trying to implement a variation of this and seeing how fast it is.
Martin
The final algorithm I used to solve the problem eliminated the need for this altogether. Good answer though, thanks :)
Martin