views:

1510

answers:

5

Suppose I am given a sorted list of elements and I want to generate all subsets satisfying some condition, so that if a given set does not satisfy the condition, then a larger subset will also not satisfy it, and all sets of one element do satisfy it.

For example, given a list of all positive integers smaller than 100, determine subsets whose sum is smaller than 130: (100,29) (0,1,40), (0), etc...

How can I do that (preferably in Python)?

Thanks! :)

+1  A: 
Charlie Martin
Makes no sense to iterate over all sets; if the sum of (x,y) is bigger than 100, there's no need to check any remaining subset with x,y! (For example: (40,79,50) has a sum greater than 100, so theres' no need to check (40,79,50,1), (40,79,50,66,1) etc...
ooboo
But that's only one example of the possible conditions. What if the condition is that the sum is less than 5050 (which is the sum of 1 through 100)? Then *all* subsets will satisfy the condition.
Charlie Martin
But the fact that, in the worst scenario, you will get O(2^n) time complexity doesn't should stop us to search a good heuristic to solve the problem.
akappa
The roblem statement doesn't admit any general optimizations: the selection predicate is open. Your solution will work for the *example*.
Charlie Martin
+3  A: 

You can generate all the subsets using a Branch-and-bound technique: you can generate all the subsets in an incremental fashion (generating superset of subsets already determined), using as a prune condition "does not explore this branch of the tree if the root does not satify the constraint".

If you want to be generic regarding the constraint, I think this is the best strategy.

Be sure to write in a correct manner the code that generates the subsets, otherwise you generate many time the same subsets: in order to avoid memoization, which can be time-consuming due to map lookups and introduce memory overhead, you can generate the subsets in that manner:

GetAllSubsets(List objects) {
    List generated = {};
    GetAllSubsets(generated, [], objects);
    return generated;
}

GetAllSubsets(List subsetGenerated, List objectFixed, List objectsToFix) {
    GetAllSubsets(subsetGenerated, objectFixed, objectsToFix.sublist(1, objectsToFix.length());
    if (satisfy(toCheck = objectsFixed.add(objectsToFix.get(0)))) {
        subsetGenerated.add(toCheck);
        GetAllSubsets(subsetGenerated, toCheck, objectsToFix.sublist(1, objectsToFix.length());
    }
}

In fact, the subsets added by the first invocation of GetAllSubsets doesn't have the first element of objectsToFix, where the subsets added by the second call (if the pruning condition isn't violated) have that element, so the intersection of the two sets of subsets generated is empty.

akappa
+2  A: 

You could construct your sets recursively, starting with the empty set and attempting to add more elements, giving up on a recursive line of execution if one of the subsets (and thus all of its supersets) fails to meet the condition. Here's some pseudocode, assuming a set S whose condition-satisfying subsets you would like to list. For convenience, assume that the elements of S can be indexed as x(0), x(1), x(2), ...

EnumerateQualifyingSets(Set T)
{
    foreach (x in S with an index larger than the index of any element in T)
    {
            U = T union {x}

            if (U satisfies condition)
            {
                print U
                EnumerateQualifyingSets(U)
            }
    }
}

The first call would be with T as the empty set. Then, all the subsets of S matching the condition would be printed. This strategy relies crucially on the fact that a subset of S which does not meet the condition cannot be contained in one that does.

PeterAllenWebb
This pseudocode generates many time the same subsets: this extra-complexity could defy the optimization of the pruning.You should add memoization in order to fix the problem.
akappa
oops. you are right. i will correct after work. thanks.
PeterAllenWebb
ok. should be good now.
PeterAllenWebb
Yep, now it works well :)
akappa
+1  A: 

I've done something similar for a class schedule generating algorithm. Our schedule class had 2 elements - a list of courses added to the schedule, and a list of courses available to add.

Pseudocode:

queue.add(new schedule(null, available_courses))
while( queue is not empty )
    sched = queue.next()
    foreach class in sched.available_courses
        temp_sched = sched.copy()
        temp_sched.add(class)
        if(temp_sched.is_valid())
            results.add(temp_sched)
            queue.add(temp_sched)

The idea is to start with an empty schedule and the list of available classes, and to search down the tree for valid schedules (valid meaning fits the requirements given by the user, doesn't have time conflicts, etc). If a schedule is invalid, it is thrown away - we can't make an invalid schedule valid by adding classes (ie, pruning the tree).

It should be pretty easy to modify this to work with your problem.

baumgart
+1  A: 

Here's a concrete example of akappa's answer, using a recursive function to generate the subsets:

def restofsubsets(goodsubset, remainingels, condition):
    answers = []
    for j in range(len(remainingels)):
        nextsubset = goodsubset + remainingels[j:j+1]
        if condition(nextsubset):
            answers.append(nextsubset)
            answers += restofsubsets(nextsubset, remainingels[j+1:], condition)
    return answers

 #runs slowly
 easieranswer = restofsubsets([], range(101), lambda l:sum(l)<40)

 #runs much faster due to eliminating big numbers first
 fasteranswer = restofsubsets([], range(100,-1,-1), lambda l:sum(l)<40)

 #runs extremely slow even with big-numbers-first strategy
 finalanswer = restofsubsets([], range(100,-1,-1), lambda l:sum(l)<130)