Is there a way to generate all of the subset sums s1, s2, ..., sk that fall in a range [A,B] faster than O((k+N)*2^(N/2)), where k is the number of sums in [A,B]? Note that k is only known after we have enumerated all subset sums within [A,B].

I'm currently using a modified Horowitz-Sahni algorithm. For example, I first call it for the smallest sum greater than or equal to A, giving me s1. Then I call it again for the next smallest sum greater than s1, giving me s2. Repeat this until we find a sum sk+1 greater than B. There is a lot of computation repeated between each iteration, even without rebuilding the initial two 2^(N/2) lists, so is there a way to do better?

In my problem, N is about 15, and the magnitude of the numbers is on the order of millions, so I haven't considered the dynamic programming route.

+3  A: 

Check the subset sum problem on Wikipedia. As far as I know, the algorithm described there is the fastest known, running in O(2^(N/2)) time.

Edit: If you're looking for multiple possible sums, instead of just 0, you can save the end arrays and just iterate through them again (which is roughly an O(2^(N/2)) operation) and save re-computing them. The values of all the possible subsets don't change with the target.

Edit again: I'm not wholly sure what you want. Are we running K searches for one independent value each, or looking for any subset that has a value in a specific range that is K wide? Or are you trying to approximate the second by using the first?

Edit in response: Yes, you do get a lot of duplicate work even without rebuilding the list. But if you don't rebuild the list, that's not O(k * N * 2^(N/2)). Building the list is O(N * 2^(N/2)).

If you know A and B right now, you could begin iteration, and then simply not stop when you find the right answer (the bottom bound), but keep going until it goes out of range. That should be roughly the same as solving subset sum for just one solution, involving only +k more ops, and when you're done, you can ditch the list.

More edit: You have a range of sums, from A to B. First, you solve subset sum problem for A. Then, you just keep iterating and storing the results, until you find the solution for B, at which point you stop. Now you have every sum between A and B in a single run, and it will only cost you one subset sum problem solve plus K operations for K values in the range A to B, which is linear and nice and fast.

s = *i + *j;
if s > B then ++i;
else if s < A then ++j;
else { print s; ... what_goes_here? ... }

No, no, no. I get the source of your confusion now (I misread something), but it's still not as complex as what you had originally. If you want to find ALL combinations within the range, instead of one, you will just have to iterate over all combinations of both lists, which isn't too bad.

Excuse my use of auto. C++0x compiler.

#include <algorithm>
#include <vector>

std::vector<int> sums;
std::vector<int> firstlist;
std::vector<int> secondlist;
// Fill in first/secondlist. A and B are the range bounds from the question.
std::sort(firstlist.begin(), firstlist.end());
std::sort(secondlist.begin(), secondlist.end());
// Since we want all sums in a range, rather than just the first, we need to
// check all combinations. Horowitz/Sahni is only designed to find one.
for (auto firstit = firstlist.begin(); firstit != firstlist.end(); ++firstit) {
    // Restart the inner iterator for each element of the first list.
    for (auto secondit = secondlist.begin(); secondit != secondlist.end(); ++secondit) {
        int sum = *firstit + *secondit;
        if (sum > A && sum < B)
            sums.push_back(sum);
    }
}

It's still not great. But it could be optimized if you know in advance that N is very large, for example by mapping or hashing sums to iterators, so that any given firstit can find its suitable partners in secondlist without scanning everything, reducing the running time.
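
As an illustration of that optimization (a sketch, not part of the original answer): since secondlist is already sorted, a binary search can stand in for the hash map and locate each element's partners directly. In Python with bisect:

from bisect import bisect_left, bisect_right

def sums_in_range(firstlist, secondlist, A, B):
    """For every x in firstlist, binary-search the sorted secondlist for the
    partners y with A < x + y < B (strict bounds, mirroring the C++ above)."""
    secondlist = sorted(secondlist)
    sums = []
    for x in firstlist:
        lo = bisect_right(secondlist, A - x)   # first y with x + y > A
        hi = bisect_left(secondlist, B - x)    # first y with x + y >= B
        sums.extend(x + y for y in secondlist[lo:hi])
    return sums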

DeadMG
Thanks for such a quick reply! I am using that algorithm described on Wikipedia in the inner loop of my algorithm. But since I'm looking for a _range_ of subset sum values instead of just a single one, I'm wondering if there were some optimization for the new twist.
Eric
If you were to save the end arrays, you could iterate through them looking for any arbitrary end point without needing to recompute them.
DeadMG
I've edited my question hopefully to clarify it with respect to your questions. Is my problem description still ambiguous?
Eric
@Eric: Responded.
DeadMG
I truly appreciate your attention to this. I think this is a counterexample to your algorithm's correctness (more likely I just misunderstand something). Per your algorithm, we keep the best sum s_best that is greater than our lower bound A. As it runs, it tests subset sums both >= s_best and <= s_best but we only update s_best when a sum is in (A,s_best). Sometimes though, a generated sum s_ignored will be in (s_best, B] but we have ignored it! Since we are iterating monotonically through the two 2^(N/2) lists, we will never again see s_ignored and will be missing a solution.
Eric
I think if we were to keep a set of all sums found within [A,B] during the run of your algorithm, there are still branch points(?) Where did my reasoning take a wrong turn?
Eric
@Eric: There is no s_best. You just take every sum you find between the time that you find the solution for A, and the time that you find the solution for B. Assuming that you iterate in the correct direction such that A occurs before B, then you are guaranteed to receive every subset that has a sum in the range A, B.
DeadMG
This is a little hard for me to grasp. Would you be willing to provide pseudocode to help me understand this? For lists L1 and L2 in ascending order, initializing i=L1.rbegin() and j=L2.begin(), we would have something like: s = *i + *j; if s > B then ++i; else if s < A then ++j; else { print s; ... what_goes_here? ... }
Eric
Based on your description, is it correct to say function what_goes_here is "for(m = j; *i + *m <= B; ++m) print *i + *m;"... but this adds k^2 operations so I don't know if this is what you meant?
Eric
@DeadMG: Iterating over all combinations of both lists is O(2^N), you might as well generate all possible subsets instead of doing the N/2-N/2 split!
Moron
@Moron: Except that when he generates the lists, he won't have to recompute them. In addition, you're right, but since he wants to find every sum, not just one specific one or one subset with one specific value, he is going to have to enumerate every possible sum.
DeadMG
@DeadMG. What I meant was, since you consider all possible combinations making it O(2^N), you might as well keep it simple by just generating all possibilities, thus saving some time and code complexity.
Moron
@Moron: You're probably right.
DeadMG
Doing the N/2-N/2 split and running the Horowitz algorithm k times after the sort only requires N*2^(N/2) + k*2^(N/2) = (k+N)*2^(N/2) time, where the first term of the sum is the sort and the second term is performing k searches. I'm pretty sure that the way I'm doing it now is (k+N)*2^(N/2) and not 2^N.
Eric
@Eric: Who was that comment addressed to? What DeadMG is doing at the end is doing a nested loop with the two sorted arrays of 2^(N/2) size each, making it O(2^N). As to the time complexity, it is possible to do in O(K + N*2^(N/2)). You can in fact do it in O(N*2^(N/2)) if you are willing to deal with a different representation of the possible candidate subsets. See my answer.
Moron
+2  A: 

It is possible to do this in O(N*2^(N/2)), using ideas similar to Horowitz-Sahni, but with some optimizations to reduce the constants in the big-O.

We do the following

  • Step 1: Split the input into two halves of N/2 elements each, and generate all 2^(N/2) subsets (and their sums) for each half. Call them S1 and S2. This we can do in O(2^(N/2)) (note: the N factor is missing here, due to an optimization described below).

  • Step 2: Next sort the larger of S1 and S2 (say S1) in O(N*2^(N/2)) time (we optimize here by not sorting both).

  • Step 3: Find subset sums in the range [A,B] in S1 using binary search (as it is sorted).

  • Step 4: Next, for each sum in S2, use binary search to find the sets in S1 whose sum combined with it falls in the range [A,B]. This is O(N*2^(N/2)). At the same time, check whether the corresponding sum from S2 is itself in the range [A,B]. The optimization here is to combine the loops. Note: this gives you a representation of the sets (in terms of indexes into S1 and S2), not the sets themselves. If you want all the sets, this becomes O(K + N*2^(N/2)), where K is the number of sets. (A sketch of Steps 2-4 is given below.)

Further optimizations might be possible; for instance, when the sum from S2 is negative, we don't need to consider sums < A, etc.
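
Here is a rough sketch of Steps 2-4 (my own illustration; the function name and the inclusive bounds are assumptions, and S1 and S2 are plain lists of sums):

from bisect import bisect_left, bisect_right

def sums_in_range_meet_in_middle(S1, S2, A, B):
    """Sort only S1 (Step 2); then for each sum s2 in S2, binary-search S1 for
    the partners s1 with A <= s1 + s2 <= B (Steps 3 and 4; the empty set in S2
    covers Step 3). Returns (index into S2, index range in S1) triples, i.e. a
    compact representation of the solution sets rather than the sets themselves."""
    S1 = sorted(S1)
    hits = []
    for j, s2 in enumerate(S2):
        lo = bisect_left(S1, A - s2)    # first s1 with s1 + s2 >= A
        hi = bisect_right(S1, B - s2)   # one past the last s1 with s1 + s2 <= B
        if lo < hi:
            hits.append((j, lo, hi))
    return hits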

Since Steps 2,3,4 should be pretty clear, I will elaborate further on how to get Step 1 done in O(2^(N/2)) time.

For this, we use the concept of Gray codes. Gray codes are a sequence of binary bit patterns in which each pattern differs from the previous pattern in exactly one bit. Example: 00 -> 01 -> 11 -> 10 is a Gray code on 2 bits.

There are Gray codes which run through all possible N/2-bit numbers, and they can be generated iteratively (see the wiki page I linked to): given the current bit pattern, we can generate the next one in O(1) time, for a total of O(2^(N/2)) steps.

This enables us to form all the subset sums: we take the previous sum and add or subtract the single number corresponding to the differing bit position to get the next sum.
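
For concreteness, here is a small sketch of that idea (my own illustration, not from the answer): walk the standard binary-reflected Gray code and update a running sum with one add or subtract per step.

def gray_code_subset_sums(nums):
    """Generate all 2**len(nums) subset sums in Gray-code order, updating the
    running sum with a single add or subtract per step (output is not sorted)."""
    sums = [0]                           # Gray code 0 = empty subset
    current = 0
    prev_gray = 0
    for step in range(1, 1 << len(nums)):
        gray = step ^ (step >> 1)        # binary-reflected Gray code of step
        changed = gray ^ prev_gray       # exactly one bit differs from the previous word
        i = changed.bit_length() - 1     # index of the element that flipped
        if gray & changed:
            current += nums[i]           # element i entered the subset
        else:
            current -= nums[i]           # element i left the subset
        prev_gray = gray
        sums.append(current)
    return sums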

Moron
+1  A: 

If you modify the Horowitz-Sahni algorithm in the right way, then it's hardly slower than the original Horowitz-Sahni. Recall that Horowitz-Sahni works with two lists of subset sums: Sums of subsets in the left half of the original list, and sums of subsets in the right half. Call these two lists of sums L and R. To obtain subsets that sum to some fixed value A, you can sort R, and then look up a number in R that matches each number in L using a binary search. However, the algorithm is asymmetric only to save a constant factor in space and time. It's a good idea for this problem to sort both L and R.

In my code below I also reverse L. Then you can keep two pointers into R, updated for each entry in L: A pointer to the last entry in R that's too low, and a pointer to the first entry in R that's too high. When you advance to the next entry in L, each pointer might either move forward or stay put, but they won't have to move backwards. Thus, the second stage of the Horowitz-Sahni algorithm only takes linear time in the data generated in the first stage, plus linear time in the length of the output. Up to a constant factor, you can't do better than that (once you have committed to this meet-in-the-middle algorithm).

Here is Python code with example input:

# Input
terms = [29371, 108810, 124019, 267363, 298330, 368607,
    438140, 453243, 515250, 575143, 695146, 840979, 868052, 999760]
(A,B) = (500000,600000)

# Subset iterator stolen from Sage
def subsets(X):
    yield []; pairs = []
    for x in X:
        pairs.append((2**len(pairs),x))
        for w in xrange(2**(len(pairs)-1), 2**(len(pairs))):
            yield [x for m, x in pairs if m & w]

# Modified Horowitz-Sahni with toolow and toohigh indices
L = sorted([(sum(S),S) for S in subsets(terms[:len(terms)/2])])
R = sorted([(sum(S),S) for S in subsets(terms[len(terms)/2:])])
(toolow,toohigh) = (-1,0)
for (Lsum,S) in reversed(L):
    while toolow < len(R)-1 and R[toolow+1][0] < A-Lsum: toolow += 1
    while toohigh < len(R) and R[toohigh][0] <= B-Lsum: toohigh += 1
    for n in xrange(toolow+1,toohigh):
        print '+'.join(map(str,S+R[n][1])),'=',sum(S+R[n][1])

"Moron" (I think he should change his user name) raises the reasonable issue of optimizing the algorithm a little further by skipping one of the sorts. Actually, because each list L and R is a list of sizes of subsets, you can do a combined generate and sort of each one in linear time! (That is, linear in the lengths of the lists.) L is the union of two lists of sums, those that include the first term, term[0], and those that don't. So actually you should just make one of these halves in sorted form, add a constant, and then do a merge of the two sorted lists. If you apply this idea recursively, you save a logarithmic factor in the time to make a sorted L, i.e., a factor of N in the original variable of the problem. This gives a good reason to sort both lists as you generate them. If you only sort one list, you have some binary searches that could reintroduce that factor of N; at best you have to optimize them somehow.

At first glance, a factor of O(N) could still be there for a different reason: If you want not just the subset sum, but the subset that makes the sum, then it looks like O(N) time and space to store each subset in L and in R. However, there is a data-sharing trick that also gets rid of that factor of O(N). The first step of the trick is to store each subset of the left or right half as a linked list of bits (1 if a term is included, 0 if it is not included). Then, when the list L is doubled in size as in the previous paragraph, the two linked lists for a subset and its partner can be shared, except at the head:

     0
     |
     v
1 -> 1 -> 0 -> ...

Actually, this linked list trick is an artifact of the cost model and never truly helpful. This is because, in order to have pointers in a RAM architecture with O(1) cost, you have to define data words with O(log(memory)) bits. But if you have data words of this size, you might as well store each word as a single bit vector rather than with this pointer structure. I.e., if you need less than a gigaword of memory, then you can store each subset in a 32-bit word. If you need more than a gigaword, then you have a 64-bit architecture or an emulation of it (or maybe 48 bits), and you can still store each subset in one word. If you patch the RAM cost model to take account of word size, then this factor of N was never really there anyway.
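
As a small illustration of the one-word representation (my own example, not from the answer): pair each sum with an integer bitmask, so reporting a solution costs one word per subset.

def half_sums_with_masks(half):
    """Return (sum, bitmask) pairs for every subset of one half; the bitmask
    records which terms are included, so a subset costs O(1) words to store.
    (The sums themselves could also be produced by the Gray-code walk shown
    earlier, to avoid the extra factor of N in this naive loop.)"""
    out = []
    for mask in range(1 << len(half)):
        out.append((sum(half[i] for i in range(len(half)) if mask >> i & 1), mask))
    return out

def decode(mask, half):
    """Expand a bitmask back into the actual terms when printing a solution."""
    return [half[i] for i in range(len(half)) if mask >> i & 1]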

So, interestingly, the time complexity for the original Horowitz-Sahni algorithm isn't O(N*2^(N/2)), it's O(2^(N/2)). Likewise the time complexity for this problem is O(K+2^(N/2)), where K is the length of the output.

Greg Kuperberg
The reason why I suggested not to sort both: sorting both + linear time is c*R*logR + c*L*logL + L + R. Sorting just one and doing binary search is c*R*logR + c'*L*logR, which for c > c' > 1, if my calculations are correct, will work out smaller. And more optimizations are possible by cutting down the area to binary search based on previous min and max values seen etc.
Moron
See the extended answer.
Greg Kuperberg
+1. Interesting! When the wiki said Horowitz-Sahni was the best found so far, I never stopped to think about getting rid of the log factor. Perhaps you should elaborate on the linked list sharing and add it here/to the wiki/publish it somewhere.
Moron
Per your suggestion, I added an explanation of the data storage issue. If I may also make a suggestion, can you change your screen name in SO?
Greg Kuperberg
Unfortunately the name has stuck and past conversations and answers which refer to this name will only confuse people who stumble upon them. Thank you for the suggestion, though.
Moron