I have been asked to program a routine to decide the number of possible combinations in a product configurator.

The configurator is really simple. Even though it has more features than this, it can be modeled as several "radio groups" (like the UI control) where one of n options has to be selected.

The only kind of constraints that can be used is rules that says if one option is selected another option can not be selected.

So what I want to do is to calculate the number of different products that can be configured, given a set of option groups and constraints.

I took a naive approach to solving this using the inclusion-exclusion principle. However, as far as I can see, any algorithm based on this method must run in O(2^n), which won't work. There are of course several possible optimizations that should give decent runtimes, but worst-case scenarios are still easily constructed.

That's pretty much where I am right now. Any suggestions?

Update

I realize I haven't explained well enough how the rules apply.

There are several groups with options. One and only one option must be selected in each group. A group can contain one or more options.

There is only one type of constraint: if option A in some group is selected, then option B in some other group can't be selected. There can be any number of constraints, and there is no limit on how many constraints/rules apply to an option group or to an option itself.

So an example would be:

Group 1:
x1 x2 x3 x4 x5

Group 2:
y1 y2 y3

Group 3:
z1 z2 z3 z4

Constraints:
x1 <-> y2 *
x1 <-> z4
y2 <-> z2

* If option x1 is selected in group 1, then option y2 in group 2 cannot be selected.

Using inclusion-exclusion I would calculate the number of combinations as

Combinations = C(no rules) - Cr[1] - Cr[2] - Cr[3] + Cr[1,2] + Cr[1,3] + Cr[2,3] - Cr[1,2,3]

Where

C(no rules) = 5 * 3 * 4

Cr[a,b,c] = the number of combinations that violate rules a, b and c.

The method unfortunately requires 2^|rules| calculations.
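For an instance as small as this example, the exact count can be cross-checked by brute-force enumeration. This is my own sketch (not from the question), exponential in the number of groups, but useful for validating any cleverer method:

```python
from itertools import product

# The example from the update: three groups and three exclusion rules.
groups = [["x1", "x2", "x3", "x4", "x5"],
          ["y1", "y2", "y3"],
          ["z1", "z2", "z3", "z4"]]
rules = [("x1", "y2"), ("x1", "z4"), ("y2", "z2")]

def count_by_enumeration(groups, rules):
    """Count valid configurations by checking every combination."""
    count = 0
    for combo in product(*groups):
        if not any(a in combo and b in combo for a, b in rules):
            count += 1
    return count

print(count_by_enumeration(groups, rules))  # -> 50 of the 60 raw combinations
```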

A: 

Won't it just be x^n, where n is the number of options and x is the number of choices per option?

samoz
You're essentially right, but you assume that each option has the same number of choices, which is probably not the case.
Draemon
+2  A: 

If you have N option groups, each with Xi options (0<=i<N) then

X0*X1*...*X(N-1)

Gives you the answer you want. In other words, multiply the size of each group of options.
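In code, that's just a product over the group sizes (a trivial sketch; `math.prod` requires Python 3.8+):

```python
from math import prod  # Python 3.8+

group_sizes = [5, 3, 4]  # the question's three groups
print(prod(group_sizes))  # 60 combinations, ignoring constraints
```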

Draemon
What about the mentioned constraints?
AakashM
The OP isn't clear about what those constraints are. I had assumed he meant a fixed rule dictated that "only one option in a group can be selected", but if he meant something else he needs to clarify his question.
Draemon
I've updated my original post with more details about the constraints.
Daniel
+1  A: 

If you have n parameters with Ci possible values for each parameter and m constraints, an upper bound for the number of configurations (ignoring the constraints) is the following.

N0 = C1 * C2 * ... * Cn

A single constraint of the form ci == x => cj != y disallows the following number of configurations.

        N0
Dk = -------
     Ci * Cj

Hence the number of configurations is obtained by subtracting the disallowed configurations from the upper bound that ignores the constraints.

N = prod(Ci, i = 1...n) * (1 - sum(1 / (Cxj * Cyj), j = 1...m))

Here xj and yj are the two parameter indices from the j-th constraint.

Example

Parameters    n = 4

Parameter 1   C1 = 4   0 1 2 3 
Parameter 2   C2 = 3   4 5 6 
Parameter 3   C3 = 2   X Y
Parameter 4   C4 = 3   7 8 9

Constraints   m = 2

Constraint 1  c2 == 4 => c3 != X
Constraint 2  c3 == X => c4 != 9

N0 = 4 * 3 * 2 * 3 = 72

D1 = 72 / (3 * 2) = 12
D2 = 72 / (2 * 3) = 12

N = 72 - (12 + 12) = 48
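As a sketch, this estimate can be computed like so (data is the example above; as the comments below point out, subtracting one Dk per rule over-counts when rules interact):

```python
def first_order_estimate(sizes, constraints):
    """N0 minus one correction Dk = N0/(Ci*Cj) per constraint."""
    n0 = 1
    for s in sizes:
        n0 *= s
    disallowed = sum(n0 // (sizes[i] * sizes[j]) for i, j in constraints)
    return n0 - disallowed

# The example above: four parameters, two constraints, with 0-based
# indices of the two parameters each constraint touches.
sizes = [4, 3, 2, 3]
constraints = [(1, 2), (2, 3)]
print(first_order_estimate(sizes, constraints))  # 72 - 12 - 12 = 48
```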

UPDATE

I think this is not yet completely correct because it does not account for dependencies between the constraints.

Daniel Brückner
The problem is that when you have several rules, you can't just subtract the Dk value for each rule, because then you will subtract more configurations than those that actually violate the rules.
Daniel
@Daniel is correct, but at least this gives a lower bound on the number of combinations in O(n).
Zac Thompson
+1  A: 

There are no shortcuts here for the general case. It's not as bad as you think. See "Rethinking" below.

Is 2^n really so bad? How many exclusion rules are we talking about here? You really only have to do this once for each configurator, unless the set of rules/options is constantly changing on the fly and requiring dynamic recalculation. If there's really a huge number of rules, then I would not seek an exact solution -- only consider the kth-order intersections and say "number of combinations is at least/most ...". There might be other sieve methods that will let you quickly derive bounds on the answer.

Also keep in mind: if you only consider the exclusions you actually need, then 2^n is merely an upper bound, and your actual number of calculations may be significantly less for any actual scenarios. That is, if C[1,2] is zero, then so is C[1,2,...]. Consider this: for each constraint, "clump" sets of constraints together if they share any options in common. It's clear that your real running time is going to be defined by the size of the largest "clump" (which, yes, may be as big as n).
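The clumping step can be sketched with a union-find over the options mentioned by each constraint (my own illustration; `w1`/`v1` are made-up options for an unrelated rule):

```python
def clump_sizes(constraints):
    """Group constraints whose options overlap, via union-find on options."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for a, b in constraints:
        union(a, b)

    # Count how many constraints fall into each connected component.
    sizes = {}
    for a, b in constraints:
        root = find(a)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)

# The question's three rules all share options, so they form one clump of 3;
# the unrelated rule (w1, v1) is its own clump of 1.
print(clump_sizes([("x1", "y2"), ("x1", "z4"), ("y2", "z2"), ("w1", "v1")]))
# [3, 1]
```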


Rethinking: C[x,y] is going to be zero in most cases. A constraint can only overlap with other constraints involving a different group. In other words, (x1<->y1) can only overlap with (x1<->z1) or (y1<->z2) or something similar, not (x1<->y2). Likewise, a set of constraints can only overlap with a new group: the combination of (x1<->y1) with (y1<->z2) has no interaction with (x3<->z2) (the x group is already fixed at x1). You only have to consider inclusion/exclusion where each rule you add to the combination brings a previously untouched group into the mix. So you are actually O(2^G), where G is the number of groups (and perhaps a different bound based on the sizes of the groups). Much more manageable!

Zac Thompson
lol... I'm working on a configurator at the moment where n > 90... 2^90 is quite big ;-p
Marc Gravell
More than 90 option groups, or constraints? You are bound by the number of groups, not constraints. How many groups are there in the worst case?
Zac Thompson
Regardless, if you are dealing with 2^90, then you'd better stop looking for a precise answer and start looking for approximations, or lower and upper bounds.
Zac Thompson
A: 

I think Zac is thinking in the right direction. Looking at your expression for the number of combinations, you see that the second-order terms Cr[i,j] are much smaller than the Cr[k] terms.

Imagine a cube where each axis is a group of options. Each point in the cube represents a particular combination of options. A first-order correction Cr[k] excludes a line of options between two surfaces of the cube. A second-order correction Cr[i,j] only happens when two such lines meet at one point (one combination of options) in the cube. So regardless of the number of groups you have, the higher-order corrections are always increasingly smaller. If you stick to

Combinations = C(no rules) - Cr[1] - Cr[2] - Cr[3]

you end up with a lower bound for the number of combinations. Now that you know the size of your first-order correction, and thinking about the cube observation above, you can even estimate the order of magnitude of the second-order correction. It will depend on the number of groups. Your algorithm can then decide whether to continue with the higher orders or stop.

Matt
+1  A: 

EDIT

This algorithm is incorrect. I have presented an alternate answer in another post which is still 2^N in the worst case, but might give better results otherwise.

This one happens to work in the example chosen because y2 is part of the exclusion set of x1, and the first two constraints are based on x1. Still, I now see what needs to be done. It's still close to 2^N, but there are optimizations which can lead to significant gains.

To fix this algorithm, the composed rules must be of the form set(oX) <-> set(oY). To compose them, for each constraint with left hand oX that you compose, you also compose it with each rule you have already composed, provided oX is not part of the right-hand side of that composed rule and oX's group is not the same as the group of its left-hand side.

For completely independent constraints, this is 2^N. Otherwise, you are reducing N by:

  • unifying constraints with a common left hand
  • not computing rule combinations which are mutually exclusive; this divides into:
    • not combining rules for options in the same group
    • not combining rules where the left side of one appears on the right side of the other

I don't think fixing this algorithm is worth it. It is rather memory-heavy, and it would have the very same order as my alternate answer, which is much lighter.

End Edit

Let's turn this around. How about this for an algorithm:

  1. Fix rules as necessary to ensure that for rule o1 <-> o2, group(o1) < group(o2)
  2. Compute "composed" rules by folding all rules oX <-> o? into a single rule oX <-> Set(o?)
  3. Compute a "clean" set of groups by removing from them the left option of every rule
  4. Compute alternate sets from the clean set, one for each composed rule, by replacing the group of the left option with the left option itself, and subtracting from the other groups the options of the right hand of the rule.
  5. For each set of groups, compute the number of combinations by multiplying the number of options in each group in the set.
  6. Add all the results from step 5.

Let's see this at work:

Group 1:
x1 x2 x3 x4 x5

Group 2:
y1 y2 y3

Group 3:
z1 z2 z3 z4

Constraints (already fixed):
x1 <-> y2 *
x1 <-> z4
y2 <-> z2

Composed rules:
x1 <-> (y2, z4)
y2 <-> (z2)

Clean set of groups:
x2x3x4x5, y1y3, z1z2z3z4

Alternate sets:
x1, y1y3, z1z2z3 (first composed rule)
x2x3x4x5, y2, z1z3z4 (second composed rule)

Totals:
4 * 2 * 4 = 32
1 * 2 * 3 = 6
4 * 1 * 3 = 12

Total: 50

Now, maybe this algorithm is incorrect. Right now, I can't think clearly enough to prove it correct or otherwise -- I have been too close to the problem for too long. But let's check against the example:

c(no rules) = 60
c1 => 4
c2 => 3
c3 => 5
c12 => 1
c13 => 1
c23 => 0
c123 => 0

c(no rules) - c1 - c2 - c3 + c12 + c13 + c23 - c123 = 50

If my algorithm is correct, it seems to be polynomial. Again, right now I can't think clearly enough, and I'd need to consider the big-O of the manipulation of the sets.

Here is a Scala implementation of it:

case class GroupOption(id: Int, option: Int)
case class Group(id: Int, options: Set[Int])
case class Rule(op1: GroupOption, op2: GroupOption)
case class ComposedRule(op: GroupOption, set: Set[GroupOption])

object ComputeCombinations {
  def fixRules(rules: Set[Rule]) = {
    rules map (rule => if (rule.op1.id > rule.op2.id) Rule(rule.op2, rule.op1) else rule)
  }

  def ruledOptions(id: Int, rules: Set[Rule]): Set[Int] = (
    rules 
    filter (rule => rule.op1.id == id)
    map (rule => rule.op1.option)
  )

  def cleanseSet(groups: Set[Group], rules: Set[Rule]) = {
    groups map (group => 
      Group(group.id, group.options -- ruledOptions(group.id, rules)))
  }

  def composeRules(rules: Set[Rule]): Set[ComposedRule] = Set(
    (
      rules.toList
      sort (_.op1.id < _.op1.id)
      foldLeft (List[ComposedRule]())
      ) { (list, rule) => list match {
        case ComposedRule(option, set) :: tail if option == rule.op1 =>
          ComposedRule(option, set + rule.op2) :: tail
        case _ => ComposedRule(rule.op1, Set(rule.op2)) :: list
      }} : _*
  )

  def subset(groups: Set[Group], composedRule: ComposedRule) = (
    groups
    filter (_.id != composedRule.op.id)
    map (group => Group(group.id, group.options -- 
                                  (composedRule.set 
                                   filter (_.id == group.id)
                                   map (_.option)
                                  )))
  )

  def subsets(groups: Set[Group], composedRules: Set[ComposedRule]) = (
    composedRules map (composedRule => subset(groups, composedRule))
  )

  def combinations(groups: Set[Group]) = (
    groups.toList map (_.options.size) reduceLeft (_*_)
  )

  def allCombinations(groups: Set[Group], rules: Set[Rule]) = {
    val fixedRules = fixRules(rules)
    val composedRules = composeRules(fixedRules)
    val cleanSet = cleanseSet(groups, fixedRules)
    val otherSets = subsets(cleanSet, composedRules)
    val allSets = otherSets + cleanSet
    val totalCombinations = allSets.toList map (set => combinations(set)) reduceLeft (_+_)
    totalCombinations
  }
}

object TestCombinations {
  val groups = Set(Group(1, Set(1, 2, 3, 4, 5)),
                   Group(2, Set(1, 2, 3)),
                   Group(3, Set(1, 2, 3, 4)))
  val rules = Set(Rule(GroupOption(1, 1), GroupOption(2, 2)),
                  Rule(GroupOption(1, 1), GroupOption(3, 4)),
                  Rule(GroupOption(2, 2), GroupOption(3, 2)))
  def test = ComputeCombinations.allCombinations(groups, rules)
}
Daniel
I made a comment in a new post: http://stackoverflow.com/questions/1154565/number-of-combinations-in-configurator/1182079#1182079
Daniel
A: 

Comment to Daniel's post:

Your algorithm looks good, but I couldn't convince myself it really worked, so I installed Scala and ran some tests. Unfortunately I don't get correct results.

For example consider this case:

Group 1:
a1 a2 a3 a4 a5

Group 2:
b1 b2 b3

Group 3:
c1 c2 c3 c4

Group 4:
d1 d2 d3 d4 d5

Rules:
a1 <-> b2
a1 <-> c2
b2 <-> c2
b2 <-> d1
a2 <-> d2

I configured my basic sieve algorithm with this configuration and got the following results (227 combinations):

Without rules => 300
Rules: [1] => 20
Rules: [2] => 15
Rules: [3] => 25
Rules: [4] => 20
Rules: [5] => 12
Order: 1 => 208 (diff: -92)
Rules: [1, 2] => 5
Rules: [1, 3] => 5
Rules: [2, 3] => 5
Rules: [1, 4] => 4
Rules: [2, 4] => 1
Rules: [3, 4] => 5
Rules: [1, 5] => 0
Rules: [2, 5] => 0
Rules: [3, 5] => 1
Rules: [4, 5] => 0
Order: 2 => 234 (diff: 26)
Rules: [1, 2, 3] => 5
Rules: [1, 2, 4] => 1
Rules: [1, 3, 4] => 1
Rules: [2, 3, 4] => 1
Rules: [1, 2, 5] => 0
Rules: [1, 3, 5] => 0
Rules: [2, 3, 5] => 0
Rules: [1, 4, 5] => 0
Rules: [2, 4, 5] => 0
Rules: [3, 4, 5] => 0
Order: 3 => 226 (diff: -8)
Rules: [1, 2, 3, 4] => 1
Rules: [1, 2, 3, 5] => 0
Rules: [1, 2, 4, 5] => 0
Rules: [1, 3, 4, 5] => 0
Rules: [2, 3, 4, 5] => 0
Order: 4 => 227 (diff: 1)
Rules: [1, 2, 3, 4, 5] => 0
Order: 5 => 227 (diff: 0)

***Combinations: 227***
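(For reference, the 227 above can be confirmed by direct enumeration; this is my own check, with options numbered 1..n inside each group:)

```python
from itertools import product

# The four groups as sizes, and the five rules as pairs of
# (group_index, option_number): a1<->b2, a1<->c2, b2<->c2, b2<->d1, a2<->d2.
groups = [5, 3, 4, 5]
rules = [((0, 1), (1, 2)), ((0, 1), (2, 2)), ((1, 2), (2, 2)),
         ((1, 2), (3, 1)), ((0, 2), (3, 2))]

count = 0
for combo in product(*(range(1, size + 1) for size in groups)):
    if not any(combo[g1] == o1 and combo[g2] == o2
               for (g1, o1), (g2, o2) in rules):
        count += 1
print(count)  # 227
```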

However using this code in scala:

  val groups = Set(Group(1, Set(1, 2, 3, 4, 5)),
                   Group(2, Set(1, 2, 3)),
                   Group(3, Set(1, 2, 3, 4)),
                   Group(4, Set(1, 2, 3, 4, 5)))

  val rules = Set(Rule(GroupOption(1, 1), GroupOption(2, 2)),
                  Rule(GroupOption(1, 1), GroupOption(3, 2)),
                  Rule(GroupOption(2, 2), GroupOption(3, 2)),
                  Rule(GroupOption(2, 2), GroupOption(4, 1)),
                  Rule(GroupOption(1, 2), GroupOption(4, 2)))

I got the answer 258.

I've checked the calculations in the sieve method and they seem to be right. Perhaps it is possible to fix your algorithm? I can't really put my finger on what is wrong.

Daniel
I'm now awake, and, indeed, it does not work. I'll edit my post.
Daniel
+3  A: 

Ok, I can't get around 2^N, but I can reduce the sample set. To do this, we'll compute "Composed Constraints". A Composed Constraint is a constraint where, if all the options on the left side are selected, then none of the options on the right side can be selected, and no other constraint based on the left-side options may apply.

We need to compute the set of all possible Composed Constraints from the set of constraints. While not strictly necessary, we'll "fix" the existing constraints by swapping left and right hands whenever the group of the right hand is smaller than the group of the left. This may reduce the number of composed constraints, though better swapping heuristics are possible.

We also need to compute a "minimum set" of options that can be arbitrarily chosen for each group. This minimum set is computed by removing from the list of available options any options appearing on the left hand of a composed constraint.

An algorithm follows, but I'm not proving that it computes CCs correctly. I will prove that, if it does, they can be used to compute the number of possible combinations.

  1. Fix the constraints so that the group of the left hand is less than, or equal to the group of the right hand.
  2. Compose the constraints:

    1. Sort constraints by left hand
    2. Sequentially, for each constraint:

      1. Fold the constraint with all constraints that follow it with the same left hand, turning x1 <-> y1, x1 <-> y2 ... x1 <-> yN into Set(x1) <-> Set(y1 ... yN)
      2. Compose the folded constraint with each already folded constraint, if:
        • x1 is not in the right hand of the already folded constraint
        • x1 is not in the same group of any element in the left hand
      3. Add the folded constraint and all its compositions to the set of folded constraints
  3. Compute the Minimum Set, by taking all options and removing the ones appearing in the left hand of the fixed constraints.

Now you can compute the number of combinations with the formula below. Let's call CC a composed constraint. Then the number of combinations is:

C(Minimum Set) + CCC1 + ... + CCCn

Where:

  • C(Minimum Set) is the number of combinations possible with the minimum set.
  • CCCx is the number of combinations possible by taking the minimum set, replacing any groups for which there's an option on the left hand of CCx with that option, and then removing any options on the right hand of CCx.

Note that the expression is purely additive. This means that, for the expression to yield the expected result, the following conditions must hold:

  1. No two terms of it may contain the same combination.
  2. All combinations must be accounted for by these terms.
  3. No invalid combinations may be yielded by any term.

For the first proof, note that there are no two distinct CCs with the same left hand. If two CCs had the same left hand but different right hands, that would mean there was an additional constraint that must apply to one of the CCs, or an invalid constraint being applied to the other.

Since there are no two CCs with the same left hand, and the minimum set does not contain the left hand of any CC by definition, then any two CC can be distinguished by at least one option which is selected for one but not for the other. Therefore, no two CCs may yield the same combination.

For the second proof, note that the set of CCs contains, by definition, all valid combinations for options on the left hand.

Assume that there is one combination that appears in neither the minimum set nor the set of CCs. If this combination does not contain any left-hand option, then it must be a combination from the minimum set, by definition. Therefore, it must contain options from the left hand.

Since the set of CCs contains all valid combinations for the left hand, there is one CC with the same left-hand options. This combination must therefore have an option that is not included in any combination for that CC. But the only options not included in that CC are the ones appearing in the left hand of other CCs, and the ones that must be excluded from it by constraints. Since neither may be the case, this combination cannot exist.

For the third proof, let's first consider the minimum set. The minimum set does not contain any option in the left hand of any constraint. Since all constraints are between one left-hand and one right-hand option, no constraint applies to the minimum set.

Now let's consider the CCs. A CC has a valid combination of left-hand options by definition. Any option that is incompatible with that left hand must appear in the right hand, and any option from that right hand is removed from the minimum set. Since no options in the minimum set were incompatible with each other to begin with, there can be no unsatisfied constraint in any combination of a CC.

And that ends the proof.

Let's see how this applies with an example from the comments:

G1: x1, x2, x3 
G2: y1, y2 
G3: z1, z2, z3 

R1: x1 <-> y2 
R2: x3 <-> y2 
R3: y1 <-> z1 
R4: y2 <-> z2 
R5: y2 <-> z3

CC1: {x1} <-> {y2}
CC2: {x3} <-> {y2}
CC3: {y1} <-> {z1}
CC4: {x1, y1} <-> {y2, z1}
CC5: {x3, y1} <-> {y2, z1}
CC6: {y2} <-> {z2, z3}

Let's ponder briefly on composed groups not in the list:

R1&R2: {x1, x3} <-> {y2} -- not in the list because x1 and x3 belong to the same
                            group
R1&R5: {x1, y2} <-> {y2, z3} -- not in the list because the left hand of R5, y2,
                                appears in the right hand of R1

Now, let's see what options are possible in each set:

Minimum Set: (x2), (), (z1, z2, z3)    
CC1: (x1), (), (z1, z2, z3) -- replace G1 with x1, remove y2 from G2
CC2: (x3), (), (z1, z2, z3) -- replace G1 with x3, remove y2 from G2
CC3: (x2), (y1), (z2, z3)   -- replace G2 with y1, remove z1 from G3
CC4: (x1), (y1), (z2, z3)   -- replace G1 with x1, G2 with y1, remove y2 and z1
CC5: (x3), (y1), (z2, z3)   -- replace G1 with x3, G2 with y1, remove y2 and z1
CC6: (x2), (y2), (z1)       -- replace G2 with y2, remove z2 and z3 from G3

Now, let's add things up:

C(Minimum Set) = 1 * 0 * 3 = 0
CCC1 = 1 * 0 * 3 = 0
CCC2 = 1 * 0 * 3 = 0
CCC3 = 1 * 1 * 2 = 2
CCC4 = 1 * 1 * 2 = 2
CCC5 = 1 * 1 * 2 = 2
CCC6 = 1 * 1 * 1 = 1

C(Minimum Set) + CCC1 + CCC2  + CCC3 + CCC4 + CCC5 + CCC6
0              + 0    + 0     + 2    + 2    + 2    + 1    = 7
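A quick brute-force check of this example (my own sketch; it enumerates all 18 raw combinations and filters by the five rules) confirms the total:

```python
from itertools import product

# The groups and rules from the example above.
groups = [["x1", "x2", "x3"], ["y1", "y2"], ["z1", "z2", "z3"]]
rules = [("x1", "y2"), ("x3", "y2"), ("y1", "z1"), ("y2", "z2"), ("y2", "z3")]

valid = [combo for combo in product(*groups)
         if not any(a in combo and b in combo for a, b in rules)]
print(len(valid))  # 7, matching the composed-constraint total
```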

I'll add one further thought here. Even though there are only 6 composed constraints for 5 rules, far fewer than the 32 terms expected otherwise, these are computed in 2^N worst-case time, since each rule must be compared and combined with all composed constraints created so far. You may think of them as binary numbers, where a bit is set if the rule is part of the combination and unset if not.

However, incompatible combinations are discarded right away, so for each new rule being combined, no time is lost on combinations already deemed invalid. Also, by sorting the rules beforehand, successive rules in the same group can be discarded without even testing for incompatibility, given the right data structures.

As this particular example shows, the average time can be much better than 2^N.

Alternative Algorithms and Considerations

There's some talk of 2-SAT and 3-SAT going around. It seems to me that this is a 2-SAT problem, in the sense that every constraint a <-> b is actually a clause "!a || !b". So all the constraints together can just be written as "(!x1 || !y2) && (!x1 || !z4) && (!y2 || !z2)", etc. That means you can "solve" it in the sense that you can find out whether there is a boolean assignment to the options that makes it all true. There is a linear algorithm for this by Aspvall, Plass and Tarjan, with a slide presentation here.

But knowing whether the constraints can be satisfied is not what was asked. What was asked is the number of ways all options can be set while keeping the 2-SAT problem true.

Now, there are efficient algorithms for counting the number of ways to satisfy a 2-SAT problem. For instance, this paper presents an algorithm that runs in 1.2561^n. But even this won't help us, as we need to know what the solutions are in order to compute the number of combinations each solution contributes.

According to Wikipedia, this paper has an algorithm which efficiently enumerates all solutions, which is what we want. But if counting is already exponential, so is enumerating. Better than 2^n, perhaps, but still exponential.

If we do enumerate all solutions to the 2-SAT problem, the number of combinations for each solution is given by multiplying, for each group in which the solution selects no option, the number of free options (options that do not appear in any constraint) in that group.

For example, take the previous set of groups and constraints. The 2-SAT problem, including mutual exclusion, is:

(!x1 || !y2) && (!x3 || !y2) && (!y1 || !z1) && (!y2 || !z2) && (!y2 || !z3) && (!x1 || !x3) && (!y1 || !y2) && (!z1 || !z2) && (!z1 || !z3) && (!z2 || !z3)

The first five clauses are the rules. The remaining clauses are the mutual exclusions between options of the same group that appear in the constraint rules.

The solutions to this 2-SAT problem are:

x1    x3    y1    y2    z1    z2    z3    Combinations
true  false true  false false true  false 1
true  false true  false false false true  1
true  false true  false false false false 0
true  false false false true  false false 0
true  false false false false true  false 0
true  false false false false false true  0
true  false false false false false false 0
false true  true  false false true  false 1
false true  true  false false false true  1
false true  true  false false false false 0
false true  false false true  false false 0
false true  false false false true  false 0
false true  false false false false true  0
false true  false false false false false 0
false false true  false false true  false 1
false false true  false false false true  1
false false true  false false false false 0
false false false true  true  false false 1
false false false true  false false false 0
false false false false true  false false 0
false false false false false true  false 0
false false false false false false true  0
false false false false false false false 0

In the first two solutions, there are no groups without a selected option, so the number of combinations is 1. The third solution has no option selected for group G3, so we multiply 1 by 0. In the lines that start with "false false", no option is selected for group G1, which has one free option: x2. So we multiply by 1 for them, and by 0 if no option is selected for G2 or G3 (in which case the number of free options is 0).
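This counting scheme can be reproduced mechanically (my own illustration of the procedure described above): enumerate assignments over the seven constrained options, keep those satisfying the clauses, and weight each by the free options of every group left unselected.

```python
from itertools import product

names = ["x1", "x3", "y1", "y2", "z1", "z2", "z3"]
# Each pair (a, b) stands for the clause (!a || !b) from the formula above.
clauses = [("x1", "y2"), ("x3", "y2"), ("y1", "z1"), ("y2", "z2"), ("y2", "z3"),
           ("x1", "x3"), ("y1", "y2"), ("z1", "z2"), ("z1", "z3"), ("z2", "z3")]
# Constrained options per group, and each group's count of free options.
group_vars = {"G1": ["x1", "x3"], "G2": ["y1", "y2"], "G3": ["z1", "z2", "z3"]}
free = {"G1": 1, "G2": 0, "G3": 0}  # x2 is the only unconstrained option

total = 0
for bits in product([False, True], repeat=len(names)):
    value = dict(zip(names, bits))
    if any(value[a] and value[b] for a, b in clauses):
        continue  # some clause (!a || !b) is violated
    combos = 1
    for group, opts in group_vars.items():
        combos *= 1 if any(value[o] for o in opts) else free[group]
    total += combos
print(total)  # 7
```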

Now, there is the question of how I can enforce one option being chosen in each group and still claim this is 2-SAT. The problem, as stated, has two implicit constraints: for each group there must be one, and only one, option selected. These two constraints can be written as:

    x1 || x2 || x3 (for group x with options x1 .. x3)
    (!x1 || !x2) && (!x1 || !x3) && (!x2 || !x3)

The latter constraint is 2-SAT; the former is 3-SAT for any non-trivial case. As it happens, I do not enforce the former constraint directly; instead, solutions that leave a group without a selection contribute a count of 0. The counting algorithm goes like this:

  • For the constraint-less combinations, multiply the number of constraint-less options in each group by each other.
  • For the constraint-full combinations, add the result of the following counts:
    • For each solution, multiply the number of constraint-less options in each group without an option evaluated to "true" by each other.

So, for every group in which there is at least one constraint-less option, the selection is implicit and anonymous. For every group in which all options are part of some constraint, if no option was selected then the count for that group becomes 0, and, therefore, the number of combinations for that solution becomes 0 as well.

This feels like cheating the problem out of a valid > 2-SAT constraint. After all, if this was possible, then the 3-SAT problem could be solved just by enumerating the solutions to the 2-SAT part of it, and then discarding every one which does not satisfy the 3-SAT part of it. Alas, there is one fundamental difference I can identify:

  • All predicates not solved by the 2-SAT part of the problem are free from any further constraints.

Given this rather restrictive constraint on the clauses, we can solve this problem by enumerating solutions to the 2-SAT explicit constraints.

If anyone wants to pursue this further, go ahead. I'm satisfied with the improvement on the 2n solution I suggested.

Daniel
Is it possible to do better than exponential time? Maybe this is equivalent to counting the number of solutions to 2-SAT?
Jules
sdcvvc
Well, I couldn't quite agree with your reduction to 3SAT. :-) I'll complement the answer.
Daniel
BTW: Counting the number of solutions to 2SAT is #P-complete (en.wikipedia.org/wiki/Sharp-P-complete). If this was polynomially solvable, then P=NP.
sdcvvc
I'm trying to figure out how the combinations in the folded groups are calculated. For example: G1: x1, x2, x3; G2: y1, y2; G3: z1, z2, z3; R1: x1 <-> y2; R2: x3 <-> y2; R3: y1 <-> z1; R4: y2 <-> z2; R5: y2 <-> z3. This would give the following CCs: (x1) <-> (y2), (x3) <-> (y2), (y1) <-> (z1), (x1, y1) <-> (y2, z1), (x3, y1) <-> (y2, z1), (y2) <-> (z2, z3). Using the sieve principle I get 7 possible combinations. No matter how I attempt to calculate the composed options, I can't get the same answer.
Daniel
sdcvvc: polynomial? I said it was exponential.
Daniel
@sdcvvc: Ah, I get what the confusion is. I said that the constraints are 2-SAT, but you don't need to solve them, only enumerate the solutions. I said this was exponential, though one can do better than 2^n. So, I guess we can't collect the P=NP award yet, eh? ;-)
Daniel
My reduction was from 3SAT, not to 3SAT. I don't understand why you mentioned the linear algorithm for checking 2SAT satisfiability in your answer. It won't help here, since to use it you would have to reduce this problem to 2SAT, and that reduction is as hard as checking satisfiability by brute force. This is NP-complete, and a SAT solver should be used instead.
sdcvvc
I mentioned it because the first comment to this answer says "Maybe this is equivalent to counting the number of solutions to 2-SAT". So, in answer to that, I said "this here is equivalent to 2-SAT, but it does not answer the question posed". It's all in the text, you just have to read it.
Daniel
@Daniel: funny, you got the CCs correct in this example -- I myself got them incorrect twice. :-) But I get the correct result. I'll edit my answer with the example.
Daniel
Thank you, now I understand how the CCR values are calculated! I'm gonna make a little test program now.
Daniel
I don't understand how you're calculating the CCR values. A definition would help. I'm not sure I follow your analysis, either ... why NlogN to combine them? I think it's N^2.
Zac Thompson
@Zac Thompson: I'll try to improve the answer. Check it again (once you see it has been edited :).
Daniel
@Zac Thompson: I think the algorithm is broken. I haven't had time to review it, but I'm inclined to think that the examples worked because they had a certain symmetry to them. The problem seems to be that CCR here is half-way from my first algorithm, and half-way from the Combinations algorithm. I need to "push" it either way. And this time I'll write the code and submit it to a test system! :-)
Daniel
Your answer looks really impressive now. :) I haven't yet tested the latest change in the algorithm, but since the bounty is running to its end so soon, I'll accept your answer.
Daniel
+1  A: 

This might not be a directly useful answer, so feel free to ignore it... However, I'm currently working on a similar system myself, and frankly, other than for trivial examples, I'm not sure it is useful to try to calculate the number of valid combinations. As an example, the models I'm working on at the moment have (for example) 18000 candidate items spread over 80+ selections, some single-select, some multi-select. Beyond the smallest models there is no benefit in knowing the number, as you would simply never write it out as a full truth table; you are pretty much forced to run the rules (i.e. remove anything that is no longer valid, auto-select anything as appropriate, and check that no rules are broken) on demand. Which isn't necessarily a problem; my current code processes this model (as a web service) in ~450ms, and most of that is actually time spent processing xml for input/output. If the input/output weren't xml, I think it would be ~150ms, which is cool.

I'd love to convince my employer to open-source the engine; that is a battle for another day, though.

Marc Gravell
A: 

Your problem is rather infeasible.

  • Counting the number of solutions is #P-complete, even if you restrict every group of radioboxes to two options
  • Checking if there is ANY choice of options consistent with constraints is NP-complete
  • Checking if there is ANY choice of options consistent with constraints can be done quite fast, if you restrict every group of radioboxes to two options (2SAT)

So in general don't count on a polynomial algorithm; existence of such algorithms would imply P=NP.

What you can do:

  • Use an approximation algorithm. According to the Wikipedia article I linked, #P-complete problems are often susceptible to them.
  • Use a SAT solver http://en.wikipedia.org/wiki/SAT_solver or a related tool for counting (unfortunately I don't know any); people have created many heuristics, and those programs will in general be much faster than your homemade solution. There are even SAT competitions, so this field is currently expanding very fast.
  • Check whether you need such generality. Maybe your problem has additional assumptions, and those will change its complexity.

Proofs:

  1. Counting the number of solutions is easily shown to be in #P, so it's enough to reduce counting 2SAT solutions (which is #P-complete) to it. Take some 2SAT instance, like

(p1 or not p2) and (p2 or not p3)

Allow the user to choose values of p1, p2, p3. You can easily form constraints that mirror the clauses. Then the number of possible configurations = the number of assignments of p1, p2, p3 making the Boolean formula true.

  2. You can easily check whether some choice of options is allowed, so this is in NP; thus it's enough to reduce 3SAT to this. Take some 3SAT instance, like

(p1 or p2 or not p3) and (p2 or not p1 or p4)

Give options:

group p1 p1true p1false

group p2 p2false p2true

group p3 p3false p3true

group p4 p4false p4true

group clause_1 c1a c1b c1c

group clause_2 c2a c2b c2c

clause_1 will control the first clause: (p1 or p2 or not p3). Precisely, c1a will be selectable only if p1true was chosen, c1b only if p2true was chosen, and c1c only if p3false was chosen. So the constraints are:

p1false <-> c1a

p2false <-> c1b

p3true <-> c1c

Same with clause_2; the constraints are

p2false <-> c2a

p1true <-> c2b

p4false <-> c2c

If the user is able to choose all answers (so the number of configurations is > 0), he'll have proven that there is some valuation of the variables p1, ..., p4 that makes the 3SAT instance true. And conversely, if the user can't choose answers consistent with the assumptions (the number of available configurations = 0), the 3SAT instance is not satisfiable. So knowing whether the answer is > 0 means knowing whether the 3SAT instance is solvable.

Of course this reduction is polynomial-time. End of proof.
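As an illustration (my own sketch, not part of the proof), the reduction can be checked mechanically on the example formula: the number of valid configurations equals, summed over assignments, the product over clauses of the number of true literals, and it is positive exactly when the formula is satisfiable.

```python
from itertools import product

# Each clause is a list of literals (variable, wanted_value); this encodes
# (p1 or p2 or not p3) and (p2 or not p1 or p4) from the proof above.
formula = [[("p1", True), ("p2", True), ("p3", False)],
           [("p2", True), ("p1", False), ("p4", True)]]

def count_configurations(formula):
    """Count valid configurations of the reduction's configurator."""
    variables = sorted({v for clause in formula for v, _ in clause})
    count = 0
    for bits in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, bits))
        # A clause-group option is selectable iff its literal holds under
        # the assignment; each clause group must pick exactly one option.
        choices = 1
        for clause in formula:
            choices *= sum(assign[v] == want for v, want in clause)
        count += choices
    return count

n = count_configurations(formula)
print(n > 0)  # True: the example formula is satisfiable
```

An unsatisfiable formula, e.g. the two one-literal clauses (p1) and (not p1), yields a count of 0.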

If you aren't satisfied with the fact that the answer might be 0: it is still NP-hard if you disregard such configurators. (You would add some "bogus" option to all groups and allow exponentially many choices if "bogus" was not chosen. This is more complex to explain, so I'll do it on request.)

sdcvvc
A: 

This is mentioned briefly in sdcvvc's excellent answer above, but might a Monte Carlo approximation be good enough? Generate N random configurations (where N is chosen so that the power of your experiment is high enough; I don't know enough statistics to help you there), then test how many of them are compatible with your constraints. Extrapolate that proportion to the rest of your configuration space.
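A sketch of that idea, run against the question's original example (where the exact answer is 50 of 60); the sample size and seed are arbitrary choices of mine:

```python
import random

# The question's example: three groups and three exclusion rules.
groups = [["x1", "x2", "x3", "x4", "x5"],
          ["y1", "y2", "y3"],
          ["z1", "z2", "z3", "z4"]]
rules = [("x1", "y2"), ("x1", "z4"), ("y2", "z2")]

def monte_carlo_count(groups, rules, samples=20000, seed=1):
    """Estimate the number of valid configurations by uniform sampling."""
    rng = random.Random(seed)
    space = 1
    for g in groups:
        space *= len(g)
    hits = 0
    for _ in range(samples):
        chosen = {rng.choice(g) for g in groups}
        if not any(a in chosen and b in chosen for a, b in rules):
            hits += 1
    return space * hits / samples

estimate = monte_carlo_count(groups, rules)  # lands near the true value, 50
```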