Setup:

I have a boolean matrix, e.g.:

1  0
1  1
0  1

where rows are named m1, m2, m3 (as methods) and columns t1 and t2 (as tests).

Definition:

An explanation is a set of rows (methods) whose union has at least one "1" in every column (every test has to be matched by at least one method).

In our example, the set of explanations would be:

{
  {m2}, 
  {m1,m2},{m1,m3},{m2,m3},
  {m1,m2,m3}
}
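
To make the definition concrete, here is a minimal brute-force sketch in Python (the dictionary encoding of the matrix is my own illustration, not part of the original setup); it enumerates exactly the five explanations listed above:

    from itertools import combinations

    matrix = {"m1": [1, 0], "m2": [1, 1], "m3": [0, 1]}   # rows m1..m3, columns t1, t2
    tests = range(2)

    def is_explanation(rows):
        # every test (column) must have a 1 in at least one chosen row
        return all(any(matrix[m][t] for m in rows) for t in tests)

    explanations = [set(c)
                    for size in range(1, len(matrix) + 1)
                    for c in combinations(matrix, size)
                    if is_explanation(c)]

    print(explanations)   # the five explanations from the example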

Problem:

I now want to compute all explanations.

I already have two implementations that solve the problem, one searching for explanations top-down, the other bottom-up, but both suffer from exponential growth in computing time (it doubles when the number of rows increases by one).

Is this a known (maybe solved efficiently) mathematical problem?

What could make things easier: in the end I only need, for every method, the number of explanations it occurs in. In our example that would be three occurrences for m1, four for m2 and three for m3.

My current algorithms work fine up to roughly 26 rows, but beyond that they become very slow.

Thank you for your help!

A: 

I do not know whether the number of possible explanations can be exponential (if it can, you cannot enumerate them faster than in exponential time).

However, you could approach this in a dynamic-programming style in order to eliminate duplicate effort (a Python sketch of this search follows the outline):

  • The first level is a list with a single element: the set of all methods.
  • Loop for each level:
    • Loop for each set in this level:
      • If this set is an explanation (ORing the test rows of its methods together gives all 1s):
        • put it into the result list
        • create all subsets of this set that have exactly one method less and put them into the next level
    • Remove all duplicates from the next level.
  • Repeat until the next level is empty.
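
A minimal sketch of this level-by-level search in Python (the frozenset representation is my choice; it relies on the fact that every superset of an explanation is again an explanation, so nothing is lost by only expanding explanations):

    def all_explanations(matrix, tests):
        def is_explanation(rows):
            return all(any(matrix[m][t] for m in rows) for t in tests)

        result = []
        level = {frozenset(matrix)}            # first level: the set of all methods
        while level:
            next_level = set()                 # a set, so duplicates collapse automatically
            for s in level:
                if is_explanation(s):
                    result.append(s)
                    # all subsets with exactly one method removed
                    next_level.update(s - {m} for m in s)
            level = next_level
        return result

    matrix = {"m1": [1, 0], "m2": [1, 1], "m3": [0, 1]}
    print(all_explanations(matrix, range(2)))  # the 5 explanations of the example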
Svante
+1  A: 

If you can settle for approximate probabilities and want something scalable, Gibbs sampling might work. The basic idea is pretty simple: start with the all-rows explanation and repeat the following to sample a bunch of explanations.

  1. Choose a random row.
  2. Flip a coin.
  3. If the coin came up heads, add the row to the explanation (do nothing if it's already there).
  4. If the coin came up tails, attempt to remove the row from the explanation. If the result is not an explanation, put the row back.

In the limit, the fraction of samples containing a given row converges to the fraction of explanations that contain it. There are some practical implementations under the keywords "Bayesian inference using Gibbs sampling" (you have a uniform prior and observe that, for each column, the disjunction of the rows incident with it is true). Since I'm not an expert in this stuff, though, I can't advise you on the hazards of rolling your own.
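
As an illustration, here is a rough sketch of such a sampler in Python (the sample and burn-in counts are arbitrary choices of mine, not tuned values); it estimates, for each row, the fraction of explanations containing it:

    import random

    def gibbs_fractions(matrix, tests, samples=100000, burn_in=10000):
        rows = list(matrix)

        def is_explanation(state):
            return all(any(matrix[m][t] for m in state) for t in tests)

        state = set(rows)                      # start with the all-rows explanation
        counts = {m: 0 for m in rows}
        for step in range(burn_in + samples):
            m = random.choice(rows)            # 1. choose a random row
            if random.random() < 0.5:          # 2./3. heads: make sure the row is in
                state.add(m)
            elif m in state:                   # 4. tails: try to drop it ...
                state.discard(m)
                if not is_explanation(state):  # ... and put it back if coverage breaks
                    state.add(m)
            if step >= burn_in:
                for r in state:
                    counts[r] += 1
        return {m: counts[m] / samples for m in rows}

    matrix = {"m1": [1, 0], "m2": [1, 1], "m3": [0, 1]}
    print(gibbs_fractions(matrix, range(2)))   # roughly m1: 0.6, m2: 0.8, m3: 0.6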

+1  A: 

I think this is likely to be an exponential problem. For example, if one of the methods has a one in every column, then any subset of methods containing that method is an explanation, so if there are M methods there are at least 2^(M-1) explanations; similarly, if some pair of methods together has a one in every column, then there are at least 2^(M-2) explanations.

Here is a method that, while still exponential, I think is faster than enumerating all the explanations, particularly when there are methods with many 1s.

Let T(A,B) be the number of subsets of A (a set of methods) which have at least one 1 in each column in B (a set of columns).

If B is empty, T(A,B) is the number of subsets of A, i.e. 2^#A, where #A is the number of elements of A. Otherwise, if A is empty, T(A,B) is 0. Otherwise, if i is an element of A (e.g. the first one),

T(A, B) = T(A \ {i}, B \ m[i]) + T(A \ {i}, B)

(here A \ {i} is A without i, and B \ m[i] is B without any of the columns covered by method i; the first term counts the subsets that contain i, the second those that do not)

T can be coded quite succinctly as a recursive function.

Finally c[j], the number of times method j occurs in an explanation, is

c[j] = T(A \ {j}, C \ m[j])

where C is the set of all columns.
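
A short sketch of this recursion in Python (the frozenset encoding and the memoisation via lru_cache are additions of mine; the answer only calls for a recursive function). It reproduces the counts 3, 4, 3 for the example:

    from functools import lru_cache

    m = {"m1": frozenset({"t1"}),              # m[i]: the columns in which method i has a 1
         "m2": frozenset({"t1", "t2"}),
         "m3": frozenset({"t2"})}
    C = frozenset({"t1", "t2"})                # the set of all columns

    @lru_cache(maxsize=None)
    def T(A, B):
        # number of subsets of A with at least one 1 in every column of B
        if not B:
            return 2 ** len(A)                 # every subset of A qualifies
        if not A:
            return 0
        i = next(iter(A))                      # pick some method i in A
        # subsets containing i              +  subsets not containing i
        return T(A - {i}, B - m[i]) + T(A - {i}, B)

    A = frozenset(m)
    print({j: T(A - {j}, C - m[j]) for j in m})   # {'m1': 3, 'm2': 4, 'm3': 3}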

dmuir