ansaurus

Question

Answer 1

+6 A:

There is a straightforward algorithm for picking an item at random, where items have individual weights:

1) calculate the sum of all the weights

2) pick a random number between 0 and less than the sum weights

3) go through the items in order, subtracting their weight from your random number, until you get the item where the random number is less than that item's weight

Pseudo-code illustrating this:

int sum_of_weight = 0;
for(int i=0; i<num_choices; i++) {
   sum_of_weight += choice_weight[i];
}
int rnd = random(sum_of_weight);
for(int i=0; i<num_choices; i++) {
  if(rnd < choice_weight[i])
    return i;
  rnd -= choice_weight[i];
}
assert(!"should never get here");

This should be straightforward to adapt to your boost containers and such.

If you do not know how many items in the list, then there's a very neat algorithm called reservoir sampling that can be adapted to be weighted.

Will 2009-11-19 08:00:14

As an optimization you could use cumulative weights and use a binary search. But for only three different values this is probably overkill.

sellibitze 2009-11-19 10:02:32

Answer 2

A:

Choose a random number on [0,1), which should be the default operator() for a boost RNG. Choose the item with cumulative probability density function >= that number:

template <class It,class P>
It choose_p(It begin,It end,P const& p)
{
    if (begin==end) return end;
    double sum=0.;
    for (It i=begin;i!=end;++i)
        sum+=p(*i);
    double choice=sum*random01();
    for (It i=begin;;) {
        choice -= p(*i);
        It r=i;
        ++i;
        if (choice<0 || i==end) return r;
    }
    return begin; //unreachable
}

Where random01() returns a double >=0 and <1. Note that the above doesn't require the probabilities to sum to 1; it normalizes them for you.

p is just a function assigning a probability to an item in the collection [begin,end). You can omit it (or use an identity) if you just have a sequence of probabilities.

wrang-wrang 2009-11-19 08:05:29

Answer 3

A:

A really naive way would be to fill an array of 100 numbers and put 90 1's, 6 2's and 4 3's then random(100) and use that number as an index for the array. To make it more interesting you can "shuffle" the array and see if that helps.

dasickis 2009-11-19 08:10:25

Answer 4

+1 A:

Build a bag (or std::vector) of all the items that can be picked.
Make sure that the number of each items is proportional to your weighting.

Example:

1 60%
2 35%
3 5%

So have a bag with 100 items with 60 1's, 35 2's and 5 3's.
Now randomly sort the bag (std::random_shuffle)

Pick elements from the bag sequentially until it is empty.
Once empty re-randomize the bag and start again.

Martin York 2009-11-19 10:48:43

if you have a bag of red and blue marbles and you select a red marble from it and _don't_ replace it is the probability of selecting another red marble still the same? In the same way, your statement "Pick elements from the bag sequentially until it is empty" produces a totally different distribution than intended.

ldog 2010-09-23 18:14:23

@ldog: I understand your argument but we are not looking for true randomness we are looking for a particular distribution. This technique guarantees the correct distribution.

Martin York 2010-09-23 19:32:09

my point exactly is that you do not correctly produce distribution, by my previous argument. consider the simple counter example, say you put you have an array of 3 as `1,2,2` producing 1 1/3 of the time and 2 2/3. Randomize the array, pick the first, lets say a 2, now the next element you pick follows the distribution of 1 1/2 the time and 2 1/2 the time. Savvy?

ldog 2010-09-23 22:55:36

ansaurus

tags:

views:

answers:

Weighted random numbers

related questions