views:

76

answers:

1

I have a scenario where I need to show a different page to a user for the same url based on a probability distribution,

so for e.g. for 3 pages the distribution might be

page 1 - 30% of all users
page 2 - 50% of all users
page 3 - 20% of all users

When deciding what page to load for a given user, what technique can I use to ensure that the overall distribution matches the above?

I am thinking I need a way to choose an object at "random" from a set X { x1, x2....xn } except that instead of all objects being equally likely the probability of an object being selected is defined beforehand.


Thanks for the input everyone, after doing some prototyping, this is what I ended up using

private static int RandomIndexWithPercentage(Random random, int[] percentages) {
    if (random == null) {
        throw new ArgumentNullException("random");
    }

    if (percentages == null || percentages.Length == 0) {
        throw new ArgumentException("percentages cannot be null or empty", "percentages");
    }

    if(percentages.Sum() != 100) {
        throw new ArgumentException("percentages should sum upto 100");
    }

    if (percentages.Any(n => n < 0)) {
        throw new ArgumentException("percentages should be non-negative");
    }

    var randomNumber = random.Next(100);
    var sum = 0;
    for (int i = 0; i < percentages.Length; ++i) {
        sum += percentages[i];
        if (sum > randomNumber) {
            return i;
        }
    }

    //This should not be reached, because randomNumber < 100 and sum will hit 100 eventually
    throw new Exception("Unexpected");
} 
+5  A: 

Generate a number 0-9. If the number is less than 3, give them page one. If it's less than 8, give them page two, or else give them page three.


Some code, to get you started:

private int ChoosePage()
{
    int[] weights = new int[] { 3, 5, 2 };
    int sum = 0;
    int i;
    for (i = 0; i < weights.Length; i++)
        sum += weights[i];
    int selection = (new Random()).Next(sum);
    int count = 0;
    for (i = 0; i < weights.Length - 1; i++)
    {
        count += weights[i];
        if (selection < count)
            return i;
    }
    return weights.Length - 1;
}

Note that weights doesn't have to add up to anything in particular. If sum = 100, then weight[i] is th percent chance of getting page i. If it doesn't, however, it's just relative - if weight[i] is twice weight[j], then page i will get twice as many hits as page j. This is nice because you can arbitrarily increase or decrease page traffic without recalculating anything. Alternatively, you could make sure the sum is always N, and hardcode N in, rather than summing all the values every time. There are a lot more optimizations you could do, I'm sure.

Daniel Rasmussen
I think number between 0-100 would be better to take care of percentages like 19 which would end up in decimal land if i try to translate to the 0-10 range.
Sijin
See my edit - all percentages are converted to weights, anyways, so whether it's out of 10 or 100 is irrelevant, if that's how you choose to do it. (If you need %s, then yes - stick with 100.)
Daniel Rasmussen
Also note you shouldn't do `(new Random())` for every random number - it should probably be initialized once then stored in a class variable somewhere.
Daniel Rasmussen
Yup, I saw the issue with (new Random()) when testing, so I take that as a parameter to the function now and initialize that at a global level for the entire run. Thanks.
Sijin