ansaurus

Question

How can I fairly choose an item from a list?

Answer 1

+3 A:

Pseudocode for special random number generator:

rng is random number generator produces uniform integers from [0, max)
compute m = max modulo length of attendee list
do {
    draw a random number r from rng
} while(r >= max - m)
return r modulo length of attendee list

This eliminates the bias to the front part of the list. Then

put the attendees in some data structure indexable by integers
for every prize in the prize list
draw a random number r using above
compute index = r modulo length of attendee list
return the attendee at index

In C#:

public NextUnbiased(Random rg, int max) {
    do {
        int r = rg.Next();
    } while(r >= Int32.MaxValue - (Int32.MaxValue % max));
    return r % max;
}

public Attendee SelectWinner(IList<Attendee> attendees, Random rg) {    
    int winningAttendeeIndex = NextUnbiased(rg, attendees.Length)
    return attendees[winningAttendeeIndex];
}

Then:

// attendees is list of attendees
// rg is Random
foreach(Prize prize in prizes) {
    Attendee winner = SelectWinner(attendees, rg);
    Console.WriteLine("Prize {0} won by {1}", prize.ToString(), winner.ToString());
}

Jason 2009-12-17 03:19:33

Does the same function exist for the PSRNG class?

John Gietzen 2009-12-17 03:21:24

Also, this is tagged language-agnostic. ;)

John Gietzen 2009-12-17 03:21:56

PSRNG class? Addressed the language-agnostic issue.

Jason 2009-12-17 03:26:51

Sorry, CSPRNG. Like `RNGCryptoServiceProvider`.

John Gietzen 2009-12-17 03:39:00

No, `RNGCryptoServiceProvider` does not have a method `Next`. But you can easily create one. Draw four random bytes (`GetBytes`) from your `RNGCryptoServiceProvider` and smash them together into an `int`.

Jason 2009-12-17 03:42:11

Right, like using the `BitConverter.ToInt32(bytes)` I'm still concerned that there is a bias using the modulus, tho.

John Gietzen 2009-12-17 03:46:30

Oh, i see what you did with the modulus, scraping off the top set of the numbers...

John Gietzen 2009-12-17 03:48:18

Yes, exactly. I eliminated the bias by chopping off the part of the distribution that introduces the bias.

Jason 2009-12-17 03:48:34

Roger, using that method.

John Gietzen 2009-12-17 03:48:50

@John Gietzen: My pleasure.

Jason 2009-12-17 03:49:51

It feels like a waste to throw away perfectly good entropy like that, tho.

John Gietzen 2009-12-17 04:22:05

this method appears to allow 1 user to win all the prizes which may or may not be desirable. Of course you can always remove the winner after prize draw or just shuffle the attendees and draw the first N

jk 2009-12-17 10:32:32

As discussed elsewhere on this post, any PRNG (rather than true RNG) will have a period smaller than the number of possible assignments of gifts. So there's still a bias, though small and harder to analyze than the usual one from just using modular arithmetic. As you said below, 'A "very minuscule" bias will not be good enough for my users.'

thouis 2009-12-17 12:01:38

More on this: http://en.wikipedia.org/wiki/Shuffling#Pseudo-random_shuffles

thouis 2009-12-17 12:06:21

There are RNGs with periods large enough to avoid the problems that you describe. For example, variants of the Mersenne Twister have periods as large as 2^216901 - 1.

Jason 2009-12-17 12:54:05

@jk, I am using that method. Actually, it will be a configurable option. @thouis, I'm using the RNGCryptoServiceProvider which is a high-quality, non-repeating random number source.

John Gietzen 2009-12-17 14:29:50

Answer 2

A:

I suppose one answer would be to assign each item a random value, and take the largest or smallest, drilling down as necessary.

I'm not sure if this is the most efficient, tho...

John Gietzen 2009-12-17 03:20:13

Answer 3

A:

If you're using a good number generator, even with a modulus your bias will be miniscule. If, for instance, you're using a random number generator with 64 bits of entropy and five users, your bias toward the front of the array should be on the order of 3x10^-19 (my numbers may be off, by I don't think by much). That's an extra 3-in-10-quintillion likelihood of the first user winning compared to the later users. That should be good enough to be fair in anyone's book.

Dathan 2009-12-17 03:23:48

A "very minuscule" bias will not be good enough for my users.

John Gietzen 2009-12-17 03:25:48

Answer 4

+1 A:

You can buy truly random bits from a provider, or use a mechanical device.

Hamish Grubijan 2009-12-17 03:32:54

Answer 5

A:

Here you will find Oleg Kiselyov's discussion of purely functional random shuffling.

A description of the linked content (quoted from the beginning of that article):

This article will give two pure functional programs that perfectly, randomly and uniformly shuffle a sequence of arbitrary elements. We prove that the algorithms are correct. The algorithms are implemented in Haskell and can trivially be re-written into other (functional) languages. We also discuss why a commonly used sort-based shuffle algorithm falls short of perfect shuffling.

You could use that to shuffle your list and then pick the first item of the shuffled result (or maybe you'd prefer not to give two prizes two the same person -- then use n initial positions of the result, for n = number of prizes); or you could simplify the algorithm to just produce the first item; or you could take a look around that site, because I could have sworn there's an article on picking one random element from an arbitrary tree-like structure with uniform distribution, in a purely functional way, proof of correctness provided, but my search-fu is failing me and I can't seem to find it.

Michał Marczyk 2009-12-17 03:38:25

Answer 6

+1 A:

Assuming a fairly distributed random number generator...

do {
    i = rand();
} while (i >= RAND_MAX / 5 * 5);
i /= 5;

This gives each of 5 slots

[ 0 .. RAND_MAX / 5 )
[ RAND_MAX / 5 .. RAND_MAX / 5 * 2 )
[ RAND_MAX / 5 * 2 .. RAND_MAX / 5 * 3 )
[ RAND_MAX / 5 * 3 .. RAND_MAX / 5 * 4 )
[ RAND_MAX / 5 * 4 .. RAND_MAX / 5 * 5 )

and discards a roll which falls out of range.

ephemient 2009-12-17 03:39:28

Answer 7

A:

Without truly random bits, you will always have some bias. The number of ways to assign prizes to guests is much larger than any common PRNG's period for even a fairly low number of guests and prizes. As suggested by lpthnc, buy some truly random bits, or buy some random-bit-generating hardware.

As for the algorithm, just do a random shuffle of the guest list. Be careful, as naive shuffling algorithms do have a bias: http://en.wikipedia.org/wiki/Shuffling#Shuffling%5Falgorithms

thouis 2009-12-17 03:39:51

Truly random - has to be truly random and UNIFORM, not Gaussian.

Hamish Grubijan 2009-12-17 03:48:25

I'm not sure what you're referring to. Truly random bits are not Gaussian. http://en.wikipedia.org/wiki/Hardware_random_number_generator

thouis 2009-12-17 11:57:08

Answer 8

A:

You have already seem several perfectly good answers that depend on knowing the length of the list in advance.

To fairly select a single item from a list without needing to know the length of the list in the first place do this:

 if (list.empty()) error_out_somehow
 r=list.first()          // r is a reference or pointer
 s=list.first()          // so is s
 i = 2
 while (r.next() is not NULL)
    r=r.next()
    if (random(i)==0) s=r  // random() returns a uniformly
                           // drawn integer between 0 and i
    i++
 return s

(Useful if you list is stored as a linked list)

To distribute prizes in this scenario, just walk down the list of prizes selecting a random winner for each one. (If you want to prevent double winning you then remove the winner from the participant list.)

Why does it work?

You start with the first item at 1/1
On the next pass, you select the second item half the time (1/2), which means that the first item has probability 1 * (2-1)/2 = 1/2
on further iteration, you select the nth item with probability 1/n, and the chance for each previous item is reduced by a factor of (n-1)/n

which means that when you come to the end, the chance of having the mth item in the list (of n items) is

1/m * m/(m+1) * (m+1)/(m+2) * ... * (n-2)/(n-1) * (n-1)/n = 1/n

and is the same for every item.

If you are paying attention, you'll note that this means walking the whole list every time you want to select an item from the list, so this is not maximally efficient for (say) reordering the whole list (though it does that fairly).

dmckee 2009-12-17 04:24:46

ansaurus

tags:

views:

answers:

How can I fairly choose an item from a list?

related questions