ansaurus

Question

Answer 1

+4 A:

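import random

def weighted_choice(choices):
   # choices is a sequence of (value, weight) pairs; it is traversed twice
   total = sum(w for c, w in choices)
   r = random.uniform(0, total)
   upto = 0
   for c, w in choices:
      # return the first value whose cumulative weight exceeds r
      if upto + w > r:
         return c
      upto += w
   assert False, "Shouldn't get here"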

Ned Batchelder 2010-09-09 19:08:40

I don't know why I thought I had to sort the weights and go through them in order...this is better.

Colin 2010-09-09 19:22:36
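
When many draws are made from the same weights, the linear scan in Answer 1 could be replaced by a binary search over precomputed cumulative weights. The following is only a sketch of that idea, assuming Python 3 and a non-empty list with a positive total weight; the name make_weighted_chooser is hypothetical and not from the thread. No sorting of the weights is needed, since the running totals are already non-decreasing.

```python
import bisect
import itertools
import random

def make_weighted_chooser(choices):
    """Precompute cumulative weights once and return a zero-argument chooser.

    Assumes choices is a non-empty list of (value, weight) pairs with
    non-negative weights and a positive total.
    """
    items = [c for c, w in choices]
    cumulative = list(itertools.accumulate(w for c, w in choices))
    total = cumulative[-1]

    def choose():
        r = random.random() * total
        # index of the first cumulative weight strictly greater than r
        i = bisect.bisect_right(cumulative, r)
        # clamp in case floating-point rounding pushes r up to total
        return items[min(i, len(items) - 1)]

    return choose
```

Calling choose = make_weighted_chooser([("WHITE", 90), ("RED", 8), ("GREEN", 2)]) once and then choose() repeatedly costs O(log n) per draw instead of O(n).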

Answer 2

+2 A:

Crude, but may be sufficient:

import random
weighted_choice = lambda s : random.choice(sum(([v]*wt for v,wt in s),[]))

Does it work?

# define choices and relative weights
choices = [("WHITE",90), ("RED",8), ("GREEN",2)]

# initialize tally dict
tally = dict((c[0],0) for c in choices)

# tally up 1000 weighted choices
for i in xrange(1000):
    tally[weighted_choice(choices)] += 1

print tally.items()

Prints:

[('WHITE', 904), ('GREEN', 22), ('RED', 74)]

Assumes that all weights are integers. They don't have to add up to 100; I just did that to make the test results easier to interpret.

Paul McGuire 2010-09-09 19:13:04

Nice, I'm not sure I can assume all weights are integers, though.

Colin 2010-09-09 19:21:28
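
If the weights are not integers, expanding the list as in Answer 2 no longer works. On Python 3.6 or later, the standard library's random.choices accepts float weights directly. This is a minimal sketch, assuming that newer Python version (the thread itself predates it) and a non-empty input with a positive total weight:

```python
import random

def weighted_choice(choices):
    """Pick one value from (value, weight) pairs; weights may be floats."""
    # split the pairs into parallel tuples of values and weights
    values, weights = zip(*choices)
    # random.choices returns a list of k picks; take the single element
    return random.choices(values, weights=weights, k=1)[0]
```

For example, weighted_choice([("WHITE", 0.9), ("RED", 0.08), ("GREEN", 0.02)]) returns "WHITE" roughly nine times out of ten.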

Answer 3

+1 A:

I'd require the weights to sum to 1, but this works either way:

import random

def weightedChoice(choices):
    # Safety check, you can remove it
    for c, w in choices:
        assert w >= 0

    # Sum the weights (second tuple element), not the choices themselves
    tmp = random.uniform(0, sum(w for c, w in choices))
    for choice, weight in choices:
        if tmp < weight:
            return choice
        else:
            tmp -= weight
    raise ValueError('Negative values in input')

phihag 2010-09-09 19:14:47

Out of curiosity, is there a reason you prefer random.random() * total instead of random.uniform(0, total)?

Colin 2010-09-09 19:23:51

@Colin No, not at all. Updated.

phihag 2010-09-09 19:27:16

You traverse the iterable three times. Not every iterable supports that.

liori 2010-09-09 19:30:29

That's a good point. I've only been passing in lists of tuples, so I hadn't uncovered that bug yet.

Colin 2010-09-09 20:02:46

@liori You're right. However, weightedChoice cannot be computed without storing all the items of the iterable in a list anyway, so the input should be a list.

phihag 2010-09-09 20:23:37

I think it is actually possible. http://utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf It is actually pretty simple... But who cares...

liori 2010-09-09 20:56:21

@liori I do care, and you're right: weightedChoice *can* be computed with one iterator pass only. However, this seems to require more than 1 call to the pseudo random generator.

phihag 2010-09-10 14:04:11
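
The single-pass idea from the last two comments can be sketched as follows. This is not necessarily the method in the linked paper; it is a simple running-total variant in which each item replaces the current pick with probability weight / (total so far). The function name weighted_choice_one_pass is hypothetical. As phihag notes, it calls the random generator roughly once per item rather than once overall.

```python
import random

def weighted_choice_one_pass(pairs):
    """Pick one value from an iterable of (value, weight) pairs in one pass.

    Works on generators, since each pair is consumed exactly once.
    """
    total = 0
    chosen = None
    found = False
    for value, weight in pairs:
        if weight < 0:
            raise ValueError("negative weight in input")
        if weight == 0:
            continue  # zero-weight items can never be picked
        total += weight
        # the first positive-weight item is always taken; each later item
        # replaces the current pick with probability weight / total
        if not found or random.random() * total < weight:
            chosen = value
            found = True
    if not found:
        raise ValueError("no positive weights in input")
    return chosen
```

Because nothing is stored beyond the running total and the current pick, the input can be a generator such as ((c, w) for c, w in rows) without first building a list.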


A weighted version of random.choice
