Have a class which keeps the item:weight pairs (key=item:value=weight) in a hash table.
The class should also maintain a total_weight
variable, which is the sum of all the weights in the hash table. The class' methods to add_item
, remove_item
, and update_weight
for an item should keep the total_weight updated. This avoids having to recalculate the total for every choice.
To choose an item:
Use a random number such that 1<=random_number<=total_weight
.
Iterate over the item:weight pairs in the hash table, summing the weights until the random number is <= that running sum. When that happens, the key of the pair you're on is the chosen item.
This is like rolling an imaginary die whose size is the sum of all the weights. For every roll, each item has its own range of numbers on the die, with the size of each range equal to its item's weight. If the roll result falls within an item's range, that item is the chosen one.
Editing to add the following sample code after the request in the comment below. Tested this with Python 2.5.2:
from random import randint # Import randint function from random module.
class WeightedCollection(object):
def __init__(self):
self.total_weight = 0
self.items = {} # This is a python dictionary == a hash table
def add_item(self, item, weight):
self.items[item] = weight
self.total_weight += weight
def remove_item(self, item):
self.total_weight -= self.items[item] # Subtracts the weight.
del(self.items[item])
def update_weight(self, item, new_weight):
self.total_weight += (new_weight - self.items[item])
self.items[item] = new_weight
def get_random_item(self):
''' Returns random selection but weighted by item weights. '''
# Result of call below is 1 <= random_number <= self.total_weight...
random_number = randint(1, self.total_weight)
sum_so_far = 0
# For every item and its weight...
for item, weight in self.items.iteritems():
sum_so_far += weight
if random_number <= sum_so_far:
return item
# Usage demo...
questions = WeightedCollection()
questions.add_item('What is your name?', 1)
questions.add_item('What is your favorite color?', 50)
questions.add_item('What is the meaning to life?', 100)
print 'Here is what the dictionary looks like:'
print questions.items
print ''
print "Total weight:", questions.total_weight
print ''
print 'Some sample random picks...'
for i in range(5):
print questions.get_random_item()
And here is the output:
Here is what the dictionary looks like:
{'What is the meaning to life?': 100, 'What is your name?': 1, 'What is your favorite color?': 50}
Total weight: 151
Some sample random picks...
What is your favorite color?
What is the meaning to life?
What is the meaning to life?
What is your favorite color?
What is the meaning to life?