Here we will engineer a random number generator whose distribution favors low values. You can use it to prefer items at the beginning of a list: to decrease the odds of an item being selected, move it down the list. There are a few options for how to move an item down the list. Let's review the random variable transformation first.
By applying the following function to a uniform random variable between 0 and 1:
index = (l * (1 - r**0.5)).to_i # l = list length, r = uniform random value in [0, 1)
You get a nice distribution that drastically reduces the odds of a larger index. Here is what it looks like for a list of size 20:
p(0)=0.09751
p(1)=0.09246
p(2)=0.08769
p(3)=0.08211
p(4)=0.07636
p(5)=0.07325
p(6)=0.06772
p(7)=0.06309
p(8)=0.05813
p(9)=0.05274
p(10)=0.04808
p(11)=0.04205
p(12)=0.03691
p(13)=0.03268
p(14)=0.02708
p(15)=0.02292
p(16)=0.01727
p(17)=0.01211
p(18)=0.00736
p(19)=0.00249
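To sanity-check those numbers, here is a minimal Ruby sketch of the transform (pick_index is a hypothetical helper name, not from any library):

def pick_index(length)
  r = rand                           # uniform random value in [0, 1)
  i = (length * (1 - r**0.5)).to_i   # bias toward low indexes
  [i, length - 1].min                # guard the rare r == 0.0 edge case
end

# Empirically estimate the distribution for a list of size 20.
trials = 100_000
counts = Hash.new(0)
trials.times { counts[pick_index(20)] += 1 }
(0...20).each { |i| puts format("p(%d)=%.5f", i, counts[i].to_f / trials) }

Changing the 20 reproduces the smaller tables below.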
Here is the distribution for a list of size 2:
p(0)=0.75139
p(1)=0.24862
Size 3:
p(0)=0.55699
p(1)=0.33306
p(2)=0.10996
Size 4:
p(0)=0.43916
p(1)=0.31018
p(2)=0.18836
p(3)=0.06231
Now let's discuss the options for moving items down the list. I tested two methods, ToEnd and Sort.
I created a simulation that picks from a list and examines the standard deviation of the count of times each item was selected; the lower the standard deviation, the better. For example, one simulation for a list of 10 items, where 50 selections were made, produced this spread:
{"a"=>5, "b"=>5, "c"=>6, "d"=>5, "e"=>4, "f"=>4, "g"=>5, "h"=>5, "i"=>6, "j"=>5}
The standard deviation for this simulation was 0.63.
With the ability to run a simulation, I then computed some meta-statistics by running the simulation 500 times and averaging the standard deviation for each method, ToEnd and Sort. I expected the standard deviation to be high for a low number of picks, but in fact, for the ToEnd algorithm, the standard deviation increased with the number of picks. The Sort method fixed this.
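Here is a sketch of the harness, under my reading of the two methods: ToEnd moves the picked item to the end of the list, and Sort re-sorts the list so the least-picked items come first (simulate and std_dev are illustrative names; pick_index is the helper from above):

def std_dev(values)
  mean = values.sum.to_f / values.length
  Math.sqrt(values.sum { |v| (v - mean)**2 } / values.length)
end

def simulate(items, picks, method)
  list   = items.dup
  counts = Hash.new(0)
  picks.times do
    item = list[pick_index(list.length)]
    counts[item] += 1
    case method
    when :to_end then list.push(list.delete(item))    # demote picked item to the back
    when :sort   then list.sort_by! { |x| counts[x] } # least-picked items first
    end
  end
  std_dev(counts.values_at(*items))  # Hash.new(0) fills in zero counts
end

# Meta-statistic: average standard deviation over 500 simulations per method.
items = %w[a b c d e]
[5, 10, 15, 20, 25].each do |picks|
  to_end, sort = [:to_end, :sort].map do |method|
    (1..500).map { simulate(items, picks, method) }.sum / 500
  end
  puts format("%-8d %-16.2f %.2f", picks, to_end, sort)
end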
Testing ["a", "b", "c", "d", "e"]
-------------------------
Picks ToEnd (StdDev) Sort (StdDev)
5 0.59 0.57
10 0.76 0.68
15 0.93 0.74
20 1.04 0.74
25 1.20 0.73
30 1.28 0.73
35 1.34 0.74
40 1.50 0.75
45 1.53 0.75
45 1.56 0.77
80 2.04 0.79
125 2.52 0.74
180 3.11 0.77
245 3.53 0.79
320 4.05 0.75
405 4.52 0.76
500 5.00 0.78
Here are some test results for a larger set.
Testing ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
-------------------------
Picks ToEnd (StdDev) Sort (StdDev)
10 0.68 0.65
20 0.87 0.77
30 1.04 0.80
40 1.18 0.82
50 1.30 0.85
60 1.43 0.84
70 1.57 0.87
80 1.65 0.88
90 1.73 0.87
90 1.71 0.87
160 2.30 0.89
250 2.83 0.88
360 3.44 0.88
490 3.89 0.88
640 4.54 0.87
810 5.12 0.88
1000 5.66 0.85
With a good test framework in place, I couldn't resist trying a different random number transformation. My assumption was that if I took the cube root of r instead of the square root, the standard deviation would decrease. It did, but I worried this would also decrease the randomness. Here you can observe a few simulations when the formula is changed to:
index = (l * (1 - r**0.33)).to_i # l = list length, r = uniform random value in [0, 1)
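The transform generalizes if you want to experiment with other exponents; a quick sketch (the exponent parameter is an addition here, and 0.33 only approximates a true cube root of 1/3):

def pick_index(length, exponent = 0.5)
  r = rand                                # uniform random value in [0, 1)
  i = (length * (1 - r**exponent)).to_i   # smaller exponent = stronger bias to the front
  [i, length - 1].min
end

pick_index(5, 0.33) # the cube-root variant used in the runs below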
Now inspect the actual picks. As I suspected, it's heavily weighted toward the beginning of the list at first. If you are going to weight this heavily, you should probably randomize your list before you start (a one-line shuffle is sketched after these runs).
StdDev = 0.632455532033676
{"a"=>10, "b"=>10, "c"=>11, "d"=>9, "e"=>10}
a d e c b c e b a d b e c a d d e b a e e c c b d a d c b c e b a a d d b e a e a b c b d c a c e c
StdDev = 0.0
{"a"=>10, "b"=>10, "c"=>10, "d"=>10, "e"=>10}
b c d a a d b c b a d e c d e c b e b a e e d c c a b a d a e e b d b a e c c e b a c c d d d a b e
StdDev = 0.632455532033676
{"a"=>9, "b"=>10, "c"=>10, "d"=>10, "e"=>11}
b d a e b c a d c e e b a c a d d c b c e d a e b b a c d c d a e a e e b d c b e a b c b c d d e e
StdDev = 0.0
{"a"=>10, "b"=>10, "c"=>10, "d"=>10, "e"=>10}
a b e d c e a b d c b c c a d b e e b e a d d c e a d b b c c a a a b e d d e c c a b b e a c d e d
StdDev = 0.632455532033676
{"a"=>11, "b"=>10, "c"=>9, "d"=>10, "e"=>10}
b a d c d c a e b e a e b c d b c a a d e e d c d e c b a b b e d c d b c e a a a d b c e b e a d a
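As noted above, a one-line shuffle before the first pick counters that early front-loading; a minimal sketch:

items = %w[a b c d e].shuffle   # randomize the starting order first
first_pick = items[pick_index(items.length, 0.33)]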