ansaurus

Question

Algorithm to generate 1000 distinct integers in the range [0,8000]?

Answer 1

+12 A:

You can use a partial Fisher-Yates shuffle implemented using swaps. One of the nice features of this algorithm is that if you stop after k swaps, the first k numbers are a random sample of size k from the complete set.

Mark Byers 2010-04-04 21:56:34

+1. I was going to suggest just shuffling the whole array; for some reason this obvious optimisation didn't occur to me.

Nick Johnson 2010-04-04 22:28:37

Problem is that you still need to create an array from [0,8000]. Can you create a permutation without the overhead? What if you wanted 10 unique numbers between 1..1,000,000?

Ray 2010-04-04 23:17:34

Technically you could make it a sparse array implemented as a dict instead, and any position which doesn't have a value set is simply taken to be equal to its index.

Amber 2010-04-04 23:23:06

@Dav: Good suggestion. This would save the O(n) construction of the list of integers from [0, 8000].

Mark Byers 2010-04-04 23:26:38

I don't recall mentioning anything about a bitarray?

Amber 2010-04-05 00:21:27

@Dav: Corrected. :P

andras 2010-04-05 19:24:26

@Dav: (great idea: was doing something along those lines for this and in general, k-permutations: http://stackoverflow.com/questions/2394246/algorithm-to-select-a-single-random-combination-of-values/2394308#2394308)

andras 2010-04-05 19:33:20

Answer 2

+2 A:

You could create a list containing the numbers 0 to 8000.

Then looping 1000 times generate a random number between 0 and the length of the list.

Remove that element from the list and add it to an output list.

By removing the element you ensure that your selections are unique.

while (outputList.Count < 1000)
{
    index = random.Next(0, inputList.Count);
    outputList.Add(inputList[index]);
    inputList.RemoveAt(index);
}

ChrisF 2010-04-04 22:07:04

This will run in O(n^2) since RemoveAt for a vector or array is O(n).

Dave Kirby 2010-04-04 23:15:45

Answer 3

+1 A:

This is from Knuth's the Art of Programming (via Jon Bentley's Programming Pearls), implemented in Python:

import random

# randomly select m numbers from n candidates    
def random_select(m, n):
    select = m
    result = []
    for i in xrange(n):
        if random.randint(0, n-i) < select:
            result.append(i)
            select -= 1
    return result

random_select(1000, 8000)

this will generate a list of random numbers in numerical order. It works by iterating over all the integers from 0-n (i.e 0-8000), and randomly selecting them with a probability of(number left to select / number of remaining candidates). It runs in O(n), so do not try it if n is very large compared to m - e.g. selecting ten numbers out of a billion. It uses no memory other than the result list (m) and a few local variables, unlike solutions that rely on shuffling a list of length n.

If you want the result in a random order then shuffle the list afterwards.

Dave Kirby 2010-04-04 22:51:54

The Pythonic variant: import random; random.sample(xrange(8000), 1000)

Ants Aasma 2010-04-05 11:42:12

Answer 4

+1 A:

Partial Fisher-Yates, as @Mark has suggested, with a little twist, storing the swaps along the way.
This way, it will at most consume as much memory as the result list O(m).
It will also run in O(m) - not O(n), as other solutions that enumerate the whole range - so it should not have problems on larger ranges.
This way, you can have the best of both worlds.

/// <summary>
/// Generates unique random numbers
/// <remarks>
/// Worst case memory usage is O(min((emax-imin)/2, num))
/// </remarks>
/// </summary>
/// <param name="random">Random source</param>
/// <param name="imin">Inclusive lower bound</param>
/// <param name="emax">Exclusive upper bound</param>
/// <param name="num">Number of integers to generate</param>
/// <returns>Sequence of unique random numbers</returns>
public static IEnumerable<int> UniqueRandoms(
    Random random, int imin, int emax, int num)
{
    int dictsize = num;
    long half = (emax - (long)imin + 1) / 2;
    if (half < dictsize)
        dictsize = (int)half;
    Dictionary<int, int> trans = new Dictionary<int, int>(dictsize);
    for (int i = 0; i < num; i++)
    {
        int current = imin + i;
        int r = random.Next(current, emax);
        int right;
        if (!trans.TryGetValue(r, out right))
        {
            right = r;
        }
        int left;
        if (trans.TryGetValue(current, out left))
        {
            trans.Remove(current);
        }
        else
        {
            left = current;
        }
        if (r > current)
        {
            trans[r] = left;
        }
        yield return right;
    }
}

andras 2010-04-04 22:53:30

Answer 5

A:

Sorted list with no sort, O(n)

If you want the integers sorted, I got to this answer in another question with a lot of help. You can do it using an exponential variate and thereby avoid any sort. As a result it is O(n):

From Alok's answer and Dan Dyer's comment it turns out that using an exponential distribution for a set of deltas gives a uniform distribution of integers in sequence.

So, you just start generating numbers and then scale them at the end. Adding 1 to the delta ensures you never repeat a value.

import random,sys,math

def genSortedInts(mini,maxi,vals):
    running = 0
    deltas = [random.expovariate(1.0) for i in range(0,vals+1)]
    floats = []
    for d in deltas:
        running += d
        floats.append(running)
    upper = floats.pop()
    valRange = maxi-mini-(vals-1)
    ints = [mini+int(f/upper*valRange)+id for id,f in enumerate(floats)]
    return ints

if __name__ == "__main__":
    vals = 10
    maxi = 80
    mini = 0
    print(genSortedInts(mini,maxi,vals))

Note the use of random.expovariate(1.0), a Python exponential distribution random number generator (very useful!). Here it's called with a mean of 1.0 (arg is 1/mean), but since the script normalises against the last number in the sequence, the mean itself doesn't matter.

Output (fair dice roll) for 10 values up to 80:

[3, 5, 10, 16, 25, 37, 41, 45, 57, 70]

Phil H 2010-04-07 14:13:30

ansaurus

tags:

views:

answers:

Algorithm to generate 1000 distinct integers in the range [0,8000]?

Sorted list with no sort, O(n)

related questions