views:

2802

answers:

9

The problem is this: I'd like to generate unique random numbers between 0 and 1000 that never repeat (I.E. 6 doesn't come out twice), but that doesn't resort to something like an O(N) search of previous values to do it. Is this possible?

+44  A: 

You can do this:

  1. Create a list, 0..1000.
  2. Shuffle the list. (See Fisher-Yates shuffle for a good way to do this.)
  3. Return numbers in order from the shuffled list.

So this doesn't require a search of old values each time, but it still requires O(N) for the initial shuffle. But as Nils pointed out in comments, this is amortized O(1).

Chris Jester-Young
If you use your shuffled list for the first 1000 queries and then re-shuffle you have amortized O(1).
Nils Pipenbrinck
Exactly! I'll update the entry to say this.
Chris Jester-Young
I disagree that it's amortized. The total algorithm is O(N) because of the shuffling. I guess you could say it's O(.001N) because each value only consumes 1/1000th of a O(N) shuffle, but that's still O(N) (albeit with a tiny coefficient).
Just Some Guy
@Just Some GuyN = 1000, so you are saying that it is O(N/N) which is O(1)
Guvante
If each insert into the shuffled array is an operation, then after inserting 1 value, you can get 1 random value. 2 for 2 values, and so on, n for n values. It takes n operations to generate the list, so the entire algorithm is O(n). If you need 1,000,000 random values, it will take 1,000,000 ops
Kibbee
Think about it this way, if it was constant time, it would take the same amount of time for 10 random numbers as it would for 10 billion. But due to the shuffling taking O(n), we know this is not true.
Kibbee
Any method which requires you to initialise an array of size N with values is going to have an O(N) initialisation cost; moving the shuffle to the initialise or doing one iteration of the shuffle per request doesn't make any difference.
Pete Kirkham
"It's O(1)... N times!"
Robert Fraser
This actually takes amortized time O(log n), since you need to generate n lg n random bits.
Charles
+1  A: 

Another posibility:

You can use an array of flags. And take the next one when it;s already chosen.

But, beware after 1000 calls, the function will never end so you must make a safeguard.

Gamecat
+58  A: 

Initialize an array of 1001 integers with the values 0-1000 and set a variable, max, to the current max index of the array (starting with 1000). Pick a random number, r, between 0 and max, swap the number at the position r with the number at position max and return the number now at position max. Decrement max by 1 and continue. When max is 0, set max back to the size of the array - 1 and start again without the need to reinitialize the array.

Update: Although I came up with this method on my own when I answered the question, after some research I realize this is a modified version of Fisher-Yates known as Durstenfeld-Fisher-Yates or Knuth-Fisher-Yates. Since the description may be a little difficult to follow, I have provided an example below (using 11 elements instead of 1001):

Array starts off with 11 elements initialized to array[n] = n, max starts off at 10:

+--+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|
+--+--+--+--+--+--+--+--+--+--+--+
                                ^
                               max

At each iteration, a random number r is selected between 0 and max, array[r] and array[max] are swapped, the new array[max] is returned, and max is decremented:

max = 10, r = 3
           +--------------------+
           v                    v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2|10| 4| 5| 6| 7| 8| 9| 3|
+--+--+--+--+--+--+--+--+--+--+--+

max = 9, r = 7
                       +-----+
                       v     v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 1| 2|10| 4| 5| 6| 9| 8| 7: 3|
+--+--+--+--+--+--+--+--+--+--+--+

max = 8, r = 1
     +--------------------+
     v                    v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 8| 2|10| 4| 5| 6| 9| 1: 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+

max = 7, r = 5
                 +-----+
                 v     v
+--+--+--+--+--+--+--+--+--+--+--+
| 0| 8| 2|10| 4| 9| 6| 5: 1| 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+

...

After 11 iterations, all numbers in the array have been selected, max == 0, and the array elements are shuffled:

+--+--+--+--+--+--+--+--+--+--+--+
| 4|10| 8| 6| 2| 0| 9| 5| 1| 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+

At this point, max can be reset to 10 and the process can continue.

Robert Gamble
Awesome! I knew there was a way... :)
dicroce
If the array doesn't start off shuffled (and not with the naive algorithm; see Jeff's post on shuffling), doesn't this introduce a subtle bias?
Mitch Wheat
No bias. This is just a spread-out version of the same algorithm.
recursive
@lk: Only Jon Skeet qualifies for "godlike"; this is just "excellently good".
Brent.Longborough
Jeff's post on shuffling suggests this will not return good random numbers..http://www.codinghorror.com/blog/archives/001015.html
pro
@Peter Rounce: I think not; this looks to me like the Fisher Yates algorithm, also quoted in Jeff's post (as the good guy).
Brent.Longborough
Sorry, Knuth-Fisher-Yates
Brent.Longborough
This answer is why stackoverflow is great!
dicroce
Perl6/Parrot will have lazy lists, that will only keep track of changed elements. Which means they won't have the initial hit that other systems would have.
Brad Gilbert
@Brad: I'm not really sure what you are talking about. What "initial hit" are you referring to and how will Perl6/Parrot avoid that exactly?
Robert Gamble
"I came up with this method on my own when I answered the question, after some research I realize this is a modified version of Fisher-Yates known as Durstenfeld-Fisher-Yates or Knuth-Fisher-Yates" Oh SNAP
Robert Fraser
The algorithm you describe takes O(n log n) time to generate the list, or O(log n) amortized time per element.
Charles
@Charles: How do you come up with that?
Robert Gamble
@Robert: Because you're generating a (lg n)-bit random number for each number on the list, and this takes Omega(log n) time.
Charles
@Charles: I think your logic is faulty, it may take O(n) time to generate the original list of numbers if you do not already have this set up but it takes O(1) time to generate each number from the list which is that was being asked for.
Robert Gamble
@Robert: This method does not generate the list in O(n), it takes O(n log n). I can't imagine that the request meant that we're allowed infinite pre-processing time, because in that case any method works -- just do whatever you need to do, then generate each item and store it in a table. It's clear to me that the time must include the upfront cost, which is log n per element. (Actually, the algorithm can be modified to generate each element in O(log n) time rather than O(log n) amortized time.)
Charles
@Charles: The initial list takes O(n) time to initialize (it takes twice as long to initialize 2 integers of the same size as it does initialize one such integer). The time that the random number generation takes depends on the implementation. Once you have the random number it takes constant time to execute the algorithm I presented. The OP asked for something "that doesn't resort to something like an O(N) search of previous values to do it" which indicates that the O(1) part should be the selection. Since this is the accepted answer, I think it met the OP's requirements.
Robert Gamble
@Robert: The method requires Theta(n log n) random bits: actually, precisely lg n! bits. Are you claiming that you can generate Theta(n log n) bits in time O(n)?
Charles
@Charles, the claim is what I put in my answer. Read the question and my answer, if you feel you have a better answer you are more than free to post it.
Robert Gamble
@robert: I just wanted to point out that it doesn't produce, as in the name of the question, "unique random numbers in O(1)".
Charles
@Charles: Fair enough, although nobody ever said it did and the the question elaborates pretty well on what was desired which should prevent confusion although I guess the title could be updated to better reflect the question that was actually asked and answered.
Robert Gamble
@Robert: Yep. (In fairness, I didn't claim that anyone claimed this was O(1) -- I just pointed out the complexity, since it hadn't been mentioned up to that point.)
Charles
@Charles/Robert - assuming you are dealing with fixed size 32-bit machine integers (which is a reasonable assumption given the question and scales of integers discussed) then you only need at most 32 random bits per sample. Generating that is an O(1) operation, hence this should be regarded as an O(1) algorithm for practical (as opposed to theoretical) purposes.
mikera
@mikera: Agreed, although technically if you're using fixed-size integers the whole list can be generated in O(1) (with a large constant, viz. 2^32). Also, for practical purposes, the definition of "random" is important -- if you really want to use your system's entropy pool, the limit is the computation of the random bits rather than calculations themselves, and in that case n log n is relevant again. But in the likely case that you'll use (the equivalent of) /dev/urandom rather than /dev/random, you're back to 'practically' O(n).
Charles
+5  A: 

You could use A Linear Congruential Generator. Where m (the modulus) would be the nearest prime bigger than 1000. When you get a number out of the range, just get the next one. The sequence will only repeat once all elements have occurred, and you don't have to use a table. Be aware of the disadvantages of this generator though (including lack of randomness).

Paul de Vrieze
+17  A: 

Use a Maximal Linear Feedback Shift Register.

It's implementable in a few lines of C and at runtime does little more than a couple test/branches, a little addition and bit shifting. It's not random, but it fools most people.

plinth
I can't believe this isn't getting more upvotes.
Max
"It's not random, but it fools most people". That applies to all pseudo-random number generators and all feasible answers to this question. But most people won't think about it. So omitting this note would maybe result in more upvotes...
f3lix
+1 LFSRs are a very simple and effective solution for many applications.
Steve Melnikoff
+1 for using only O(1) memory.
starblue
+3  A: 
Max
That would fool fewer people than an LFSR.
starblue
"bitmask" within 512...1023 is OK, too. For a little more fake randomness see my answer. :-)
sellibitze
A: 

You could use a good pseudo-random number generator with 10 bits and throw away 1001 to 1023 leaving 0 to 1000.

From here we get the design for a 10 bit PRNG..

  • 10 bits, feedback polynomial x^10 + x^7 + 1 (period 1023)

  • use a Galois LFSR to get fast code

pro
A: 

For low numbers like 0...1000, creating a list that contains all the numbers and shuffling it is straight forward. But if the set of numbers to draw from is very large there's another elegant way: You can build a pseudorandom permutation using a key and a cryptographic hash function. See the following C++-ish example pseudo code:

unsigned randperm(string key, unsigned bits, unsigned index) {
  unsigned half1 =  bits    / 2;
  unsigned half2 = (bits+1) / 2;
  unsigned mask1 = (1 << half1) - 1;
  unsigned mask2 = (1 << half2) - 1;
  for (int round=0; round<5; ++round) {
    unsigned temp = (index >> half1);
    temp = (temp << 4) + round;
    index ^= hash( key + "/" + int2str(temp) ) & mask1;
    index = ((index & mask2) << half1) | ((index >> half2) & mask1);
  }
  return index;
}

Here, hash is just some arbitrary pseudo random function that maps a character string to a possibly huge unsigned integer. The function randperm is a permutation of all numbers within 0...pow(2,bits)-1 assuming a fixed key. This follows from the construction because every step that changes the variable index is reversible. This is inspired by a Feistel cipher.

sellibitze
+1  A: 

Here's some code I typed up that uses the logic of the first solution. I know this is "language agnostic" but just wanted to present this as an example in C# in case anyone is looking for a quick practical solution.

    // Initialize variables
    Random RandomClass = new Random();
    int RandArrayNum;
    int MaxNumber = 10;
    int LastNumInArray;
    int PickedNumInArray;
    int[] OrderedArray = new int[MaxNumber];      // Ordered Array - set
    int[] ShuffledArray = new int[MaxNumber];     // Shuffled Array - not set

    // Populate the Ordered Array
    for (int i = 0; i < MaxNumber; i++)                  
    {
        OrderedArray[i] = i;
        listBox1.Items.Add(OrderedArray[i]);
    }

    // Execute the Shuffle                
    for (int i = MaxNumber - 1; i > 0; i--)
    {
        RandArrayNum = RandomClass.Next(i + 1);         // Save random #
        ShuffledArray[i] = OrderedArray[RandArrayNum];  // Populting the array in reverse
        LastNumInArray = OrderedArray[i];               // Save Last Number in Test array
        PickedNumInArray = OrderedArray[RandArrayNum];  // Save Picked Random #
        OrderedArray[i] = PickedNumInArray;             // The number is now moved to the back end
        OrderedArray[RandArrayNum] = LastNumInArray;    // The picked number is moved into position
    }

    for (int i = 0; i < MaxNumber; i++)                  
    {
        listBox2.Items.Add(ShuffledArray[i]);
    }
firedrawndagger