views:

554

answers:

4

I'm need a pseudo-random generator which takes a number as input and returns another number witch is reproducible and seems to be random.

  • Each input number should match to exactly one output number and vice versa
  • same input numbers always result in same output numbers
  • sequential input numbers that are close together (eg. 1 and 2) should produce completely different output numbers (eg. 1 => 9783526, 2 => 283)

It must not be perfect, it's just to create random but reproducible test data.

I use C#.


I wrote this funny piece of code some time ago which produced something random.

  public static long Scramble(long number, long max) 
  {
    // some random values 
    long[] scramblers = { 3, 5, 7, 31, 343, 2348, 89897 };
    number += (max / 7) + 6;
    number %= max;
    // shuffle according to divisibility
    foreach (long scrambler in scramblers) 
    {
      if (scrambler >= max / 3) break;
      number = ((number * scrambler) % max) 
        + ((number * scrambler) / max);
    }

    return number % max;
  }

I would like to have something better, more reliable, working with any size of number (no max argument).

Could this probably be solved using a CRC algorithm? Or some bit shuffling stuff.

+2  A: 

You (maybe) can do this easily in C# using the Random class:

public int GetPseudoRandomNumber(int input)
{
    Random random = new Random(input);
    return random.Next();
}

Since you're explicitly seeding Random with the input, you will get the same output every time given the same input value.

MusiGenesis
But his first requirement is "Each input number should match to exactly one output number and vice versa" - the vice versa will not work with your solution.
tanascius
@tanascius: actually, I'm not sure this *wouldn't* match this requirement, but I don't know how Random works under the hood, so you might be right.
MusiGenesis
Well, I tested for inputs 1 to 10mio ... you never get the same output twice ... so it might work indeed. But it is still dependant on the implementation and could change in the future.
tanascius
I have to admit that the "match - and vice versa" part is not a too strong requirement, since I don't need all the numbers of the input value. It just should not produce the same numbers over and over again while never producing others.
Stefan Steinegger
@tanascius: quick test, thanks. I suspect MS is not likely to change the implementation any time soon. After all, "randomness is too important to be left to chance". :)
MusiGenesis
I'm not sure that the two requirements are even compatible. A one-to-one mapping is easy to achieve by techniques such as bit-to-bit mapping, but this will not yield good avalanching. Good avalanching is relatively easy, but requires so much bit mixing that I'm not sure the one-to-one requirement can be guaranteed. If I'm wrong about this, I'd be thrilled to know.
Steven Sudit
@Stefan: If all you want is good distribution, take phoku's advice and hash the seed. This can't guarantee one-to-one, but it will definitely avoid obvious repetitions, assuming you use a decent hash.
Steven Sudit
For a pseudo random number generator, the same seed will always produce the same "random" number. That's why you need to seed them with something random (like the current time).
sbi
@sbi: given the question and my answer, I'm pretty sure you're not suggesting that I wasn't aware of that fact. :)
MusiGenesis
A: 

This looks like a dupe of this SO post - Pseudo Random Generator with same output.

48klocs
+2  A: 

I remove the microsoft code from this answer, the GNU code file is a lot longer but basically it contains this from http://cs.uccs.edu/~cs591/bufferOverflow/glibc-2.2.4/stdlib/random%5Fr.c :

int32_t val = state[0];
val = ((state[0] * 1103515245) + 12345) & 0x7fffffff;
state[0] = val;
*result = val;

for your purpose, the seed is state[0] so it would look more like

int getRand(int val)
{
    return ((val * 1103515245) + 12345) & 0x7fffffff;
}
John Boker
Um, that copyright notice somewhat pokes into my eye. Do you think you are allowed to simply paste that code here?
sbi
well, I'm not a lawyer but it's just a snippet from the file, also it says copyright 1985 - 2001 (it's 2009 now). It's also available online in other places and i did leave the copyright text in the file. If someone has a problem with this i can remove the answer and replace it with the GNU one.
John Boker
This is Steve Ballmer - your ass is mine, boy!
MusiGenesis
@MusiGenesis as steve balmer i can just send you all my issues with windows and you'll get them fixed ?
John Boker
@John: yes, after I crush you like a grape! Bwahahahaha!
MusiGenesis
+1  A: 

The first two rules suggest a fixed or input-seeded permutation of the input, but the third rule requires a further transform.

Is there any further restriction on what the outputs should be, to guide that transform? - e.g. is there an input set of output values to choose from?

If the only guide is "no max", I'd use the following...

  1. Apply a hash algorithm to the whole input to get the first output item. A CRC might work, but for more "random" results, use a crypto hash algorithm such as MD5.

  2. Use a next permutation algorithm (plenty of links on Google) on the input.

  3. Repeat the hash-then-next-permutation until all required outputs are found.

The next permutation may be overkill though, you could probably just increment the first input (and maybe, on overflow, increment the second and so on) before redoing the hash.

For crypto-style hashing, you'll need a key - just derive something from the input before you start.

Steve314