views:

1384

answers:

10

Is there a difference between generating multiple numbers using a single random number generator (RNG) versus generating one number per generator and discarding it? Do both implementations generate numbers which are equally random? Is there a difference between the normal RNGs and the secure RNGs for this?

I have a web application that is supposed to generate a list of random numbers on behalf of clients. That is, the numbers should appear to be random from each client's point of view. Does this mean I need retain a separate random RNG per client session? Or can I share a single RNG across all sessions? Or can I create and discard a RNG on a per-request basis?

UPDATE: This question is related to http://stackoverflow.com/questions/471157/is-a-subset-of-a-random-sequence-also-random

A: 

Well, as long as they are seeded differently each time they're created, then no, I don't think there'd be any difference; however, if it depended on something like the time, then they'd probably be non-uniform, due to the biased seed.

TraumaPony
how would you ensure a random seed then? have another random number generator generate the seed?
Jimmy
No; use something like the system time mixed with the ip of something.
TraumaPony
+10  A: 

I would recommend using a single generator multiple times. As far as I know, all the generators have a state. When you seed a generator, you set its state to something based on the seed. If you keep spawning new ones, it's likely that the seeds you pick will not be as random as the numbers generated by using just one generator.

This is especially true with most generators I've used, which use the current time in milliseconds as a seed.

Claudiu
If you share a single RNG, is each client still guaranteed to get random numbers? I believe that a RNG is guaranteed to return random numbers with respect to a single user, but who is to say this remains true when it's split in this manner?
Gili
I think if it's thread-safe, or you put a lock around it, then it should remain true. This will guarantee only sequential access to the generator, so it should work the same as if one client is using it.
Claudiu
Is a subset of a random sequence still random if you have no idea how the selection/division takes place?
Gili
+3  A: 

If you create a RNG and generate a single random number from it then discard the RNG, the number generated is only as random as the seed used to start the RNG.

It would be much better to create a single RNG and draw many numbers from it.

BoltBait
A: 

It's generally better to create a single PRNG and pull multiple values from it. Creating multiple instances means you need to ensure that the seeds for the instances are guaranteed unique, which will require incorporating instance-specific information.

As an aside, there are better "true" Random Number Generators, but they usually require specialized hardware which does things like derive random data from electrical signal variance inside the computer. Unless you're really worried about it, I'd say the Pseudo Random Number Generators built into the language libraries and/or OS are probably sufficient, as long as your seed value is not easily predictable.

Nick
+1  A: 

As people have already said, it's much better to seed the PRNG once, and reuse it. A secure PRNG is simply one which is suitable for cryptographic applications. The only way re-seeding each time will give reasonably random results is where it comes from a genuinely random "real world" source - ie specialised hardware. Even then, it's possible that the source is biased and it will still be theoretically better to use the same PRNG over.

Draemon
+1  A: 

Normally seeding a new state takes quite while for a serious PRNG, and making new ones each time won't really help much. The only case I can think of where you might want more than one PRNG is for different systems, say in a casino game you have one generator for shuffling cards and a separate one to generate comments done by the computer control characters, this way REALLY dedicated users can't guess outcomes based on character behaviors.

A nice solution for seeding is to use this (Random.org) , they supply random numbers generated from the atmospheric noise for free. It could be a better source for seeding than using time.

Edit: In your case, I would definitely use one PRNG per client, if for no other reason than for good programming standards. Anyways if you share one PRNG among clients, you will still be providing pseudo-random values to each, of a quality equal to your PRNG's quality. So that's a viable option but seems like a bad policy for programming

Robert Gould
Just be careful about security when you are using something like random.org. Do you trust them? If your app is a game, do you care if users "cheat" by adding a fake random.org to their hosts file? Etc. Of course, there are lots of other issues when you need secure random numbers, too.
Doug McClean
+12  A: 

A random number generator has a state -- that's actually a necessary feature. The next "random" number is a function of the previous number and the seed/state. The purists call them pseudo-random number generators. The numbers will pass statistical tests for randomness, but aren't -- actually -- random.

The sequence of random values is finite and does repeat.

Think of a random number generator as shuffling a collection of numbers and then dealing them out in a random order. The seed is used to "shuffle" the numbers. Once the seed is set, the sequence of numbers is fixed and very hard to predict. Some seeds will repeat sooner than others.

Most generators have period that is long enough that no one will notice it repeating. A 48-bit random number generator will produce several hundred billion random numbers before it repeats -- with (AFAIK) any 32-bit seed value.

A generator will only generate random-like values when you give it a single seed and let it spew values. If you change seeds, then numbers generated with the new seed value may not appear random when compared with values generated by the previous seed -- all bets are off when you change seeds. So don't.

A sound approach is to have one generator and "deal" the numbers around to your various clients. Don't mess with creating and discarding generators. Don't mess with changing seeds.

Above all, never try to write your own random number generator. The built-in generators in most language libraries are really good. Especially modern ones that use more than 32 bits.

Some Linux distros have a /dev/random and /dev/urandom device. You can read these once to seed your application's random number generator. These have more-or-less random values, but they work by "gathering noise" from random system events. Use them sparingly so there are lots of random events between uses.

S.Lott
Any linux distro ought to have those, and if they really don't you can probably mknod them. It would be possible to build a kernel without them, but why would you?
Mark Baker
I'd disagree that most language libraries have really good generators. Most implementations of C's rand() are just linear congruential generators with only an int's worth of state. There ARE good library implementations out there, but I wouldn't assume the one bundled with your compiler is quality. Other than that, good answer.
Adrian McCarthy
@Adrian McCarthy: I'm not sure I'd say "most" are poor. Almost all Linuxen have the rand48 functions which are 48 bits. Check your man pages to confirm that you have these and use them. They're quite random and also seem to be standard.
S.Lott
I does not have to have state. You can write e.g. a recursive solution, which of course is not that performant.Also, your random-type itself could be state free, leveraging the state to the client, as in: MyRand rng; for (...) { print (rng.current()); rng = rng.proceed(); }With proceed() returning a new instance of MyRand. So it is not, as in your first phrase, a necessary feature of the RNG.
phresnel
@phresnel: Your implementations do not remove the state of the random number generator: they just implement it differently. A recursion maintains state in the stack. Assigning the seed to the client doesn't remove state, it just implements it elsewhere. Any RNG with a seed value is stateful; all algorithms use a seed value.
S.Lott
@S.Lott: exactly: in the example, the state was leveraged to the client, making the *RNG* stateless (not the client), as in, having no mutable members *within* the RNG, a "pure" functor. The other solution, the recursive one: Yes, there is state, but immutable (can be done at compile time in C++, e.g.). Care to give an example of an algorithm that has no state at all? I.e., one that does not depend on global entities and which does not have parameters? The only functions of that class that come into mind are those of the form "T fun() { return X; }", with X being a compile time constant.
phresnel
@phresnel: To do anything useful, an RNG cannot be idempotent. When a function is not idempotent, that's because it has state; either internally or pushed into the client as an opaque parameter. When a function is idempotent (square root for example) then there is no state saved anywhere. A random number generator cannot be idempotent and must have state. Otherwise it's largely useless as a "generator". If you simply want to implement the LCM part statelessly, then it isn't much of a "generator" is it?
S.Lott
I think the point of confusion was that we both started by assuming a different kind of state-freenes; in C++, we often use "statefree" synonymous with "no mutable state" or "free of state changes once initialized", often seen in writing const correct code, which is said to reduce bug-proneness. You could also implement a tail recursive solution that does not need the stack, but yes, it still needs knowledge of it predecessor, and if the value shall be returned to the user (as in "useful"), the user must somehow preserve, even if immutable, state, eradicating the whole point :)
phresnel
A: 

It's worth mentioning that Haskell is a language which attempts to entirely eliminate mutable state. In order to reconcile this goal with hard-requirements like IO (which requires some form of mutability), monads must be used to thread state from one calculation to the next. In this way, Haskell implements its pseudo-random number generator. Strictly speaking, generating random numbers is an inherently stateful operation, but Haskell is able to hide this fact by moving the state "mutation" into the bind (>>=) operation.

This probably sounds a little abstract, and it doesn't really answer your question completely, but I think it is still applicable. From a theoretical standpoint, it is impossible to work with a RNG without involving state. Regardless, there are techniques which can be used to mitigate this interaction and make it appear as if the entire operation is of a stateless nature.

Daniel Spiewak
"make it *appear* as if the entire operation is of a stateless nature."That is only appearance, though. What "stateless" typically means is that you put your state on the stack, and only let the currently active function look at the top-most stack frame. The bind operation really just means "replace this stack frame with that other stack frame where this one field has this new one value".It's still state. Not to say that Haskell advocates (like me :D) are wrong: it does make your programs easier to reason about, because you explicitly state the scope of your statefulness.PRNG => state
Jonas Kölker
+5  A: 

Hardware-based, true [1], random number generators are possible, but non-trivial and often have low mean rates. Availablity can also be an issue [2]. Googling for "shot noise" or "radioactive decay" in combination with "random number generator" should return some hits.

These systems do not need to maintain state. Probably not what you were looking for.

As noted by others, software systems are only pseudo-random, and must maintain state.

A compromise is to use a hardware based RNG to provide an entropy pool (stored state) which is made available to seed a PRNG. This is done quite explicitly in the linux implementation of /dev/random [3] and /dev/urandom [4].

These is some argument about just how random the default inputs to the /dev/random entropy pool really are.


Footnotes:

  1. modulo any problems with our understanding of physics
  2. because you're waiting for a random process
  3. /dev/random features direct access to the entropy pool seeded from various sources believed to be really or nearly random, and blocks when the entropy is exhausted
  4. /dev/urandom is like /dev/random, but when the entopy is exhausted a cryptographic hash is employed which makes the entropy pool effectively a stateful PRNG
dmckee
A: 

The use of a secure PRNG depends on your application. What are the random numbers used for? If they're something of real value (e.g. anything cryptographically related), you wouldn't want to use anything less.

Secure PRNGs are much slower, and may require libraries to do operation of arbitrary precision, and primality testing, etc etc...

Calyth