tags:

views:

487

answers:

11

I understand that time(0) is commonly using for seeding random number generators and that it only becomes a problem when the program is being run more than once per second. I'm wondering what are some better seeds to consider when generating random numbers. I read about GetTickCount, timeGetTime, and QueryPerformanceCounter on Windows. Will these suffice for almost all operations or are there even better seeding options?

Here is a quick code example using the boost library:

#include <iostream>
#include <boost/random.hpp>
using namespace std;
using namespace boost;

int main()
{
    mt19937 randGen(42);
    uniform_int<> range(0,100);
    variate_generator<mt19937&, uniform_int<> > GetRand(randGen, range);

    for (int i = 0; i < 30; ++i)
        cout << GetRand() << endl;
}
+11  A: 

Some early hacks of Netscape security centered around knowing when an encrypted packet was sent and narrowing down the possible range of seeds with that knowledge. So, getting a tick count or something else even remotely deterministic is not your best bet.

Even using a seed, the sequence of "random" numbers is deterministic based on that seed. A Nevada Gaming Commission investigator realized this about certain slots he was supposed to inspect and used that knowledge to earn quite a bit of money before being caught.

If you need world-class randomness, you can add hardware to your system that provides for a highly randomized number. That's how the well-known poker sites do it (at least, that's what they say).

Short of that, combine a number of factors from your system that all change independently and rapidly, with as little predictability as possible, to create a very decent seed. An answer to a related post on SO suggested using Guid.NewGuid().GetHashCode(). Since a Guid is based on a number of deterministic factors including the time, that does not form a good basis for a seed:

Cryptanalysis of the WinAPI GUID generator shows that, since the sequence of V4 GUIDs is pseudo-random, given the initial state one can predict up to the next 250 000 GUIDs returned by the function UuidCreate[2]. This is why GUIDs should not be used in cryptography, e.g., as random keys.

Source: Wikipedia Globally Unique Identifier

Eric J.
+4  A: 

On unix systems, you could take a few bytes from /dev/random as a seed for your RNG. /dev/random is supposed to be very good random, using the different entropy sources available on a PC. Of course, this is completely implementation-dependent.

One case in which this could be useful is for cryptographic applications, since time(0) is relatively easy to guess.

static_rtti
+3  A: 

You will need an alternative/secondary source of entropy. Depending on how much entropy you want to use, you can calculate a hash of any of the following inputs and use it as a seed for your final generator:

  • declare an unintialized random size char array on the stack
  • allocate a random bytes of memory
  • ask the user to move the mouse
  • ask the user to put random CD in the CD drive and read random bytes at random location from the first track
  • open the user's microphone or camera, collect random number of seconds of input, calculate a hash and seed
  • Windows: use CryptGenRandom to get a buffer of cryptographically random bytes
  • Unix: as others mentioned, read from /dev/random
Franci Penov
the second two would probably be more trouble than they're worth :D
CrazyJugglerDrummer
Depends on how important is the seed randomness to @trikker :-)
Franci Penov
+4  A: 

On unix try reading from /dev/random. Reading from this device is slow so don't do it too often - eg only to set the initial seed. The random device gets data from hardware generated entropy (environmental noise from devices) and there's no endless amount of it available for a given time period. If you run out of entropy, SSL libraries may fail. Entropy refills after some time (actually it's a pool of entropy). There's also urandom afaik which is more economic but less random and won't block in low-of-entropy conditions.

hurikhan77
+1  A: 

Using tickCout() or anything with a high frequency is a bad idea. This is becuase the couter cycles back to zero very quickly thus gives the posability of having the same seed.

time(NULL):   Repeats every 64 years.  
tickCouter()  Repeats every X days.

You could try and get some random value from nature.
Lightining strikes world wide in the last second (appatently that is online)? (You may need to do research to see if that is variable though).

Martin York
Lightning strikes in the last second is like hyper-encryption without the hyper. If the attacker knows your source of randomness, and the approximate time you generated your seed, then since he has access to the same data you do, you're back where you were using `time(0)`. If the attacker doesn't know your source of randomness then it's some improvement, but is that the kind of implementation detail you're confident you can keep secret? What if someone spots what website you're connecting to?
Steve Jessop
Since the tick counter cycles very quickly, and the problem with time(0) is that it cycles way too slowly, the obvious solution is to seed with both. If your RNG seed is limited to 16 or 32 bits you've got troubles. In that case, seed your RNG with the tick counter, and save a few bits from that. Reseed with time(0) and discard a number of initial values, or XOR all subsequent results using the initially saved bits.
MSalters
Voting this up just because the response was out of the box.
trikker
+5  A: 

Too long for a comment but interesting story about 32bit seeds in the early days of online poker

The shuffling algorithm used in the ASF software always starts with an ordered deck of cards, and then generates a sequence of random numbers used to reorder the deck. In a real deck of cards, there are 52! (~2^226) possible unique shuffles. Recall that the seed for a 32-bit random number generator must be a 32-bit number, meaning that there are just over 4 billion possible seeds. Since the deck is reinitialized and the generator reseeded before each shuffle, only 4 billion possible shuffles can result from this algorithm. 4B possible shuffles is alarmingly less than 52!.

The RST-developed tool to exploit this vulnerability requires five cards from the deck to be known. Based on the five known cards, the program searches through the few hundred thousand possible shuffles and deduces which one is a perfect match. In the case of Texas Hold 'em Poker, this means the program takes as input the two cards that the cheating player is dealt, plus the first three community cards that are dealt face up (the flop). These five cards are known after the first of four rounds of betting, and are enough to determine (in real time, during play) the exact shuffle.

http://www.ibm.com/developerworks/library/s-playing/

Dustin Getz
+2  A: 

There is a web service that offers free and paid "true" random bits generated from atmospheric noise: http://www.random.org/

Wired ran an article on two guys who used basically the noise from a webcam CCD chip to generate random numbers: http://www.wired.com/wired/archive/11.08/random.html

Joseph Kingry
+1  A: 

You can store random seed on program exit and load it on start, so you'll need to initialize your RNG with time(0) only on first program start.

n0rd
Do you mean store the current value of the rng on exit and use that as seed on the next run?
Patrick
A: 

The method with random number generators is to only seed it once so your example of an online game is not a problem as, potentially, the same rng will be used for each value which would have been seeded when the program was first started (perhaps several years ago).

Similarly in your own code try to seed the rng once and then use the same instance where ever required rather than creating a new rng with a new seed all over the place.

Patrick
Patrick, using a PRNG to seed other PRNGs has pretty serious consequences and should be avoided (unless you absolutely know what you're doing—there are ways to make this work, none of that are naïve or easy).
Joey
What consequences? We're not talking about cryptographically secure rngs here...
Patrick
It also ruins them for simulations and other non-crypto tasks. It's one of the reasons why using PRNGs properly in parallel and distributed simulations is such a pain.
Joey
Interesting, I wasn't aware of that. I don't suppose you have a link easily to hand that explains this further?
Patrick
A: 

Using (only) the time as PRNG seed has basically two problems:

  1. It's predictable (which makes it unsuitable for crypto)
  2. Consecutive seeds have pretty much linear dependency

For the first problem, it's usually imperative that you take as many sources of entropy you can get your hands on.

As for the second problem, the paper Common defects in initialization of pseudorandom number generators by Makoto Matsumoto might give some insight.

Joey
A: 

Since you're already using boost, you probably want boost::random_device.

(At least on Linux. I don't recall whether the obvious CryptGenRandom implementation of it is yet available on Windows.)

me22