views:

489

answers:

10

I am looking for an efficient way to generate numbers that a human would perceive as being random. Basically, I think of this as avoiding long sequences of 0 or 1 bits. I expect humans to be viewing the bit pattern, and a very low-powered CPU should be able to calculate nearly a thousand of these per second.

There are two different concepts that I can think of to do this, but I am lost finding an efficient way of accomplishing them.

  1. Generate a random number with a fixed number of one bits. For a 32-bit random number, this requires up to 31 random numbers, using the Knuth selection algorithm. Is there a more efficient way to generate a random number with some number of bits set? Unfortunately, 0000FFFF doesn't look very random.

  2. Some form of "part-wise" density seems like it'd look better, but I can't come up with a clear way of doing it. I'd imagine going through each chunk, calculating how far it is from the ideal density, and trying to increase the bit density of the next chunk. This sounds complex.

Hopefully there's another algorithm that I haven't thought about for this. Thanks in advance for your help.

[EDIT] I should be clearer with what I ask -
(a) Is there an efficient way to generate random numbers without "long" runs of a single bit, where "long" is a tunable parameter?
(b) Other suggestions on what would make a number appear to be less-random?

+5  A: 

A linear feedback shift register probably does what you want.

Edit in light of an updated question: You should look at a shuffle bag, although I'm not sure how fast this could run. See also this question.
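A shuffle bag for bits could be as simple as this sketch (illustrative, not from the answer): fill a bag with a fixed number of 1s and 0s, shuffle it, and drain it. Every draw then yields a word with exactly the chosen bit density, though the bag alone doesn't bound run lengths.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>

// Shuffle-bag sketch: a bag holding exactly 16 one-bits and 16 zero-bits
// is shuffled, then drained to build one 32-bit value.  Every word drawn
// this way has exactly half its bits set.
uint32_t shuffle_bag_word(std::mt19937 &rng) {
    int bag[32];
    for (int i = 0; i < 32; ++i) bag[i] = (i < 16) ? 1 : 0;
    std::shuffle(bag, bag + 32, rng);

    uint32_t x = 0;
    for (int i = 0; i < 32; ++i) x = (x << 1) | static_cast<uint32_t>(bag[i]);
    return x;
}
```

The shuffle is O(32) swaps per word, which should be within reach of a low-powered CPU at a thousand words per second.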

David Johnstone
LFSRs seem to generate low-bit-density results (e.g., 00..0001 is a valid output from an LFSR?)
Exactly what I was going to suggest. You can probably implement it really simply in assembly for some extra fast prn generation.
Joel Potter
@James: When the proper feedback bits are chosen, an LFSR will cycle through all possible values in the given bit size (except 0). Which numbers come first is entirely dependent on your seed.
Joel Potter
I suspect I need something in addition to the random source to skew the distribution towards short sequences of bits.
+2  A: 
1800 INFORMATION
Bit patterns that "look random" are ones which (at least) minimize long sequences of 0's or 1's. 4 fails for the long sequences of 0s on each side.
A: 

You need to decide by exactly what rules you decide if something "looks random". Then you take a random number generator that produces enough "real randomness" for your purpose, and every time it generates a number that doesn't look random enough, you throw that number away and generate a new one.

Or you directly produce a sequence of "random" bits and every time the random generator outputs the "wrong" next bit (that would make it look not-random), you just flip that bit.
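That bit-flipping scheme could be sketched like this (a sketch, assuming a tunable max_run parameter and std::mt19937 standing in for the random source):

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Draw bits from a PRNG; if a bit would extend the current run of
// identical bits past max_run, emit the opposite bit instead.
// max_run is the tunable "long run" parameter from the question.
uint32_t capped_run_word(std::mt19937 &rng, int max_run) {
    std::bernoulli_distribution coin(0.5);
    uint32_t x = 0;
    int last = -1, run = 0;
    for (int i = 0; i < 32; ++i) {
        int b = coin(rng) ? 1 : 0;
        if (b == last && run >= max_run) b = !b;  // flip the offending bit
        run = (b == last) ? run + 1 : 1;
        last = b;
        x = (x << 1) | static_cast<uint32_t>(b);
    }
    return x;
}
```

No value is ever rejected, so the cost per word is fixed: 32 bit draws plus a comparison each.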

sth
Half of what I was looking for is insight on what other metrics to use for "perceived randomness". At the least, the avoidance of long sequences of 0's or 1's is a must. It seems that examining the generated number is likely to be even less effective than a random number per bit position.
A: 

This is how I would examine the number:

const int max_repeated_bits = 4;  /* or any other number that you prefer */

/* Returns 1 if x contains no run of more than max_repeated_bits 1-bits. */
int examine_1(unsigned int x) {
  /* After k AND-with-shift steps, x is nonzero iff a run of k+1 ones remains. */
  for (int i = 0; i < max_repeated_bits; ++i) x &= (x << 1);
  return x == 0;
}

/* Checks runs of 1s directly and runs of 0s via the complement. */
int examine(unsigned int x) {
  return examine_1(x) && examine_1(~x);
}

Then, just generate a number x; if examine(x) returns 0, reject it and try again. The probability of getting a 32-bit number with more than 4 identical bits in a row is about 2/3, so you would need about 3 random generator calls per number. However, if you allow more than 4 bits, it gets better. Say, the probability of getting more than 6 bits in a row is only about 20%, so you would need only 1.25 calls per number.
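For completeness, the rejection loop around examine() might look like the sketch below (std::mt19937 is an assumed stand-in for the random source; the examine functions are repeated so the snippet is self-contained):

```cpp
#include <cassert>
#include <cstdint>
#include <random>

const int max_repeated_bits = 4;  // tunable run limit

// Nonzero iff x has no run of more than max_repeated_bits 1-bits.
int examine_1(unsigned int x) {
    for (int i = 0; i < max_repeated_bits; ++i) x &= (x << 1);
    return x == 0;
}

// Checks runs of 1s directly and runs of 0s via the complement.
int examine(unsigned int x) { return examine_1(x) && examine_1(~x); }

// Rejection sampling: draw until a value passes the run-length test.
unsigned int next_humanly_random(std::mt19937 &rng) {
    unsigned int x;
    do {
        x = static_cast<unsigned int>(rng());
    } while (!examine(x));
    return x;
}
```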

Igor Krivokon
A: 

Here's what I'd do. I'd use a number like 00101011100101100110100101100101 and rotate it by some random amount each time.

But are you sure that a typical pseudo-random generator wouldn't do? Have you tried it? You won't get very many long strings of 0s and 1s anyhow.

If you're going to use a library random number and you're worried about too many or too few bits being set, there are cheap ways of counting bits.
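A sketch of the rotation idea plus one of those cheap bit counts (Kernighan's method). The pattern is the one from this answer, 0x2B966965 in hex; rotation never changes its popcount, and its longest cyclic run is 3 bits:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Rotate a 32-bit value left by r positions.
uint32_t rotate_left(uint32_t x, unsigned r) {
    r &= 31u;
    return r ? (x << r) | (x >> (32 - r)) : x;
}

// The fixed pattern from the answer, rotated by a random amount.
uint32_t rotated_pattern(std::mt19937 &rng) {
    const uint32_t pattern = 0x2B966965u;  // 00101011100101100110100101100101
    return rotate_left(pattern, rng() & 31u);
}

// Kernighan's bit count: each iteration clears the lowest set bit,
// so the loop runs once per set bit.
int popcount(uint32_t x) {
    int c = 0;
    while (x) { x &= x - 1; ++c; }
    return c;
}
```

Only 32 distinct outputs exist per pattern, so this "looks random" casually but is trivially predictable.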

Nosredna
+1  A: 

Random numbers often have long sequences of 1s and 0s, so I'm not sure I fully understand why you can't use a simple linear congruential generator and shift in or out however many bits you need. They're blazing fast, look extremely random to the naked eye, and you can choose coefficients that will yield random integers in whatever positive range you need. If you need 32 "random looking" bits, just generate four random numbers and take the low 8 bits from each.
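For illustration, a minimal LCG using the Numerical Recipes constants, combining 8 bits from each of four draws. One caveat on the answer's suggestion: with a power-of-two modulus the low-order bits of an LCG have short periods, so this sketch takes the top byte of each draw instead.

```cpp
#include <cassert>
#include <cstdint>

// Minimal 32-bit LCG (constants from Numerical Recipes; the modulus is
// the implicit 2^32 of unsigned wraparound).
struct Lcg {
    uint32_t state;
    explicit Lcg(uint32_t seed) : state(seed) {}
    uint32_t next() { return state = state * 1664525u + 1013904223u; }
};

// Build one 32-bit word from four draws, taking 8 bits from each.
// The top byte is used because the low bits of a power-of-two LCG are weak.
uint32_t four_byte_word(Lcg &g) {
    uint32_t x = 0;
    for (int i = 0; i < 4; ++i)
        x = (x << 8) | (g.next() >> 24);
    return x;
}
```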

You don't really need to implement your own at all though, since in most languages the random library already implements one.

If you're determined that you want a particular density of 1s, though, you could always start with a number that has the required number of 1s set

int a = 0x00FF;

then use a bit twiddling hack to implement a bit-level shuffle of the bits in that number.
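One way to do that bit-level shuffle is a Fisher-Yates shuffle over bit positions (illustrative, not the specific bit-twiddling hack the answer alludes to). It permutes the bits uniformly while preserving the popcount, so starting from 0x00FF every output has exactly 8 bits set:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Fisher-Yates shuffle applied to the 32 bit positions of a word.
// Swapping two bits never changes the number of bits set.
uint32_t shuffle_bits(uint32_t x, std::mt19937 &rng) {
    for (int i = 31; i > 0; --i) {
        std::uniform_int_distribution<int> pick(0, i);
        int j = pick(rng);
        uint32_t bi = (x >> i) & 1u, bj = (x >> j) & 1u;
        if (bi != bj) x ^= (1u << i) | (1u << j);  // swap differing bits
    }
    return x;
}
```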

Bill the Lizard
A: 

If you are looking to avoid long runs, how about something simple like:

#include <cstdlib>

class generator {
public:
   generator() : last_num(0), run_count(1) { }

   bool next_bit() {
      // The shift form is cheaper than pow() and equivalent in intent:
      // P(flip) is roughly 1 - 2^(-run_count), so runs are damped exponentially.
      const bool flip = rand() > (RAND_MAX >> run_count);
      if(flip) {
         run_count = 1;
         last_num = !last_num;
      } else
         ++run_count;

      return last_num;
   }
private:
   bool last_num;
   int run_count;
};

Runs become less likely the longer they go on. You could also use RAND_MAX / (1 + run_count) if you wanted longer runs.
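Packing that bit stream into words might look like this (a self-contained restatement of the generator above, using the shift form of the flip threshold):

```cpp
#include <cassert>
#include <cstdlib>

// Run-damping bit generator: the longer the current run, the more
// likely the next bit flips (P(flip) ~ 1 - 2^(-run_count)).
class generator {
public:
    generator() : last_num(false), run_count(1) {}

    bool next_bit() {
        const bool flip = rand() > (RAND_MAX >> run_count);
        if (flip) {
            run_count = 1;
            last_num = !last_num;
        } else {
            ++run_count;
        }
        return last_num;
    }

private:
    bool last_num;
    int run_count;
};

// Pack 32 successive bits into one word.
unsigned int next_word(generator &g) {
    unsigned int x = 0;
    for (int i = 0; i < 32; ++i)
        x = (x << 1) | (g.next_bit() ? 1u : 0u);
    return x;
}
```

A run of length L requires surviving flips with probability about 2^(-L(L-1)/2), so runs beyond 6 or 7 bits are vanishingly rare even though they aren't strictly forbidden.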

Todd Gardner
+2  A: 

Since you care most about run length, you could generate random run lengths instead of random bits, so as to give them the exact distribution you want.

The mean run length in random binary data is 2 (run lengths are geometrically distributed: sum of n/(2^n)), and the modal length is 1. Here are some random bits (I swear this is a single run, I didn't pick a value to make my point):

0111111011111110110001000101111001100000000111001010101101001000

See there's a run length of 8 in there. This is not especially surprising, since run length 8 should occur roughly every 256 bits and I've generated 64 bits.

If this doesn't "look random" to you because of excessive run lengths, then generate run lengths with whatever distribution you want. In pseudocode:

loop
    get a random number
    output that many 1 bits
    get a random number
    output that many 0 bits
endloop

You'd probably want to discard some initial data from the stream, or randomise the first bit, to avoid the problem that as it stands, the first bit is always 1. The probability of the Nth bit being 1 depends on how you "get a random number", but for anything that achieves "shortish but not too short" run lengths it will soon be as close to 50% as makes no difference.

For instance "get a random number" might do this:

get a uniformly-distributed random number n from 1 to 81
if n is between 1 and 54, return 1
if n is between 55 and 72, return 2
if n is between 73 and 78, return 3
if n is between 79 and 80, return 4
return 5

The idea is that the probability of a run of length N is one third the probability of a run of length N-1, instead of one half. This will give much shorter average run lengths, and a longest run of 5, and would therefore "look more random" to you. Of course it would not "look random" to anyone used to dealing with sequences of coin tosses, because they'd think the runs were too short. You'd also be able to tell very easily with statistical tests that the value of digit N is correlated with the value of digit N-1.

This code uses at least log2(81) ≈ 6.34 "random bits" to generate on average 121/81 ≈ 1.49 bits of output, so it is slower than just generating uniformly-distributed bits. But it shouldn't be much more than about 6.34/1.49 ≈ 4-5 times slower, and an LFSR is pretty fast to start with.
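A concrete version of the table and the run-emitting loop (a sketch; std::mt19937 is an assumed stand-in for whatever random source is available):

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// The run-length table from the answer: P(len) is proportional to
// (1/3)^len, with everything of length 5+ lumped into "return 5".
int random_run_length(std::mt19937 &rng) {
    std::uniform_int_distribution<int> d(1, 81);
    int n = d(rng);
    if (n <= 54) return 1;   // 54/81 = 2/3
    if (n <= 72) return 2;   // 18/81
    if (n <= 78) return 3;   //  6/81
    if (n <= 80) return 4;   //  2/81
    return 5;                //  1/81
}

// Emit alternating runs of 1s and 0s with those lengths, packed into
// a 32-bit word.  The first run's bit value is randomised, per the
// answer's note about the first bit otherwise always being 1.
uint32_t run_length_word(std::mt19937 &rng) {
    uint32_t x = 0;
    int filled = 0;
    int bit = static_cast<int>(rng() & 1u);
    while (filled < 32) {
        int len = random_run_length(rng);
        for (int i = 0; i < len && filled < 32; ++i, ++filled)
            x = (x << 1) | static_cast<uint32_t>(bit);
        bit = !bit;
    }
    return x;
}
```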

Steve Jessop
A: 

There are various variants of linear feedback shift registers, such as shrinking and self-shrinking which modify the output of one LFSR based on the output of another.

The design of these attempts to create random numbers, where the probability of getting two bits the same in a row is 0.5, of getting three in a row is 0.25, and so on.

It should be possible to chain two LFSRs to inhibit or invert the output when a sequence of similar bits occurs - the first LFSR uses a conventional primitive polynomial, and feeds its output into the second. The second shift register is shorter and doesn't use a primitive polynomial. Instead it is tapped to invert the output if all its bits are the same, so no run can exceed the size of the second shift register.
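One way to realise that chained design (a sketch: the taps 0x80200003 are the commonly used maximal-length polynomial x^32 + x^22 + x^2 + x + 1 for a right-shifting Galois LFSR; the 4-bit window standing in for the second register is an assumed size):

```cpp
#include <cassert>
#include <cstdint>

// A 32-bit Galois LFSR feeds a 4-bit window of recently emitted bits;
// when the window is all-0 or all-1 and the next bit would extend that
// run, the bit is inverted, so no run can exceed 4.
struct RunLimitedLfsr {
    uint32_t state;
    uint32_t window;  // last 4 emitted bits

    explicit RunLimitedLfsr(uint32_t seed)
        : state(seed ? seed : 1u),  // an LFSR must never be seeded with 0
          window(0x5u) {}           // start the window at 0101

    int next_bit() {
        // One right-shift Galois LFSR step; taps 32, 22, 2, 1.
        uint32_t lsb = state & 1u;
        state >>= 1;
        if (lsb) state ^= 0x80200003u;
        int b = static_cast<int>(lsb);

        // Invert if the last 4 bits were identical and b would extend the run.
        uint32_t w = window & 0xFu;
        if ((w == 0xFu && b == 1) || (w == 0u && b == 0)) b = !b;

        window = (window << 1) | static_cast<uint32_t>(b);
        return b;
    }
};
```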

Obviously this destroys the randomness of the output - if you have N bits in a row, the next bit is completely predictable. Messing around with using the output of another random source to determine whether or not to invert the output would defeat the second shift register - you wouldn't be able to detect the difference between that and just one random source.

Pete Kirkham
A: 

Check out the GSL. I believe it has some functions that do just what you want. At least they're guaranteed to be random bit strings. I'm not sure if they would LOOK random, since that's more of a psychological question.

Flamewires