I have a set of ~100 numbers on which I wish to perform an MC simulation. The basic idea is that I fully randomize the set, do some comparisons/checks on the first ~20 values, store the result, and repeat.

Now the actual comparison/check algorithm is extremely fast; it completes in about 50 CPU cycles. With this in mind, and in order to optimize these simulations, I need to generate the random sets as fast as possible.

Currently I'm using a Multiply With Carry algorithm by George Marsaglia, which provides a random integer in 17 CPU cycles, quite fast. However, with the Fisher-Yates shuffling algorithm I have to generate 100 random integers, ~1700 CPU cycles. This overshadows my comparison time by a wide margin.
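For reference, the generator is along the lines of Marsaglia's often-posted two-seed MWC; a rough sketch (not necessarily my exact code):

static unsigned int z = 362436069, w = 521288629;   // any nonzero seeds

unsigned int mwc()
{
    // each line is a 16-bit multiply-with-carry step; the two halves are concatenated
    z = 36969 * (z & 65535) + (z >> 16);
    w = 18000 * (w & 65535) + (w >> 16);
    return (z << 16) + w;
}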

So my question is: are there other well-known/robust techniques for this type of MC simulation, where I can avoid the long random-set generation time?

I thought about just randomly choosing 20 values from the set, but I would then have to do collision checks to ensure that 20 unique entries were chosen.

Update:

Thanks for the responses. I have another question regarding a method I came up with just after my post: will this give robust, truly random output (assuming the RNG is good)? Basically, my method is to set up an array of integers the same length as my input array and initialise every value to zero. Then I begin randomly choosing 20 values from the input set like so:

int pcfast[100];                      // 0 = not yet chosen, 1 = already chosen
memset(pcfast, 0, sizeof(int) * 100);
int nchosen = 0;
while (nchosen < 20)
{
    int k = rand(100); // random index in [0, 100)
    if (pcfast[k] == 0)
    {
        pcfast[k] = 1;
        r[nchosen++] = s[k]; // r is the length-20 output, s the input set.
    }
}

Basically it's what I mentioned above, choosing 20 values at random, except it seems like a somewhat optimized way of ensuring no collisions. Will this provide good random output? It's quite fast.
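For a rough sense of the cost, assuming rand(100) is uniform over its 100 values: when i values have already been chosen, the chance of drawing a new one is (100 - i)/100, so the expected total number of RNG calls is

100/100 + 100/99 + ... + 100/81 ≈ 22.2

i.e. only about two wasted calls on top of the 20 strictly needed, so collisions should cost very little at this set size.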

+3  A: 

If you only use the first 20 values in the randomised array, then you only need to do 20 steps of the Fisher-Yates algorithm (Knuth's version). Then 20 values have been randomised (actually at the end of the array rather than at the beginning, in the usual formulation), in the sense that the remaining 80 steps of the algorithm are guaranteed not to move them. The other 80 positions aren't fully shuffled, but who cares?

C++ code (iterators should be random-access):

#include <algorithm>   // std::swap (found in <utility> since C++11)
#include <cstddef>     // size_t

using std::swap;

template <typename Iterator, typename Rand> // you didn't specify the type
void partial_shuffle(Iterator first, Iterator middle, Iterator last, Rand rnd) {
    size_t n = last - first;           // number of values not yet fixed
    while (first != middle) {
        size_t k = rnd(n);             // random integer from 0 to n-1
        swap(*(first + k), *first);    // fix one value at the current position
        --n;
        ++first;
    }
}

On return, the values from first through to middle-1 are shuffled. Use it like this:

int arr[100];
for (int i = 0; i < 100; ++i) arr[i] = i;
while (need_more_samples()) {
    partial_shuffle(arr, arr+20, arr+100, my_prng);
    process_sample(arr, arr+20);
}
Steve Jessop
Works very nicely; as expected, it shaved off 80% of the generation time.
DeusAduro
+3  A: 
klochner
There's a reference to it as the "modern algorithm" on the Fisher-Yates wiki page: http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#The_modern_algorithm
klochner
Note that this can also be done in place if preferred, by swapping the values rather than doing the copy-to-output and copy-within-array. That leaves the 100 values ready to be shuffled again, whereas this version destroys the input array. Setting up the input again could be significant compared with the blinding speeds the OP is talking about. The result of making this change is my answer :-)
Steve Jessop
agreed on in-place being faster.
klochner
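For concreteness, a minimal sketch of the copy-to-output / copy-within-array variant being contrasted here, assuming the same rnd(n) convention as the answer above (names are illustrative, not the actual posted code):

// r: output of length m; s: a working copy of the input set (it gets destroyed)
void sample_destructive(int* r, int* s, size_t n, size_t m)
{
    for (size_t i = 0; i < m; ++i)
    {
        size_t k = rnd(n);   // random index into the still-live part, [0, n-1]
        r[i] = s[k];         // copy the chosen value to the output
        s[k] = s[n - 1];     // copy the last live value into the vacated slot
        --n;                 // the live part shrinks by one
    }
}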
Indeed, I'm gonna give it a shot, onebyone. It's funny, I spent so long optimizing the comparison code (LUTs, perfect minimal hashes and the like)... then I got to the randomization, ahhh.
DeusAduro