views:

111

answers:

4

Hello, I am writing a numerical program that should be somewhat fast, and is also multithreaded. I have a class that represents a number, and I want to use in it a random number generator. Now, I don't really require my RNG to be a true RNG, I just need it to generate integers with even distribution between 0 and NMAX.

So I have this in my class:

// use just an int here and forget about multithreading.
static uint32 rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return rand = ( rand + 1 ) % NMAX; }

Now, in a single threaded world, this is totally fine for my purposes.

Since this is multithreaded, I'm assuming that the only possible bad thing that could happen is that occasionally (like <1% of the time) the update gets dropped. Meaning two threads read rand, update it in a register, return the updated value, and then write it, twice, with the same value. This is totally fine.

My question is could anything worse than that happen? I'm totally fine with each thread using its own rand variable but that's just a huge pain to make happen. What I definitely can't do is make it so that each instance of the class uses its own rand variable, as that would use way too much memory.

UPDATE:

So, why do I want to do this? Full story is its a floating point class that uses 1 or 2 bytes. So it has to be fast and such, and this just seems the best way. In fact I think I'll update it from ( rand + 1 ) % NMAX to something like ( rand + [some prime] ) % NMAX since it seems to work better. This is an example of one of those cases where a more robust solution would require more code, make things less generic and more dependency ridden, make code less clear and easier to break, and all for the idea that "proper synchronization should be used".

Mostly I'm worried of some weird optimization that the compiler might do, so that an update to rand is not just dropped, but rand becomes total garbage. Now that I think about it though, even that would be okay (the way this number is used), since the next use of GetRand would %NMAX it anyway, the error would only cause at most one use of GetRand to be outside the given range of [0,NMAX). Thanks for any answers.

+1  A: 
David
Windows OS also provides thread-local storage. http://msdn.microsoft.com/en-us/library/ms686801(VS.85).aspx
rwong
Ah, yes I could use TLS, but this also needs to be fast (see my pending edit above). I remember reading somewhere that TLS is slow, though I have no idea why it would be so. The problem you mention of non-shared cache would be fine, it achieves each thread having its own rand variable! :) and when rand gets finally updated, that's just increased randomness!
Scott
A: 

If two threads call GetRand at the same time, the classic unsynchronized error may happen. For example, rand = 10. After two threads call GetRand, rand is expected to be 12, but actually it may be 11. If this OK for you, you can leave this code without changes. But I think it is better to use synchronization, because without it both the code and its result look a bit strange. Another programming can think that is is bug.

Edit.

rand = ( rand + 1 ) % NMAX;

Worst case: two or more threads read the same rand variable rand from the memory. Each thread locally makes calculation ( rand + 1 ) % NMAX. Then all threads put the same result back to the memory. This doesn't corrupt the rand variable value, this doesn't cause this variable to be out of scope and number generator will continue to compute valid numbers.

Alex Farber
+1  A: 

For the purpose of discussion let's assume the following implementation:

  • The Mersenne Twister (mt19937), which generates batches of 624 random numbers per call, is used.
  • Each instance of your class (which is used solely inside a single thread) reads off a number from the batch and increments the global index counter so the next call (from any instance) will fetch the next number in the batch. When the global index reaches the end of array, the RNG will be locked and a new batch of 624 random numbers will be generated, after which the global index is reset.

My suggestion for improvement is for each instance to fetch, say, 16 numbers at a time. The 16 numbers do not have to be stored (copied) inside the instance: you simply increment the global index by 16 (making them unavailable to other instances), so that the instance can consume them one by one.

rwong
+1  A: 
  1. You may use TLS so that each thread will have "its own" variable.

:

__declspec(thread) static uint32 rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return rand = ( rand + 1 ) % NMAX; }
  1. In your particular case it's very easy to fix the code to make it thread-safe.

:

static long rand = NMAX/4; 
// this will be called multithreadedly
static uint32 GetRand() { return InterlockedIncrement(&rand) % NMAX; }
valdo