views:

371

answers:

2

Hi,

I'm a new to C and got stuck with subj. I can split string with strtok but I don't know how to get a random token.

Thanks.

+4  A: 

You can parse it twice, then get a random number and pick one, which you collect on the second pass of the same string.

Or, you can do it in a single pass if you use reservoir sampling.

Mastering reservoir sampling will be a very useful way of learning C as a side to learning some maths! :)

Will
Reservoir sampling looks cool!
j_random_hacker
Agreed. Looks cool and useful. Thanks for the answer!
Dmitry Gladkov
+1  A: 

The following pseudocode shows how to return a candidate uniformly selected among the tokens of the string:

string result = null;
int tokens = 0;
while (true) {
  string candidate = next token;
  if (candidate does not exist) break;
  tokens = tokens + 1;
  if ((a random integer selected between 0 and tokens-1) == 0) result = token;
}
return result;

This is a special case of Algorithm R from section 3.4.2 of Volume II of Knuth's The Art of Computer Programming.

Neal Gafter
The proper name for this algorithm is reservoir sampling, specialised to selecting a single element, whereas reservoir sampling can of course be used for tracking a representative sample in a stream
Will