Let's say that I have a list of data: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} where n = 10 elements
I'd like to randomly choose k elements of this set to form a sublist, say k = 5.
In that case, I could end up with a sublist that looks like {9, 3, 5, 2, 7}
I could accomplish this by:
- Randomly determining an offset within the list, between 0 and the current size of the list minus 1
- Appending that element to my sublist
- Erasing that element from the original list
- Repeat until the desired size is found
The problem with this is that as the original list grows the offset and deletion time grows as well, and for any significantly large list (say over 1,000,000 elements), it takes quite a long time to perform this algorithm.
Is there a faster way to generate a random sequence from a list of given data? The implementation of the random number generator should be set aside for this problem, instead, focusing on how the RNG result is used in a proposed algorithm.
Any thoughts?
Right now I'm using the C++ STL list