How do I fill an integer array with unique values (no duplicates) in C?
int vektor[10];
for (i = 0; i < 10; i++) {
    vektor[i] = rand() % 100 + 1;
}
//No uniqueness here
One way would be to check whether the array already contains the new random number and, if it does, generate a new one and try again.
This opens up the (random ;) ) possibility that you never get a number that is not already in the array. Therefore you should count how many times you have retried, and if the count exceeds MAX_DUPLICATE_COUNT, throw an exception or so :) (EDIT: I saw you're in C. Forget the exception part :) Return an error code instead :P )
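The retry limit described above might look roughly like this. It is only a sketch: the function name fill_unique, its parameters, and the value chosen for MAX_DUPLICATE_COUNT are assumptions made for illustration, not something given in the question.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_DUPLICATE_COUNT 1000 /* hypothetical retry limit per element */

/* Fills out[0..count-1] with distinct values in [1, max_value].
   Returns 0 on success, -1 if the retry limit was exceeded. */
int fill_unique(int *out, int count, int max_value)
{
    int i, j, tries, candidate, is_duplicate;

    assert(count >= 0 && max_value > 0);
    for (i = 0; i < count; i++) {
        tries = 0;
        do {
            if (tries++ >= MAX_DUPLICATE_COUNT)
                return -1; /* give up instead of looping forever */
            candidate = rand() % max_value + 1;
            is_duplicate = 0;
            for (j = 0; j < i; j++) {
                if (out[j] == candidate) {
                    is_duplicate = 1;
                    break;
                }
            }
        } while (is_duplicate);
        out[i] = candidate;
    }
    return 0;
}
```

A caller would check the return value, for example `if (fill_unique(vektor, 10, 100) != 0) { /* handle the error */ }`.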
I think this will do it (I've not tried to build it, so syntax errors are left to fix as an exercise for the reader). There might be more elegant ways, but this is the brute force solution:
int vektor[10];
int random;
int uniqueflag;
int i, j;

for (i = 0; i < 10; i++) {
    do {
        /* Assume the candidate is unique; the flag is cleared below if not.
           It must be reset on every attempt, or one duplicate loops forever. */
        uniqueflag = 1;
        random = rand() % 100 + 1;
        /* This loop checks for uniqueness */
        for (j = 0; j < i && uniqueflag == 1; j++) {
            if (vektor[j] == random) {
                uniqueflag = 0;
            }
        }
    } while (uniqueflag != 1);
    vektor[i] = random;
}
In your example (choose 10 unique random numbers between 1 and 100), you could create a list with the numbers 1 to 100, use the random number generator to shuffle the list, and then take the first 10 values from the list.
int list[100], vektor[10];
int i;
for (i = 0; i < 100; i++) {
    list[i] = i + 1; /* values 1..100, matching the range in the question */
}
for (i = 0; i < 100; i++) {
    int j = i + rand() % (100 - i);
    int temp = list[i];
    list[i] = list[j];
    list[j] = temp;
}
for (i = 0; i < 10; i++) {
    vektor[i] = list[i];
}
Based on cobbal's comment below, it is even better to just say:
for (i = 0; i < 10; i++) {
    int j = i + rand() % (100 - i);
    int temp = list[i];
    list[i] = list[j];
    list[j] = temp;
    vektor[i] = list[i];
}
Now it is O(N) to set up the list but O(M) to choose the random elements.
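For reference, the partial shuffle can be wrapped into a function that takes M and N as parameters. This is a sketch under assumptions: the name pick_unique and the caller-provided scratch buffer pool are made up for illustration, and rand() % k is kept from the snippets above even though it is slightly biased.

```c
#include <assert.h>
#include <stdlib.h>

/* Picks m distinct values from [1, n] into out[0..m-1].
   pool is scratch space for n ints supplied by the caller. */
void pick_unique(int *out, int m, int *pool, int n)
{
    int i;

    assert(m >= 0 && m <= n);
    /* O(N): set up the candidate list 1..n. */
    for (i = 0; i < n; i++)
        pool[i] = i + 1;
    /* O(M): shuffle only the first m positions and copy them out. */
    for (i = 0; i < m; i++) {
        int j = i + rand() % (n - i);
        int temp = pool[i];
        pool[i] = pool[j];
        pool[j] = temp;
        out[i] = pool[i];
    }
}
```

With `int pool[100], vektor[10];` the call `pick_unique(vektor, 10, pool, 100);` covers the case in the question.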
A quick solution is to create a mask array covering all possible numbers, initialized to zeros, and set an entry when that number is generated:
int rand_array[100] = {0};
int vektor[10];
int i = 0, rnd;
while (i < 10) {
    rnd = rand() % 100 + 1;
    if (rand_array[rnd - 1] == 0) {
        vektor[i++] = rnd;
        rand_array[rnd - 1] = 1;
    }
}
A simple way would be to keep a record of the numbers you've already used. In your case, you appear to be interested in numbers between 1 and 100. So we have an array which specifies whether we have seen these numbers before. Then when we generate a random number we haven't seen before, we keep it and mark it as seen. If we generate one which we have seen before, we simply get another one.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX 100 /* Values will be in the range (1 .. MAX) */

static int seen[MAX]; /* These are automatically initialised to zero
                         by the compiler because they are static. */
static int vektor[10];

int main (void) {
    int i;
    srand(time(NULL)); /* Seed the random number generator. */
    for (i = 0; i < 10; i++) {
        int r;
        do {
            r = rand() / (RAND_MAX / MAX + 1);
        } while (seen[r]);
        seen[r] = 1;
        vektor[i] = r + 1;
    }
    for (i = 0; i < 10; i++)
        printf("%i\n", vektor[i]);
    return 0;
}
Note that I have also changed your use of rand() to give more random results. See the comp.lang.c FAQ list question 13.16 for details, if you're interested.
There are several ways to solve your problem, each with its own advantages and disadvantages.
First I'd like to note that you already got quite a few responses that do the following: they generate a random number, then check somehow whether it was already used in the array, and if it was, they just generate another number until they find an unused one. This is a naive and, truth be told, seriously flawed approach. The problem is with the cyclic nature of the number generation ("if used, try again"). If the numeric range (say, [1..N]) is close to the length of the desired array (say, M), then towards the end the algorithm might spend a huge amount of time trying to find the next number. If the random number generator is even a little bit broken (say, it never generates some number, or does so very rarely), then with N == M the algorithm is guaranteed to loop forever (or for a very long time). Generally this "trial and error" approach is a useless one, or flawed at best.
Another approach already presented here is generating a random permutation in an array of size N. The idea of random permutation is a promising one, but doing it on an array of size N (when M << N) is an atrocity.
The proper solutions to the problem can be found, for example, in Bentley's "Programming Pearls" (and some of them are taken from Knuth).
The Knuth algorithm. It makes a single pass over the numeric range, so its running time is O(N), but it needs no extra memory beyond your vektor array (meaning that it takes O(M) memory, not O(N) as the other permutation-based algorithms suggested here do). The latter makes it a viable algorithm even for M << N cases.
The algorithm works as follows: iterate through all numbers from 1 to N and select the current number with probability rm / rn, where rm is how many numbers we still need to find, and rn is how many numbers we still need to iterate through. Here's a possible implementation for your case:
#define M 10
#define N 100

int in, im;
im = 0;
for (in = 0; in < N && im < M; ++in) {
    int rn = N - in;
    int rm = M - im;
    if (rand() % rn < rm)
        /* Take it */
        vektor[im++] = in + 1; /* +1 since your range begins from 1 */
}
assert(im == M);
After this cycle we get an array vektor filled with randomly chosen numbers in ascending order. The "ascending order" bit is what we don't need here. So, in order to "fix" that, we just make a random permutation of the elements of vektor and we are done. Note that this is an O(M) permutation with no extra memory. (I leave out the implementation of the permutation algorithm. Plenty of links were given here already.)
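For completeness, the permutation step that was left out could be a Fisher-Yates shuffle along these lines. It is a sketch only: the name shuffle_ints is an assumption, and rand() % k is used to stay consistent with the code above even though it is slightly biased.

```c
#include <assert.h>
#include <stdlib.h>

/* Shuffles a[0..len-1] in place: O(len) time, no extra memory. */
void shuffle_ints(int *a, int len)
{
    int i;

    assert(len >= 0);
    for (i = len - 1; i > 0; i--) {
        int j = rand() % (i + 1); /* position in 0..i, may equal i */
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

Calling `shuffle_ints(vektor, M);` after the selection loop removes the ascending order.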
If you look carefully at the permutation-based algorithms proposed here that operate on an array of length N, you'll see that most of them are pretty much this very same Knuth algorithm, but reformulated for M == N. In that case the above selection cycle will choose each and every number in the [1..N] range with probability 1, effectively turning into the initialization of an N-array with the numbers 1 to N. Taking this into account, I think everybody will agree that running this algorithm for M == N and then truncating the result makes much less sense than just running this algorithm in its original form for the original value of M and getting the result right away, without any truncation.
An alternative is the Floyd algorithm. Here's a possible implementation of it for your case. (There are different ways to keep track of already used numbers. I'll just use an array of flags, assuming that N is not exceedingly large.)
#define M 10
#define N 100

unsigned char is_used[N] = { 0 }; /* flags */
int in, im;
im = 0;
for (in = N - M; in < N && im < M; ++in) {
    int r = rand() % (in + 1);
    if (is_used[r])
        r = in; /* use 'in' instead of generated number */
    assert(!is_used[r]);
    vektor[im++] = r + 1; /* +1 since your range begins from 1 */
    is_used[r] = 1;
}
assert(im == M);
Why the above works is not immediately obvious. But it works. Exactly M numbers from [1..N] range will be picked with uniform distribution.
Note that for large N you can use a search-based structure to store "used" numbers, thus getting a nice O(M log M) algorithm with an O(M) memory requirement.
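As a rough illustration of the search-based idea, the flag array can be replaced by a sorted array of the values chosen so far, searched with binary search. This is a sketch under assumptions: the names floyd_sample and lower_bound are made up, n is assumed not to exceed RAND_MAX, and inserting into a sorted array still shifts elements, so a balanced tree would be needed to actually reach the O(M log M) bound.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Index of the first element in sorted[0..len-1] that is >= key. */
static int lower_bound(const int *sorted, int len, int key)
{
    int lo = 0, hi = len;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (sorted[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Floyd's selection of m distinct values from [1, n].
   out receives the values in selection order.
   sorted is scratch space for m ints holding the used values (0-based). */
void floyd_sample(int *out, int *sorted, int m, int n)
{
    int in, im = 0;

    assert(m >= 0 && m <= n);
    for (in = n - m; in < n; ++in) {
        int r = rand() % (in + 1);
        int pos = lower_bound(sorted, im, r);
        if (pos < im && sorted[pos] == r) {
            r = in;   /* already used: take 'in' instead */
            pos = im; /* 'in' exceeds every value chosen so far */
        }
        /* Open a gap at pos and record r as used. */
        memmove(&sorted[pos + 1], &sorted[pos],
                (size_t)(im - pos) * sizeof sorted[0]);
        sorted[pos] = r;
        out[im++] = r + 1; /* +1 since the range begins from 1 */
    }
}
```

Memory use is O(M) for the scratch array, independent of N.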
(There's one thing about this algorithm though: while the resultant array will not be ordered, a certain "influence" of the original 1..N ordering will still be present in the result. For example, it is obvious that number N, if selected, can only be the very last member of the resultant array. If this "hint" of ordering is not acceptable, the resultant vektor array can be random-shuffled, just like in the Knuth algorithm.)
Note the very critical point observed in the design of these two algorithms: they never loop trying to find a new unused random number. Any algorithm that makes "trial and error" iterations with random numbers is flawed from a practical point of view. Also, the memory consumption of these algorithms is tied to M, not to N.
To the OP I would recommend Floyd's algorithm, since in his application M seems to be considerably less than N and it doesn't (or may not) require an extra pass for permutation. However, for such small values of N the difference might be negligible.
Simply generating random numbers and seeing whether they are OK is a poor way to solve this problem in general. This approach takes all the possible values, shuffles them and then takes the top ten. This is directly analogous to shuffling a deck of cards and dealing off the top.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Parenthesised so the macro expands safely inside larger expressions. */
#define randrange(N) (rand() / (RAND_MAX/(N) + 1))
#define MAX 100 /* Values will be in the range (1 .. MAX) */

static int vektor[10];
int candidates[MAX];

int main (void) {
    int i;
    srand(time(NULL)); /* Seed the random number generator. */
    for (i = 0; i < MAX; i++)
        candidates[i] = i;
    for (i = 0; i < MAX-1; i++) {
        int c = randrange(MAX-i);
        int t = candidates[i];
        candidates[i] = candidates[i+c];
        candidates[i+c] = t;
    }
    for (i = 0; i < 10; i++)
        vektor[i] = candidates[i] + 1;
    for (i = 0; i < 10; i++)
        printf("%i\n", vektor[i]);
    return 0;
}
For more information, see comp.lang.c FAQ list question 13.19 for shuffling and question 13.16 about generating random numbers.
Generate first and second digits separately. Shuffle them later if required. (syntax from memory)
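int vektor[10] = {0}; /* 0 marks a slot that is still empty */
int i = 0;
while (i < 10) {
    int j = rand() % 10;
    if (vektor[j] == 0) {
        /* Slot j gets a value from j*10+1 .. j*10+10, which is never 0. */
        vektor[j] = rand() % 10 + j * 10 + 1;
        i++;
    }
}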
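However, each slot only ever holds a value from its own block of ten, so the result contains exactly one number per block and is not a uniformly random selection.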
Or else, you need to keep the numbers sorted (O(n log n)), so that a newly generated number can be quickly checked for presence (O(log n)).