views:

140

answers:

6

I wrote this code in C++ as part of a uni task where I need to ensure that there are no duplicates within an array:

// Check for duplicate numbers in user inputted data
    int i; // Need to declare i here so that it can be accessed by the 'inner' loop that starts on line 21
    for(i = 0;i < 6; i++) { // Check each other number in the array
        for(int j = i; j < 6; j++) { // Check the rest of the numbers
            if(j != i) { // Makes sure don't check number against itself
                if(userNumbers[i] == userNumbers[j]) {
                    b = true;
                }
            }
            if(b == true) { // If there is a duplicate, change that particular number
                cout << "Please re-enter number " << i + 1 << ". Duplicate numbers are not allowed:" << endl;
                cin >> userNumbers[i];
            }
        } // Comparison loop
        b = false; // Reset the boolean after each number entered has been checked
    } // Main check loop

It works perfectly, but I'd like to know if there is a more elegant or efficient way to check.

+3  A: 

You can add all elements in a set and check when adding if it is already present or not. That would be more elegant and efficient.

Benoit Thiery
How do you do that? Add in a set I mean.
Saladin Akara
@Saladin Akara: have a look at std:set, it's part of STL.
kriss
Thanks kriss +1
Saladin Akara
Just a note: You don't have to check with std::set, you can just call insert and if there's a dupe, it'll magically disappear.
DeadMG
Of course, `std::set` can quickly become inefficient for larger data sets, which is one of the reasons why `std::unordered_set` is being added in C++0x. If you can use it, the latter can become much faster for larger data sets.
Jonathan Grynspan
+5  A: 

You could sort the array in O(nlog(n)), then simply look until the next number. That is substantially faster than your O(n^2) existing algorithm. The code is also a lot cleaner. Your code also doesn't ensure no duplicates were inserted when they were re-entered. You need to prevent duplicates from existing in the first place.

std::sort(userNumbers.begin(), userNumbers.end());
for(int i = 0; i < userNumbers.size() - 1; i++) {
    if (userNumbers[i] == userNumbers[i + 1]) {
        userNumbers.erase(userNumbers.at(i))
        i--;
    }
}

I also second the reccomendation to use a std::set - no duplicates there.

DeadMG
But if sort is O (n * log( n ) ) and then you have to do a O( n ) check of the array to find the duplicates after isn't your complexity then O( n^2 * log( n ) )?
Goz
No, it's O(n*log(n) + n) - you sort THEN search, not sort AND search for every operation of the sort.
DeadMG
This is certainly faster as 6 approaches infinity ;-)
Steve Jessop
@Steve: Whoops, didn't notice that he was hardcoded at 6.
DeadMG
..Which means it's O(n)
paul_71
@DeadMG: I do agree with your answer in general :-) That 6 (and the interactive replacement of dupes) might just be example code, but if not then I'd probably stick with what the questioner has, since it preserves the array in the order it was provided, without moving anything or taking a copy.
Steve Jessop
Correction: it should be "it's O(n log n)", of course. The solution presented above is O(n^2), however...
paul_71
@paul_71: It is O(n log n). You sort once, then you iterate once. That's O(n log n).
DeadMG
@DeadMG: You are right of course. I was multiplying when I should have been adding.
Goz
@DeadMG No, it is not. This were true, if std::vector<T>::erase was O(1) which it is not. It is O(n) in general, which makes your solution O(n^2) in total. Though n is decreasing as you process the array (assume you have each element occuring twice) you have O(1) + O(n-1) for each call of erase. For a O(n log n) solution (which I think is the lower bound) see my answer below.
paul_71
@paul_71: The cost of the erase is not part of the algorithm, as if you'll notice, the OP had a different action on duplicate detection which is not O(n). It was merely an example. It would be trivial to, for example, simply create a new array and O(1) insert into it, then replace the original with the new.
DeadMG
Right, if the pure duplicate detection suffices, sorting and a linear walk through the sequence is enough. Removing duplicates is a different cup of tea - and the way suggested above is O(n^2) and does not preserve the original ordering of the elements...
paul_71
@paul_71: Question only asked for detection.
DeadMG
+2  A: 

It's ok, specially for small array lengths. I'd use more efficient aproaches (less than n^2/2 comparisons) if the array is mugh bigger - see DeadMG's answer.

Some small corrections for your code:

  • Instead of int j = i writeint j = i +1 and you can omit your if(j != i) test
  • You should't need to declare i variable outside the for statement.
leonbloy
I needed to declare `i` outside the first loop because I'd get an `i was not declared in this scope` error when I use it in the 'inner' for loop. Wasn't sure why it did that, but declaring outside the loop fixed the problem
Saladin Akara
@Saladin: It's a bug in your compiler. Declaring i inside the first for loop should make it accessible in the second.
DeadMG
Ah, okay. Thanks for the clarification
Saladin Akara
@Saladin: And I am guessing your compiler is the < a very bad word here > Visual Studio 6.0, isn't it?
Armen Tsirunyan
@Armen Nope - Using the GCC g++ mode.
Saladin Akara
+1  A: 

Indeed, the fastest and as far I can see most elegant method is as advised above:

std::vector<int> tUserNumbers;
// ...
std::set<int> tSet(tUserNumbers.begin(), tUserNumbers.end());
std::vector<int>(tSet.begin(), tSet.end()).swap(tUserNumbers);

It is O(n log n). This however does not make it, if the ordering of the numbers in the input array needs to be kept... In this case I did:

    std::set<int> tTmp;
    std::vector<int>::iterator tNewEnd = 
        std::remove_if(tUserNumbers.begin(), tUserNumbers.end(), 
        [&tTmp] (int pNumber) -> bool {
            return (!tTmp.insert(pNumber).second);
    });
    tUserNumbers.erase(tNewEnd, tUserNumbers.end());

which is still O(n log n) and keeps the original ordering of elements in tUserNumbers.

Cheers,

Paul

paul_71
+1  A: 

The following solution is based on sorting the numbers and then removing the duplicates:

#include <algorithm>

int main()
{
    int userNumbers[6];

    // ...

    int* end = userNumbers + 6;
    std::sort(userNumbers, end);
    bool containsDuplicates = (std::unique(userNumbers, end) != end);
}
FredOverflow
A: 
//std::unique(_copy) requires a sorted container.
std::sort(cont.begin(), cont.end());

//testing if cont has duplicates
std::unique(cont.begin(), cont.end()) != cont.end();

//getting a new container with no duplicates
std::unique_copy(cont.begin(), cont.end(), std::back_inserter(cont2));
coh