tags:

views:

136

answers:

3

I wrote this routine to order items, keep only unique items, where it takes in an array of type T, and the size of the array. It returns the new size of the array after processing.

template <class T>
int reduce(T array[], int size) {
    T *begin = array;
    T *end = array + size;
    sort(begin, end);
    T *end_new = unique(begin, end);
    return end_new - array;
}

My question is I was expecting it to sort const char *data like

{"aa", "bb", "bc", "ca", "bc", "aa", "cc", "cd", "ca", "bb"};
into //aa bb bc ca cc cd

However it does it the opposite way, : "cd cc ca bc bb aa" Why does it do that? Does it not use the standard string comparisons? If I wanted to, how could I alter it so it would order const char * alphabetically? thanks.

+6  A: 

sort() uses operator< per default, which would just compare the addresses in your case.

If you want to sort C-strings, you have to pass a comparator to sort(). To do this generically you can let the user pass a comparator, use specialization on a comparator function or a combination of these:

template<class T> bool my_comp(T a, T b) {
    return a < b;
}

template<> bool my_comp<const char*>(const char* a, const char* b) {
    return std::strcmp(a, b) < 0;
}

template<class T, class Comp> 
int reduce(T array[], size_t size, Comp comp = my_comp<T>) {
    // ...
    std::sort(begin, end, comp);
    // ...
}
Georg Fritzsche
+1  A: 

const char*'s < operator performs pointer comparison, not string data comparison. Either use std::string for your string data, or specialize reduce so that it calls sort with a special comparator based on strcmp. I'm guessing you got the output you did because your compiler decided to reverse-alphabetize all of its string constants in memory.

unique also isn't doing anything -- it only works in the first place because your compiler pooled all of the strings in memory at compile time, so that all of the "bb" strings would use the same memory. If you read the exact same strings from a file, your array wouldn't change. The solutions to this problem are the same.

Ken Bloom
+2  A: 

std::sort uses < by default, < on char*s has nothing to do with their lexicographic ordering. You can provide an additional parameter to sort to tell it how to compare, or you can use an array of std::string or similar instead of char*.

Logan Capaldo