ansaurus

Question

Is there support in C++/STL for sorting objects by attribute?

Answer 1

+3 A:

If there's only one thing you're ever going to want to sort people by (or if there's a reasonable default that you're going to want to use most of the time), override operator< for the People class to sort by this attribute. Without an explicit comparator, STL sorting functions (and anything that makes implicit use of ordering, like sets and maps) will use operator<.

When you want to sort by something other than operator<, the way you describe is the only way to do it as of the current version of C++ (although the comparator can just be a regular function; it doesn't have to be a functor). The C++0x standard will make this less verbose by allowing lambda functions.

If you're not willing to wait for C++0x, an alternative is to use boost::lambda.

Tyler McHenry 2010-02-04 20:02:07

Sorting a vector vector of People *pointers* will never use People::operator<()

anon 2010-02-04 20:09:29

If you want to get technical, comparisons in the standard library don't use operator< directly -- they use `std::less<T>`, which uses `operator<` by default. You can, however, specialize `std::less<T>` for a type, and use whatever you want in that implementation.

Jerry Coffin 2010-02-04 20:15:44

Unfortunately it has to be a standalone function, not a member of the class.

Mark Ransom 2010-02-04 20:27:54

"C++0x standard will make this less verbose" - less verbose, possibly, far more horrible syntactically, definitely.

anon 2010-02-04 20:40:27

:) I don´t think it will be _that horrible_: `[]( Person* l, Person* r ) => bool { return l->age() < r->age(); }`. Well... never mind...

David Rodríguez - dribeas 2010-02-04 22:56:38

@David Rodríguez - dribeas: I'm with you, that doesn't look horrible at all. If only the compiler could infer the types of l and r, and the return type.

Manuel 2010-02-05 09:39:46

It can't: it needs the argument types _before_ passing it to std::sort(), yet it only knows the actual arguments when std::sort() passes them.

MSalters 2010-02-05 12:00:10

Answer 2

A:

You can have just a global function, or a static funtion. Each of these global or static functions compare against an attribute. No need to make a class. One way to hold papeters for comparisions is to use boost bind, but bind is only useful for finding all classes or comparing all classes against some bound parameter. Storing data across multiple elements is the only reason to make a functor.

Edit: also see boost lambda functions, but they are only practical for simple functions.

Chris H 2010-02-04 20:05:12

Answer 3

+6 A:

You don't need to create a class - just write a function:

#include <vector>
#include <algorithm>
using namespace std;

struct Person {
    int age;
    int getage() const {
        return age;
    }
};

bool cmp( const Person * a, const Person * b ) {
    return a->getage() < b->getage() ;
}

int main() {
    vector <Person*> v;
    sort( v.begin(), v.end(), cmp );
}

anon 2010-02-04 20:15:08

Just note that in many (most?) cases, the sort using the function will end up running slower.

Jerry Coffin 2010-02-04 20:20:09

Running slower than what ? Not sorting ?

leeeroy 2010-02-04 20:25:28

@leeeroy:Running slower than using a functor.

Jerry Coffin 2010-02-04 20:28:37

@Jeffry I did many tests using VC2008, and it inlines functions just like functors. Of course, this is not a rule, and I agree that inlining a function is a harder job for the compiler.

AraK 2010-02-04 20:28:52

@AraK: see my answer -- the code there demonstrates what I'm talking about. Don't get me wrong: there may be compiler flags that allow the two to match, but if so, I'm not at all sure most people know what flags to use, or do so most of the time.

Jerry Coffin 2010-02-04 21:02:30

Answer 4

+8 A:

Generic adaptor to compare based on member attributes. While it is quite more verbose the first time it is reusable.

// Generic member less than
template <typename T, typename M, typename C>
struct member_lt_type 
{
   typedef M T::* member_ptr;
   member_lt_type( member_ptr p, C c ) : ptr(p), cmp(c) {}
   bool operator()( T const & lhs, T const & rhs ) const 
   {
      return cmp( lhs.*ptr, rhs.*ptr );
   }
   member_ptr ptr;
   C cmp;
};

// dereference adaptor
template <typename T, typename C>
struct dereferrer
{
   dereferrer( C cmp ) : cmp(cmp) {}
   bool operator()( T * lhs, T * rhs ) const {
      return cmp( *lhs, *rhs );
   }
   C cmp;
};

// syntactic sugar
template <typename T, typename M>
member_lt_type<T,M, std::less<M> > member_lt( M T::*ptr ) {
   return member_lt_type<T,M, std::less<M> >(ptr, std::less<M>() );
}

template <typename T, typename M, typename C>
member_lt_type<T,M,C> member_lt( M T::*ptr, C cmp ) {
   return member_lt_type<T,M,C>( ptr, cmp );
}

template <typename T, typename C>
dereferrer<T,C> deref( C cmp ) {
   return dereferrer<T,C>( cmp );
}

// usage:    
struct test { int x; }
int main() {
   std::vector<test> v;
   std::sort( v.begin(), v.end(), member_lt( &test::x ) );
   std::sort( v.begin(), v.end(), member_lt( &test::x, std::greater<int>() ) );

   std::vector<test*> vp;
   std::sort( v.begin(), v.end(), deref<test>( member_lt( &test::x ) ) );
}

David Rodríguez - dribeas 2010-02-04 20:24:04

Makes my head hurt to read it, but I love the idea.

Mark Ransom 2010-02-04 21:36:12

Very cool. I was able to extend this to use getter methods.

James Dean 2010-02-05 15:14:27

Answer 5

+1 A:

I see that dribeas already posted that idea, but since I already wrote it, here's how you'd write a generic comparator to use getter functions.

#include <functional>

template <class Object, class ResultType>
class CompareAttributeT: public std::binary_function<const Object*, const Object*, bool>
{
    typedef ResultType (Object::*Getter)() const;
    Getter getter;
public:
    CompareAttributeT(Getter method): getter(method) {}
    bool operator()(const Object* lhv, const Object* rhv) const
    {
        return (lhv->*getter)() < (rhv->*getter)();
    }
};

template <class Object, class ResultType>
CompareAttributeT<Object, ResultType> CompareAttribute(ResultType (Object::*getter)() const)
{
    return CompareAttributeT<Object, ResultType>(getter);
}

Usage:

std::sort(people.begin(), people.end(), CompareAttribute(&Person::getAge));

I think it might be a good idea to overload operator() for non-pointers, but then one couldn't typedef the argument_types by inheriting from binary_function - which is probably not a great loss, since it would hard to use it where those are needed anyway, for example, one just couldn't negate the comparison functor anyway.

UncleBens 2010-02-04 20:34:53

Since the standard already has "mem_fn" and "mem_fun", it could be a good idea to call this "mem_fn_cmp" instead of "CompareAttribute".

Manuel 2010-02-05 09:43:34

Whatever you like (while it is a nice thing, I generally just end up using boost.bind or the like myself).

UncleBens 2010-02-05 15:37:26

Answer 6

+1 A:

This isn't really so much an answer in itself, as a reply to AraK's reply to my comment that sorting with a function instead of a functor can be slower. Here's some (admittedly ugly -- far too much CnP) test code that compares various sorting: qsort, std::sort of vector vs. array, and std::sort using a template class, template function, or plain function for comparison:

#include <vector>
#include <algorithm>
#include <stdlib.h>
#include <time.h>

int compare(void const *a, void const *b) {
    if (*(int *)a > *(int *)b)
        return -1;
    if (*(int *)a == *(int *)b)
        return 0;
    return 1;
}

const int size = 200000;

typedef unsigned long ul;

void report(char const *title, clock_t ticks) { 
    printf("%s took %f seconds\n", title, ticks/(double)CLOCKS_PER_SEC);
}

void wait() { 
    while (clock() == clock())
        ;
}

template <class T>
struct cmp1 { 
    bool operator()(T const &a, T const &b) { 
        return a < b;
    }
};

template <class T>
bool cmp2(T const &a, T const &b) { 
    return a<b;
}

bool cmp3(int a, int b) { 
    return a<b;
}

int main(void) {
    static int array1[size];
    static int array2[size];

    srand(time(NULL));

    for (int i=0; i<size; i++) 
        array1[i] = rand();

    const int iterations = 100;

    clock_t total = 0;

    for (int i=0; i<iterations; i++) { 
        memcpy(array2, array1, sizeof(array1));
        wait();
        clock_t start = clock();
        qsort(array2, size, sizeof(array2[0]), compare);
        total += clock()-start;
    }
    report("qsort", total);

    total = 0;
    for (int i=0; i<iterations; i++) {
        memcpy(array2, array1, sizeof(array1));
        wait();
        clock_t start = clock();
        std::sort(array2, array2+size);
        total += clock()- start;
    }
    report("std::sort (array)", total);

    total = 0;
    for (int i=0; i<iterations; i++) {
        memcpy(array2, array1, sizeof(array1));
        wait();
        clock_t start = clock();
        std::sort(array2, array2+size, cmp1<int>());
        total += clock()- start;
    }
    report("std::sort (template class comparator)", total);

    total = 0;
    for (int i=0; i<iterations; i++) {
        memcpy(array2, array1, sizeof(array1));
        wait();
        clock_t start = clock();
        std::sort(array2, array2+size, cmp2<int>);
        total += clock()- start;
    }
    report("std::sort (template func comparator)", total);

    total = 0;
    for (int i=0; i<iterations; i++) {
        memcpy(array2, array1, sizeof(array1));
        wait();
        clock_t start = clock();
        std::sort(array2, array2+size, cmp3);
        total += clock()- start;
    }
    report("std::sort (func comparator)", total);

    total = 0;
    for (int i=0; i<iterations; i++) {
        std::vector<int> array3(array1, array1+size);
        wait();
        clock_t start = clock();
        std::sort(array3.begin(), array3.end());
        total += clock()-start;
    }
    report("std::sort (vector)", total);

    return 0;
}

Compiling this with VC++ 9/VS 2008 using cl /O2b2 /GL sortbench3.cpp, I get:

qsort took 3.393000 seconds
std::sort (array) took 1.724000 seconds
std::sort (template class comparator) took 1.725000 seconds
std::sort (template func comparator) took 2.725000 seconds
std::sort (func comparator) took 2.505000 seconds
std::sort (vector) took 1.721000 seconds

I believe these fall fairly cleanly into three groups: using sort with the default comparison, and using the template class produced the fastest code. Using either the function or template function is clearly slower. Using qsort is (surprisingly to some) the slowest of all, by around a 2:1 margin.

The difference between cmp2 and cmp3 appears to stem entirely from passing by reference vs. value -- if you change cmp2 to take its arguments by value, its speed matches cmp3 exactly (at least in my testing). The difference is that if you know the type is going to be int, you'll almost certainly pass by value, whereas for generic T, you'll usually pass by const reference (just in case it's something that's more expensive to copy).

Jerry Coffin 2010-02-04 20:58:59

@Jerry I agree with you, that using a functor is the right thing to do. I prefer this solution of course, but as I said, my (poor) tests are not a rule. +1 for showing your test results. BTW, sorry for misspelling your first name in my first comment :)

AraK 2010-02-04 21:25:28

@AraK:I'm not so much pushing the idea that the functor is the "right" thing to do, as that there can be a tradeoff, so the "verbosity" of a functor *may* be worthwhile. Quite all right about the spelling -- it seems to happen fairly regularly (though more often with my last name -- when I was a kid, I delivered newspapers, and people found paying with a check found truly creative ways to spell it...)

Jerry Coffin 2010-02-04 21:40:27

There is one more difference between cmp3 and previous comparators. cmp3 takes arguments by value. Could You please try a function cmp4 that takes ints by a reference?

Maciej Hehl 2010-02-05 01:16:56

@Maciej H:Sure. The normal function taking its arguments by reference slows down somewhat, so it's slightly slower than the function template taking its arguments by reference (not by a lot, but the difference appears quite dependable -- I get ~2.8 seconds, vs. ~2.7 for the function template, with both taking arguments by reference).

Jerry Coffin 2010-02-05 04:23:11

Thank You. I must admit, that at first I didn't realise, that the function template was actually the test I was asking for. Now I'm surprised by the difference :). Oh well, I got my lesson.

Maciej Hehl 2010-02-06 00:50:42

Answer 7

A:

I just tried this based on UncleBens and david-rodriguez-dribeas ideas.

This seems to work (as is) with my current compiler. g++ 3.2.3. Please let me know if it works on other compilers.

#include <vector>
#include <algorithm>
#include <iostream>

using namespace std;

class Person
{
public:
    Person( int _age )
        :age(_age)
    {
    }
    int getAge() const { return age; }
private:
    int age;
};

template <typename T, typename ResType>
class get_lt_type
{
    ResType (T::*getter) () const;
public:
    get_lt_type(ResType (T::*method) () const ):getter(method) {}
    bool operator() ( const T* pT1, const T* pT2 ) const
    {
        return (pT1->*getter)() < (pT2->*getter)();
    }
};

template <typename T, typename ResType>
get_lt_type<T,ResType> get_lt( ResType (T::*getter) () const ) {
    return get_lt_type<T, ResType>( getter );
}

int main() {
    vector<Person*> people;
    people.push_back( new Person( 54 ) );
    people.push_back( new Person( 4 ) );
    people.push_back( new Person( 14 ) );

    sort( people.begin(), people.end(), get_lt( &Person::getAge) );

    for ( size_t i = 0; i < people.size(); ++i )
    {
        cout << people[i]->getAge() << endl;
    }
    // yes leaking Persons
    return 0;
}

James Dean 2010-02-04 22:19:39

ansaurus

tags:

views:

answers:

Is there support in C++/STL for sorting objects by attribute?

related questions