tags:

views:

217

answers:

9

Suppose I want to put objects that identify a server into a stl set. Then I would have to make sure that I also implement operator< for these objects otherwise I would run into a compiler error:

struct ServerID
{
  std::string name; // name of the server
  int port;
};

std::set<ServerID> servers; // compiler error, no operator< defined

This is just one example of a common problem where I want to make an object comparable.

My current solution usually goes like this:

bool operator< (const ServerID & lhs, const ServerID & rhs)
{
  if (lhs.name != rhs.name)
  {
    return lhs.name < rhs.name;
  }
  else
  {
    return lhs.port < rhs.port;
  }
}

This is just a solution that I found myself. But I suspect that this problem might also have been recognized in computer science. So if I'm lucky there is a better solution for this. Can anyone hint me towards that?

+2  A: 

At the end of the day, you simply have to come up with a comparison function which meets your immediate needs. This can be difficult - for example, how would you compare two bitmaps of different sizes?

anon
The hacker in me says:if (bmp1.width != bmp2.width) return bmp1.width < bmp2.width;else if (bmp1.height != bmp2.height) return bmp1.height < bmp2.height;else return compareBitmaps(bmp1, bmp2);Where compareBitmaps uses the same comparison pattern as above for each pixel value.Ok, that would potentially be REALLY slow for bitmaps that only differ at the bottom-right pixel :D
StackedCrooked
Unfortunately, unless you want those bitmaps to compare equal (in which case only one of them could be placed in a std::set) , you have to provide a function that can compare every pixel.
anon
+4  A: 

Yours is the canonical solution. I'm not sure how you might like to do it in a way that would be better.

To expand on this, if you have n members in your class you will find that you have to compare some number of these fields in order to establish a strict ordering. There is no real way around this, although you might find that it is possible to make the comparison function perform better (in terms of the average complexity) if you order the comparisons so that the ones that are more likely to contribute to the success of the comparison will come first. This helps it to drop out of the comparison faster.

A possibility that can help in some circumstances (if you find that performance is dominated by comparisons) is to establish a "sort key" - comparing strings can be expensive. A sort key is a integer that can be used to do a fast comparison of the objects. If the sort key compares less than, then the string would do also.

In your case, a simplistic sort key might involve treating the binary representation of the strings as integers - this has many bugs by the way - and then comparing the integers instead of the strings.

In Windows, the LCMapString function can be used to produce a sort key for strings in this way. I think that you can then use a fast function like memcmp to compare the strings instead of a slower string comparison. This is more useful if you would be doing case insensitive comparisons or using the full range of unicode characters and wanted correct comparisons according to their rules.

1800 INFORMATION
If a class has n members, then my operator< implementation would also have n comparisons. I wondered if there might be a quicker algorithm.
StackedCrooked
I added some detail on the concept of sort keys - it is unlikely to be of too much assistance in your case
1800 INFORMATION
Thanks your answer is actually quite informative.
StackedCrooked
+2  A: 

If all you need is a well order, and you don't care one way or the other what that order is, then your solution is fine.

Thom Smith
+3  A: 

I usually write it as:

return x.name < y.name ||
       x.name == y.name && x.port < y.port;

... which you can continue to expand for as many member variables you have. This solution is short-circuited as soon as possible and eliminates branching.

Note that this requires operator< be defined for each of the member variables, which is a good thing to have implemented outside of this routine anyhow.

fbrereto
A: 

I tend to put the operator< inside the struct to keep things neater, but otherwise you have exactly what I put.

gbjbaanb
+2  A: 

In this case if the order doesn't matter you may want to compare the port before the string due to the cost of a string comparison.

Graphics Noob
A: 

Your solution is pretty much the right way to do it. Your comparison function should be set up to uniquely identify each server (what this actually means depends on your use-case), so comparing name/port is probably sufficient.

If you know that you won't have two servers with the same name but different ports (or you would want to treat these the same) then you can drop that part of your comparison function. A more realistic example would be if you had more members in your server object that didn't have to do with the identity of the server itself (e.g. a "last-request" cache); in this case you probably wouldn't want your set to distinguish based on this field, so you wouldn't include it in your comparison function. OTOH, this may not be the best design for the server object anyway.

If you are finding it difficult to answer the question "When should two servers (objects) be considered identical?" then you may not really need a set at all.

Sumudu Fernando
+10  A: 

I would recommend not implementing it as operator<, to avoid possible confusion, but rather pass the order function as a parameter to the std::set template argument.

struct server
{
   std::string name;
   int port;
};
struct name_then_port : public std::binary_function<server,server,bool>
{
   bool operator()( server const & lhs, server const & rhs ) {
      // using litb approach (more efficient as it does not call both < and == on strings:
      int cmp = lhs.name.compare(rhs.name);
      return ( cmp < 0 ) || ((cmp==0) && ( lhs.port < rhs.port));
   }
};
struct port_then_name : public std::binary_function<server,server,bool>
{
   bool operator()( server const & lhs, server const & rhs ) {
      return (lhs.port < rhs.port) || ((lhs.port==rhs.port) && (lhs.name<rhs.name));
   }
};
int main()
{
   std::set< server, name_then_port > servers; // or:
   std::set< server, port_then_name > servers2;
}

About the question of whether this problem has been identified before, it has. The general solution is exactly what you posted: lexicographical order. While the term is usually referred to string ordering, but the ordering is the same: take the first element, compare if it does not define an order take the next data element and iterate.

David Rodríguez - dribeas
I hadn't thought of this option. It's not exactly an answer to my question, but thanks anyway :)
StackedCrooked
It was an answer to the question in the title. I have updated the answer so that it covers the last part of your question.
David Rodríguez - dribeas
giving you a +1 too for being first with the functor approach xD
Johannes Schaub - litb
+3  A: 

I would use string::compare

bool operator< (const ServerID & lhs, const ServerID & rhs) {
  int lcr = lhs.name.compare(rhs.name);
  return lcr < 0 || (lcr == 0 && lhs.port < rhs.port);
}

If it does not make sense to you to have it comparable, and the only use of that would be to stuff it into the set, you can use a functor

struct ServerIdCompare {
  bool operator()(const ServerID & lhs, const ServerID & rhs) const {
    int lcr = lhs.name.compare(rhs.name);
    return lcr < 0 || (lcr == 0 && lhs.port < rhs.port);
  }
};

std::set<ServerID, ServerIdCompare> servers;

If you however provide the operator independent (not using the functor) like above, then also provide <=, ==, >= and != to keep it consistent.

Johannes Schaub - litb
+1 on the use of string::compare to avoid having the (possibly expensive) comparison twice.
David Rodríguez - dribeas
+1 on maintaining coherence by providing all comparison operators if one of them is defined (well, more or a virtual +1, I cannot upvote twice)
David Rodríguez - dribeas
oh, i appreciate it. thanks dude!
Johannes Schaub - litb