ansaurus

Question

Answer 1

A:

Why don't you try moving the data into an STL set? You only need to implement the comparison function, and you will very quickly end up with a perfectly ordered set of data.

John Paul 2010-07-25 18:22:42

Good idea - but the ordering of my_data_vec is significant and arbitrary. The ordering information is lost by a set.

AshleysBrain 2010-07-25 18:25:49

"Why don't you try moving the data into a STL Set?" Because then they wouldn't be in the order specified by `my_data_ids`.

sbi 2010-07-25 18:27:47

Answer 2

A:

Why don't you just use a map<int, unique_ptr<MyData>> (or multimap)?

rlbond 2010-07-25 18:27:03

Answer 3

+4 A:

Create a map that maps ids to their index in my_data_ids.
Create a function object that compares std::unique_ptr<MyData> based on their ID's index in that map.
Use std::sort to sort the my_data_vec using that function object.

Here's a sketch of this:

// Beware, brain-compiled code ahead!
#include <algorithm>
#include <map>
#include <memory>
#include <vector>

typedef std::vector<int> my_data_ids_type;
typedef std::map<int,my_data_ids_type::size_type> my_data_ids_map_type;

// Orders MyData objects by the position of their ID in my_data_ids.
// (No std::binary_function base: std::sort doesn't need it, and it was removed in C++17.)
class my_id_comparator {
public:
  my_id_comparator(const my_data_ids_map_type& my_data_ids_map)
    : my_data_ids_map_(my_data_ids_map) {}

  bool operator()( const std::unique_ptr<MyData>& lhs
                 , const std::unique_ptr<MyData>& rhs ) const
  {
     // unique_ptr needs ->, not . , to reach the pointee's members
     my_data_ids_map_type::const_iterator it_lhs = my_data_ids_map_.find(lhs->id);
     my_data_ids_map_type::const_iterator it_rhs = my_data_ids_map_.find(rhs->id);
     if( it_lhs == my_data_ids_map_.end() || it_rhs == my_data_ids_map_.end() )
       throw "dammit!"; // whatever
     return it_lhs->second < it_rhs->second;
  }
private:
  const my_data_ids_map_type& my_data_ids_map_; // const, to match the constructor argument
};

//...

my_data_ids_map_type my_data_ids_map;
// populate my_data_ids_map with the IDs and their indexes from my_data_ids
for( my_data_ids_type::size_type i = 0; i < my_data_ids.size(); ++i )
  my_data_ids_map[my_data_ids[i]] = i;
std::sort( my_data_vec.begin(), my_data_vec.end(), my_id_comparator(my_data_ids_map) );

If memory is scarce, but time doesn't matter, you could do away with the map and search the IDs in the my_data_ids vector for each comparison. However, you would have to be really desperate for memory to do that, since two linearly complex operations per comparison are going to be quite expensive.

sbi 2010-07-25 18:32:15
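
The map-free variant mentioned at the end of the answer could look roughly like the sketch below. It is only an illustration: `MyData` is a hypothetical stand-in with a public `id` member, `sort_by_id_order_no_map` is a made-up name, and every ID in `my_data_vec` is assumed to occur in `my_data_ids`.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's MyData; only id matters here.
struct MyData {
    int id;
};

// Sorts my_data_vec into the order given by my_data_ids without building a map.
// Each comparison performs two linear searches, so this trades time for memory.
void sort_by_id_order_no_map(std::vector<std::unique_ptr<MyData>>& my_data_vec,
                             const std::vector<int>& my_data_ids)
{
    std::sort(my_data_vec.begin(), my_data_vec.end(),
              [&my_data_ids](const std::unique_ptr<MyData>& lhs,
                             const std::unique_ptr<MyData>& rhs) -> bool {
                  auto it_lhs = std::find(my_data_ids.begin(), my_data_ids.end(), lhs->id);
                  auto it_rhs = std::find(my_data_ids.begin(), my_data_ids.end(), rhs->id);
                  // Assumed precondition: both IDs are present in my_data_ids.
                  assert(it_lhs != my_data_ids.end() && it_rhs != my_data_ids.end());
                  // Iterators into the same vector compare by position.
                  return it_lhs < it_rhs;
              });
}
```

With n elements this costs on the order of n per comparison, so it is only worth considering when the extra map really cannot be afforded.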

This is a very nice solution, thanks! Didn't think `sort` would be applicable but you proved me wrong :) (btw I'm using C++0x so I used a lambda in the sort - keeps things a bit tidier)

AshleysBrain 2010-07-25 18:45:49

@James: I'm not sure throwing a `std::exception` is right at this place. The way the problem was presented, it seems like there __must__ be an entry in `my_data_ids` for each ID. If so, then that should probably be an assertion instead, since failure would indicate that a pre-condition doesn't hold. Since I know too little about the problem, I did something obviously wrong and accompanied it by that comment, hoping it would be obvious that this needs more consideration. But maybe you're right. _"Do as I say, not as I do"_ usually doesn't work very well. What do you suggest?

sbi 2010-07-25 18:51:08

@sbi: I just meant that as a joke :-) (Then I decided it wasn't funny so I deleted it) As for my opinion, in this scenario, I tend to throw a `logic_error` or something derived therefrom.

James McNellis 2010-07-25 19:15:23

@James: No offense taken, but I am always wary of posting code that's actually wrong or does things one should not do. As I said _"Do as I say, not as I do"_ doesn't work, I know that from teaching.

sbi 2010-07-26 00:46:21

I think that the time complexity of this solution could be higher than you might like. Each comparison is O(log n) rather than O(1), so the total time is O(n*(log n)^2) instead of the usual O(n*log(n)). This might not matter, but if it does then using a std::unordered_map rather than a std::map would be better.

Richard Wolf 2010-07-26 06:50:57
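
A minimal sketch of the `std::unordered_map` suggestion, combined with the lambda mentioned in an earlier comment, might look like this. `MyData` is a hypothetical stand-in with a public `id` member, `sort_by_id_order` is a made-up name, and the missing-ID case is treated as a violated precondition (an assertion) rather than an exception.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the question's MyData; only id matters here.
struct MyData {
    int id;
};

// Sorts my_data_vec so its IDs appear in the same order as in my_data_ids.
// Assumed precondition: every ID in my_data_vec also occurs in my_data_ids.
void sort_by_id_order(std::vector<std::unique_ptr<MyData>>& my_data_vec,
                      const std::vector<int>& my_data_ids)
{
    // Map each ID to its position in my_data_ids (average O(1) lookup).
    std::unordered_map<int, std::size_t> index_of;
    index_of.reserve(my_data_ids.size());
    for (std::size_t i = 0; i < my_data_ids.size(); ++i)
        index_of.emplace(my_data_ids[i], i);  // a repeated ID keeps its first position

    // Looks up the position of an element's ID.
    auto position = [&index_of](const std::unique_ptr<MyData>& p) -> std::size_t {
        auto it = index_of.find(p->id);
        assert(it != index_of.end() && "ID missing from my_data_ids");
        return it->second;
    };

    std::sort(my_data_vec.begin(), my_data_vec.end(),
              [&position](const std::unique_ptr<MyData>& lhs,
                          const std::unique_ptr<MyData>& rhs) -> bool {
                  return position(lhs) < position(rhs);
              });
}
```

With average constant-time lookups the sort is back to the usual O(n*log(n)) comparisons, at the cost of the extra hash table.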

@Richard: That's a good point. I'd be very interested in a better solution should someone find one.

sbi 2010-07-26 08:02:26

How about 1) Create a map of id to pointer; 2) clear the original list; 3) for each id in id list, push back the pointer by looking up its id in the map. Isn't that O(n*log(n))?

AshleysBrain 2010-07-26 21:26:41
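
The three steps in the comment above can be sketched as follows. This is an illustration under assumptions, not code from the thread: `MyData` is a hypothetical stand-in with a public `id` member, `rebuild_in_id_order` is a made-up name, IDs are assumed unique in both containers, and every ID in `my_data_ids` is assumed to have a matching element.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's MyData; only id matters here.
struct MyData {
    int id;
};

// Rebuilds my_data_vec in the order given by my_data_ids.
void rebuild_in_id_order(std::vector<std::unique_ptr<MyData>>& my_data_vec,
                         const std::vector<int>& my_data_ids)
{
    // 1) Move every element into a map keyed by its ID (assumes unique IDs).
    std::map<int, std::unique_ptr<MyData>> by_id;
    for (auto& p : my_data_vec) {
        const int id = p->id;  // read the ID before moving from p
        by_id.emplace(id, std::move(p));
    }

    // 2) Clear the original vector; it now holds only moved-from pointers.
    my_data_vec.clear();

    // 3) Walk the ID list and move each matching element back in order.
    for (int id : my_data_ids) {
        auto it = by_id.find(id);
        assert(it != by_id.end() && it->second && "no element for this ID");
        my_data_vec.push_back(std::move(it->second));
    }
    // Elements whose ID is not listed in my_data_ids are destroyed with by_id.
}
```

Both the n insertions and the n lookups cost O(log n) each with `std::map`, so the total is O(n*log(n)).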

@sbi: My suggestion was to use std::unordered_map which gives constant time access to elements in the best case. Just make sure that the hash function in use is appropriate.

Richard Wolf 2010-07-26 22:40:13

@Richard: I understood that you suggested using a hash map. `:)` I was thinking of reducing memory usage and the first n in O(n*(log n)^2). Can't we get away without creating that map? For example, assuming we could change `my_data_ids` into a heap, we would at least reduce the memory footprint. And better data locality in a `std::vector` (compared with a `std::map`) might in practice improve the algorithm despite all the theoretical O-ness. I was fishing for improvements like this.

sbi 2010-07-27 07:14:10


Order a container by member with STL
