tags:

views:

160

answers:

5

Say I have a high score table structured like

name score
name score
....

I need to do some file operations and manipulate certain areas of the file, and I thought the best way to do this would be to store it in a container that preserved the order of the file, do the data manipulation with the container, then output back to the file.

I considered using a map< std::string, int >, but a map wouldn't preserve the order of the file. Would a vector< pair< std::string, int >> be better, or is there some kind of ordered map I can use? I also need the container to repeat a name if necessary. I think a multimap keeps one key, but allows multiple values for that key, which isn't what I want since it wouldn't preserve order.

A: 

I also need the container to repeat a name if necessary.

So stick the same name into your vector-of-pairs twice. Is that really such a big deal?

Anon.
No. I'm just wondering if it's the best option....
Anonymous
What you're doing is getting into the realm of *premature optimization*. It doesn't matter whether it's the abstractly "best" option, what matters is that it *gets the job done* and that it's *good enough*.
Anon.
Asking for the right container to use is the opposite of premature optimization.
GMan
I think wondering what's the best container for the job is a valid question.
Anonymous
In this case, it's clear that some sort of ordered collection is necessary - when you start worrying about details like "what if I have multiples of the same name", then you're into premature optimization.
Anon.
I'm worried about multiples because I need data that says the same thing to be stored separately. Not because of optimization.
Anonymous
Anon., premature optimization is "Should I use `i++` or `++i` or `i+=1`?" All he's asking for is the cleanest way to implement what he wants. I think you're incorrect when you say "It doesn't matter whether it's the abstractly "best" option, what matters is that it gets the job done and that it's good enough." When we're talking about **abstract design**, how is what is best in **abstract design** not relevant? "May not be best, but it works." is a terrible mindset.
GMan
There is no "best" in abstract design.
Anon.
I probably agree, but this is all irrelevant. Asking about an abstract design choice is nowhere near premature optimization.
GMan
+5  A: 

Use the

std::vector< std::pair< std::string, int > >

solution.

Or even better, do a HighScoreEntry class, and have a std::vector< HighScoreEntry > -- then you'll be able to add additional data to it later, and embed score handling code in the class.

To add an entry, push the entry to the end of the vector, and run std::sort on it, just remember to write a comparison operator for HighScoreEntry.

class HighScoreEntry
{
public:
   std::string name;
   uint32 score;

   bool operator<(const HighScoreEntry& other) const
   {
       // code that determines ordering goes here
       return score < other.score
   }
};

Highscore table:

std::vector< HighScoreEntry > highscores;

Sorting:

std::sort( highscores.begin(), highscores.end() );

To preserve sort order for highscores that have the same score, use a timestamp, and add it to the comparator.

Kornel Kisielewicz
I actually have something similar to that, thanks for the advice.
Anonymous
I think better would be a (multi)map. It sorts itself.
GMan
I need it to stay in the same order as the file though. I.e. it is read in in the same order as it is read out.
Anonymous
I'm not sure I understand. If you're saving this to a file, then just iterate through whichever container you end up using and save it's value. When you read, take each pair and reinsert it into container.
GMan
Right, but if a multimap sorts itself. It will sort the info in a different order.
Anonymous
If you want to take optimisation to the nth degree, use `std::partial_sort` instead of `std::sort`, which requires only the first N entries (where N is the number of entries you want to save) to be sorted. :-P
Chris Jester-Young
@Chris - multimap has some disdvantages in this regard, the major one is that indexing would be done via iterator, which for some people is still troublesome. The gain of having it sorted faster (what multimap would ultimately achieve) is little (considering the performance impact of a highscore table), while the need of handling the interface of the multimap is a readability concern.
Kornel Kisielewicz
If you actually were optimising, then use `lower_bound` or `upper_bound` and `insert`, not `push_back` and `sort` or `partial_sort`. And of course trim after adding to keep the sizes down. After all, we might some day want to store and display the top 10 million scores ;-)
Steve Jessop
@Kornel: Oh, I agree, which is why I said in my comment that I like your vector/(partial) sort solution too. :-)
Chris Jester-Young
@Steve, we might read them over the internet too! ;-)
Kornel Kisielewicz
@Steve: I was quite amused by your comment, but isn't `insert` O(n) for vector (where n is the distance of the insertion point from the end of the vector), if it's not at the end? That would argue for a insert-first, sort-afterwards approach.
Chris Jester-Young
Yes, but depending on algorithm `sort` is likely O(n log n) for the case of one out-of-order element at the end, whereas my suggestion (i.e. insertion sort) is O(log n + n) = O(n). If you're adding a whole bunch of items at once you'd stick them on the end and sort (n log n for the lot, instead of n * m inserting them one at a time). I'm assuming that we only ever sort because we're adding one new high score - the storage file will naturally be sorted anyway, so doesn't need to be sorted on load. If it was sorted on load, though, that should of course be with `sort`.
Steve Jessop
@Steve: Yes, fair call, it does totally depend on the usage scenario. +1
Chris Jester-Young
+2  A: 
typedef std::pair<int, int> score_entry; // score, timestamp/serial
typedef std::map<score_entry, std::string, std::greater<score_entry> > score_map;

It's ordered by the score and timestamp/serial (in descending order), and allows duplicates of the same high score. (If you want earlier timestamp/serial to be listed first, put in negative numbers.)

Using a serial number instead of a timestamp means that you can allow duplicates without having to use multimap. Thanks to Steve Jessop for the suggestion!

Chris Jester-Young
Now that's a short and readable container definition :>
Kornel Kisielewicz
Well done sir, that's even sillier and more confusing than the thing I dismiss as silly in my answer. +1.
Steve Jessop
Thanks! I edited it to use `typedef`s, so it's even more readable. :-)
Chris Jester-Young
Come to think of it, if you can ensure your timestamp is strictly increasing, (sufficient would be if it's not possible to play the game in less than a tick, or if the timestamp is really just an incrementing counter rather than a literal time), then you don't need a multimap.
Steve Jessop
@Steve: Yes, you're right. Okay, I'll edit my entry appropriately. (I also like Kornel's idea of using a vector and a (partial) sort afterwards.)
Chris Jester-Young
A: 

If you want an ordered map, then you want it ordered by score, so it should be a multimap<int, string>, to preserve order and permit tied scores.

This strikes me as silly, though, not to mention it's not obvious what on earth is being "mapped". Container performance on a high-score table is very unlikely ever to matter, so I'd just use the vector of pair<string,int>.

Steve Jessop
A: 

Why use a priority_queue? Should make adding new scores easy as well.

ergosys
Sure you can. I personally would use boost::array to map iterators of a spread linked list. However, sometimes it's just good to KISS (keep it simple...).
Kornel Kisielewicz
How do you intend to iterate over it, BTW?
Kornel Kisielewicz
@Kornel, good point about the iteration. I forgot about that limitation.
ergosys