views:

238

answers:

3

In the following example a std::map structure is filled with 26 values from A - Z (for key) and 0 - 26 for value. The time taken (on my system) to lookup the last entry (10000000 times) is roughly 250 ms for the vector, and 125 ms for the map. (I compiled using release mode, with O3 option turned on for g++ 4.4)

But if for some odd reason I wanted better performance than the std::map, what data structures and functions would I need to consider using?

I apologize if the answer seems obvious to you, but I haven't had much experience in the performance critical aspects of C++ programming.

#include <ctime>
#include <map>
#include <vector>
#include <iostream>

struct mystruct
{
    char key;
    int value;

    mystruct(char k = 0, int v = 0) : key(k), value(v) { }
};

int find(const std::vector<mystruct>& ref, char key)
{
    for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i)
        if (i->key == key) return i->value;

    return -1;
}

int main()
{
    std::map<char, int> mymap;
    std::vector<mystruct> myvec;

    for (int i = 'a'; i < 'a' + 26; ++i)
    {
        mymap[i] = i - 'a';
        myvec.push_back(mystruct(i, i - 'a'));
    }

    int pre = clock();

    for (int i = 0; i < 10000000; ++i)
    {
        find(myvec, 'z');
    }

    std::cout << "linear scan: milli " << clock() - pre << "\n";

    pre = clock();

    for (int i = 0; i < 10000000; ++i)
    {
        mymap['z'];
    }

    std::cout << "map scan: milli " << clock() - pre << "\n";

    return 0;
}
+7  A: 
KennyTM
Good answer, I reiterate. Unordered map, and then optimizing calls (removing object oriented code helps too... but I may get flamed for saying that :-)
Etamar L.
@Etamar: I don't think removing OO code will help, unless you use `virtual` a lot.
KennyTM
@Etamar: Do you mean get rid of classes? How would that speed anything up?
GMan
+2  A: 

For starters, you should probably use std::map::find if you want to compare the search times; operator[] has additional functionality over and above the regular find.

Also, your data set is pretty small, which means that the whole vector will easily fit into the processor cache; a lot of modern processors are optimised for this sort of brute-force search so you'd end up getting fairly good performance. The map, while theoretically having better performance (O(log n) rather than O(n)) can't really exploit its advantage of the smaller number of comparisons because there aren't that many keys to compare against and the overhead of its data layout works against it.

TBH for data structures this small, the additional performance gain from not using a vector is often negligible. The "smarter" data structures like std::map come into play when you're dealing with larger amounts of data and a well distributed set of data that you are searching for.

Timo Geusch
+2  A: 

If you really just have values for all entries from A to Z, why don't you use letter (properly adjusted) as the index into a vector?:

std::vector<int> direct_map;
direct_map.resize(26);

for (int i = 'a'; i < 'a' + 26; ++i) 
{
    direct_map[i - 'a']= i - 'a';
}

// ...

int find(const std::vector<int> &direct_map, char key)
{
    int index= key - 'a';
    if (index>=0 && index<direct_map.size())
        return direct_map[index];

    return -1;
}
MSN
+1 Because when you have a hammer...
Justicle