tags:

views:

181

answers:

8

Can a STL map be used for keys of varying sizes?

I don't have code for this. I'm still trying to figure out if this can be done and hence my question. (I'm the type that can spend too much time on an impossible problem. I'm hoping to learn from your wisdom).

I am working on a look up table that essentially has two keys. A numeric type key and a type specific secondary key.

For example the first level key is an enumeration:

enum key_type {
  E_ERROR = 0,
  E_INT   = 1,
  E_CHAR  = 2,
  E_STR   = 3,
}; // Yes I know you don't HAVE to specify the values for the enumeration

And then the secondary keys depend upon the key_type. The secondary key for an E_INT is an integer, for an E_CHAR is a character, etc.

Key: E_INT
2ndary Key Examples: 1, 2, 3, 4

Key: E_CHAR
2ndary Key Examples: 'a', 'b', 'c', 'd'

Key: E_STR
2ndary Key Examples: "abc", "xyz", "pdq", "jrr"

My first reaction was to make this an array of map pointers. The first level key is used as the index into the array. Where the array index points at a map that supports the secondary key type.

+--------+
| E_INT  |------------------------------>+------------------+
+--------+                               | MAP with INT key |
| E_CHAR |---------------\               +------------------+
+--------+                \
| E_STR  |------\          \---->+-------------------+
+--------+       \               | MAP with CHAR key |
                  \              +-------------------+
                   \
                    \------>+------------------+
                            | MAP with STR key |
                            +------------------+

I know I can get the above to work, but I was thinking I could combine the two keys and have a single map, with a custom sort() algorithm to deal with the combined keys.

Am I completely nuts for thinking of this? If its not nuts, do you have any suggestions on how to proceed with this?

Off the top of my head I need to make an inherited class for the key where the base class provides a pure virtual function for the sort method, and then have inherited key classes for the E_INT, E_CHAR and E_STR, that implement the sort() method for their usage. Then I would use the base key class as the key for the map.

Comments?


EDIT 8/13/2010

I've been trying some solutions posed, as well as my original thoughts. I keep hitting problems. I did stumble across another stackoverflow article that mentioned type erasure which might do the trick for my differing keys.


EDIT 8/16/2010

Added an answer in the answer section below that shows the coded solution I implemented.

+2  A: 

I think your approach is correct in terms of what you would do with the custom keys.

That said, if you can spare the overhead of having N maps vs. one map with a custom key, I would say do that since it's trivial and is quick. You can even lazy load the maps and just hide the implementation behind another class.

EDIT: your custom comparator should be simple as well. You could strictly order by enum first, and then for keys with the same enum value (CHAR, INT, STR, etc.) you should then just compare by the value. This would guarantee the ordering required by the std::map.

SB
I want to make sure I understand your answer correctly. I believe you are saying if I can afford the multiple maps, then just do that, since its quick and easy. So it sounds like this is a case where I got too excited about what C++ can do and I tried to do too much ;)
John Rocha
yes that's correct.
SB
+4  A: 

std::map requires strict weak ordering for the keys. If you can enforce single order on your different key types with custom comparator then it shouldn't be a problem.

Nikolai N Fetissov
A: 

will you ever have to sort all of the lists together? If so, then your approach will be a pain when you try to access elements.

One stupid way to do it is to simply create a union type and a comparison function on the union:

typedef struct IntRecord {
    key_type key;
    int record;
};

typedef struct CharRecord {
    key_type key;
    char record;
};

typedef struct StrRecord {
    key_type key;
    const char * record;
};

typedef struct Base {
    key_type key;
};
typedef union Record {
    Base b;
    IntRecord i;
    CharRecord c;
    StrRecord s;
};

now your comparison function can look at each record's b.key to see what type should be used.

foobar
Prefer `boost::variant` over `union` whenever possible for the typesafety it gives.
Matthieu M.
+1  A: 

You will need to wrap your two keys into a single object. Your new class will also need to be comparable. eg:

struct myKey
{
  int key1;
  std::string key2;

  bool operator<(const myKey&) const;
}; 

There is nothing stopping key1 and key2 being (smart) pointers. Once an object (like myKey) is comparable via the < operator, it can be used in a map.

doron
It need not be comparable via `operator<`. You can specialize `std::less` for your type, or provide a custom compare function as well.
Dennis Zickefoose
A: 

You could implement a class for the key that implements a < operator, something like this (untested):

struct UnionMapKey {
    int key_type;
    union {
        Error *err; // maybe pointer because complex types can't be in C unions
        int i;
        char c;
        string *s; // pointer because complex types can't be in C unions
    };
    UnionMapKey(const string &stringContent) { key_type = E_STR; s = new string(stringContent); }
    // other constructor overrides
    bool operator<(const UnionMapKey &rhs) {
        if (key_type != rhs.key_type) return key_type < rhs.key_type;
        if (key_type == E_ERROR) return err < rhs.err;
        // etc.
    }
    ~UnionMapKey() {
        if (key_type == E_STR) delete s;
    }
};

You probably also need a copy constructor, but I think you get the idea. Once you iron out the wrinkles, this will work fine as a map key. Honestly though, your array-of-maps solution is probably much simpler if it works for you.

Walter Mundt
Prefer `boost::variant` over `union` whenever possible for the typesafety it gives. You also have memory leaks because of the absence of a correct assignment operator...
Matthieu M.
Actually, the above will probably crash because it will do a byte-by-byte copy and then double-delete the s pointer. I did note that the implementation was incomplete, though.
Walter Mundt
+2  A: 

While there's been many solutions presented... none of them is as elegant as:

typedef boost::variant<int,char,std::string> key_type;

typedef std::map<key_type, value_type> map_type;

Yep, that's all. A boost::variant is naturally ordered first by type and second (when types are equal) by the values they carry (if they can be compared).

I challenge anyone to find a simpler solution ;)

Matthieu M.
Simpler? Introducing `boost::variant_map`!
GMan
I looked into this but sadly this has a limit as to the number of entities that can be in the variant. On my system that limit is 20. And while my prototype above is very simple and small for understanding the problem, we will have more than 20 level 1 keys.
John Rocha
Oh my... more than 20 seems a lot. Have you tried redefining `BOOST_VARIANT_LIMIT_TYPES` ? If you redefine this macro you'll benefit from the possibility of using more than 20 elements. Even though for so many elements I would be tempted to go down the abstraction road because I wouldn't want to interleave so many dependencies.
Matthieu M.
Yes. More than 20, although I'm working with the other team to try and reduce this count via consolidation. I determined that I had a limit of 20 based on the value of BOOST_VARIANT_LIMIT_TYPES. But your reaction makes me think that I can change that and then recompile the library? I'll keep that in mind, but so far my abstraction method seems to do the trick. I'm still dreaming up scenarios to throw at it and break it.
John Rocha
You don't need to recompile, it's header only, you can just put a `#define BOOST_VARIANT_LIMIT_TYPES 40` before the inclusions. The limit is at 20 because preprocessor macros are used to generate all the variants thus the higher the limit is the longer its takes to preprocess (and then compile). Of course C++0x (because of variadic templates) will lift this issue and there won't be a limit any longer... but for 20 or more types I am afraid your dependency scheme would suffer too much if you were to concentrate them here, I thought you only had a handful ;)
Matthieu M.
+1  A: 

If you use boost::Any in combination with a compare operator like in http://www.sgi.com/tech/stl/Map.html. You should be able to use multiple types as key, as long as your operator can order them. If you use boost::Any they will occupy more space than the key itself, but the rest of the solutions shown here will also impose a bit of overhead.

Jesper Madsen
I read about boost::Any during my type erasure readings. Although I was able to get a working prototype without it. Thanks for the suggestion.
John Rocha
A: 

The Solution I Implemented


The consensus was that it could be done. I sort of adapted on the type erasure concept, along with suggestions from others above. I have something that works now. The map key has to be an object that has a pointer to a polymorphic key object.

I tried having just the base object as the key type but when the map creates its copy of the key, it looked like it was just copying the base class.

So I naively switched to a pointer (key_base_c *). However, this then just did pointer comparison. My sorting wasn't even used.

After reading the type erasure information. I adapted my pointer solution by placing it inside of a multi_key_c object which forwarded its <, == and strIdx() calls to the key_base_c pointer I hid inside of it.

After writing a couple of derived classes, I quickly saw that this lent itself to being a template and my solution quickly fell into place.

I suspect there may be better ways to implement this, but here is what I have so far:

#include <map>
#include <sstream>
#include <iostream>
#include <utility>



//
// list of types to act as the primary key. The primary key dicatates the
// format of the secondary key.
//
enum e_types {
    E_ERROR  = 0,
    E_INT    = 1,
    E_CHAR   = 2,
    E_STR    = 3,
};





// Base class for the multi-key.
class key_base_c {

public:

    key_base_c (enum e_types    key_type) :
        key1(key_type)
        {};


    virtual ~key_base_c(void) {};


    virtual std::string strIdx (void) const {
        std::stringstream    ss_idx;

        ss_idx << key1;
        return (ss_idx.str());
    }


    virtual bool operator< (const key_base_c    &b) const {
        return (key_base_c::operator<(&b));
    }


    virtual bool operator< (const key_base_c    *p) const {
        return (key1 < p->key1);
    }


    virtual bool operator== (const key_base_c    &b) const {
        return (key_base_c::operator==(&b));
    }


    virtual bool operator== (const key_base_c    *p) const {
        return (key1 == p->key1);
    }


protected:

    e_types    key1;   // the primary key

};





// template    policy_key_c
//
// EVENT_TYPE_VAL    -  select the enumerated value to use for key1's value
//
// KEY2_TYPE         -  select the class to use for the second key. For built
//                      in types they use their default < and == operators,
//                      If a private custom type is specified then it must
//                      have its own < and == operators specified
//
template <enum e_types    EVENT_TYPE_VAL,
          class           KEY2_TYPE>
class policy_key_c : public key_base_c
{
public:

    policy_key_c (KEY2_TYPE    key_value) :
        key_base_c(EVENT_TYPE_VAL),
        key2(key_value)
        {};


    virtual ~policy_key_c(void) {};


    // return the index as a string.
    virtual std::string strIdx (void) const {
        std::stringstream    ss_idx;

        ss_idx << key_base_c::strIdx() << "." << key2;
        return (ss_idx.str());
    }


    //
    // operator <
    //
    virtual bool operator< (const key_base_c    &b) const {
        return (operator<(&b));
    }


    virtual bool operator< (const key_base_c    *p) const {

        // if the primary key is less then it's less, don't check 2ndary
        if (key_base_c::operator<(p)) {
            return (true);
        }


        // if not less then it's >=, check if equal, if it's not equal then it
        // must be greater
        if (!(key_base_c::operator==(p))) {
            return (false);
        }


        // primary keys are equal, so now check the 2ndary key. Since the
        // primary keys are equal then that means this is either a key_base_c
        // object or its a policy_key_c object.
        const policy_key_c    *p_other = dynamic_cast<const policy_key_c*>(p);


        // if NULL then it was a key_base_c, and has no secondary key, so it is
        // lexigraphically smaller than us, ergo we are not smaller than it.
        if (!p_other) {
            return (false);
        }

        return (key2 < p_other->key2);
    }



    //
    // operator ==
    //
    virtual bool operator== (const key_base_c    &b) const {
        return(operator==(&b));
    }


    virtual bool operator== (const key_base_c    *p) const {

        // if the primary key isn't equal, then we're not equal
        if (!(key_base_c::operator==(p))) {
            return (false);
        }


        // primary key is equal, so now check the secondary key. Since the
        // primary keys are equal, then that means this is eitehr a key_base_c
        // object or its a policy_key_c object.
        const policy_key_c    *p_other = dynamic_cast<const policy_key_c*>(p);

        // if NULL then it was a key_base_c
        if (!p_other) {
            // why? If the LHS is a key_base_c it doesn't go any deeper than
            // the base. Hence we don't either.
            return (true);
        }

        return (key2 == p_other->key2);
  }


protected:

    KEY2_TYPE    key2;    // The secondary key.

};



class multi_key_c {
public:
    multi_key_c (key_base_c    *p) :
        p_key(p)
        {};


    ~multi_key_c(void) {};


    bool operator< (const multi_key_c    &mk) const {
        return (p_key->operator<(mk.p_key));
    }


    bool operator== (const multi_key_c    &mk) const {
        return (p_key->operator==(mk.p_key));
    }


    std::string strIdx (void) const {
        return (p_key->strIdx());
    }

protected:
    key_base_c    *p_key;
};







// DO_TEST(x, op, y)
//    x, y: can be any derived key type
//    op  : The operation to do < or ==
//
//    Prints the operation being done along with the results of the operation
//    For example:
//        DO_TEST(a, <, b)
//    will print:
//        a < b: <results>
//
//    where <results> are the results of the operation 'a < b'
#define DO_TEST(x, op, y, expect)                                             \
{                                                                             \
    bool    retval = x op y;                                                  \
    std::cout << #x " " #op " " #y ": " << retval                             \
              << " = " << ((retval == expect) ? "pass" : "----FAIL") << "\n"; \
}





template <class C>
void
print_them (C              **pp_c,
            int            count,
            std::string    s_type)
{
    int    idx;

    std::cout << "\n" << count << " keys for " << s_type << "\n";

    for (idx = 0 ; idx < count; ++idx) {
        std::cout << "    " << (*pp_c)->strIdx() << "\n";
        pp_c++;
    }
}






int
main (void)
{
    std::cout << "\nBASE\n";

    key_base_c    base_error(E_ERROR), base_int(E_INT), base_char(E_CHAR);
    key_base_c    base_str(E_STR);

    key_base_c    *key_base_array[] = {
        &base_error, &base_int, &base_char, &base_str
    };


    print_them(key_base_array,
               (sizeof(key_base_array) / sizeof(key_base_array[0])),
               "key_base_c");

    DO_TEST(base_error, < , base_error,  false);
    DO_TEST(base_error, < , base_int,    true);
    DO_TEST(base_int,   < , base_char,   true);
    DO_TEST(base_char,  < , base_str,    true);

    std::cout << "\n";
    DO_TEST(base_error, ==, base_error,  true);
    DO_TEST(base_int,   ==, base_int,    true);
    DO_TEST(base_char,  ==, base_char,   true);
    DO_TEST(base_str,   ==, base_str,    true);

    std::cout << "\n";
    DO_TEST(base_error, ==, base_int,    false);
    DO_TEST(base_int,   ==, base_char,   false);
    DO_TEST(base_char,  ==, base_str,    false);




    // INT
    //
    typedef policy_key_c<E_INT, int>    key_int_2;

    key_int_2    i1(1), i2(2), i3(3), i4(4);
    key_int_2    *key_int2_array[] = {
        &i1, &i2, &i3, &i4,
    };


    print_them(key_int2_array,
               (sizeof(key_int2_array) / sizeof(key_int2_array[0])),
               "key_int_2");

    DO_TEST(base_int,     < , i1,          false);
    DO_TEST(i1,           < , base_int,    false);

    DO_TEST(i1,           < , base_char,   true);
    DO_TEST(base_char,    < , i1,          false);

    DO_TEST(i1,           ==, i1,          true);
    DO_TEST(i1,           ==, base_int,    true);
    DO_TEST(base_int,     ==, i1,          true);
    DO_TEST(i1,           ==, base_error,  false);
    DO_TEST(base_error,   ==, i1,          false);


    std::cout << "\n";
    DO_TEST(i1,   < , i2, true);
    DO_TEST(i1,   < , i3, true);
    DO_TEST(i1,   < , i4, true);



    // CHAR
    typedef policy_key_c<E_CHAR, char>    key_char_c;


    key_char_c    c1('a'), c2('b'), c3('c'), c4('d');
    key_char_c    *key_char_array[] = {
        &c1, &c2, &c3, &c4,
    };

    print_them(key_char_array,
               (sizeof(key_char_array) / sizeof(key_char_array[0])),
               "key_char");


    DO_TEST(base_int,      < , c1,      true );
    DO_TEST(base_int,      ==, c1,      false);
    DO_TEST(base_char,     < , c1,      false);
    DO_TEST(base_char,     ==, c1,      true );
    DO_TEST(base_str,      < , c1,      false);
    DO_TEST(base_str,      ==, c1,      false);

    std::cout << "\n";
    DO_TEST(c1,            < , c1,      false);
    DO_TEST(c1,            ==, c1,      true );
    DO_TEST(c1,            < , c2,      true );
    DO_TEST(c1,            ==, c2,      false);

    std::cout << "\n";
    DO_TEST(c1,            ==, i1,      false);
    DO_TEST(i1,            ==, c1,      false);
    DO_TEST(c1,            < , i1,      false);
    DO_TEST(i1,            < , c1,      true );



    // STR
    typedef policy_key_c<E_STR, std::string>    key_str_c;


    key_str_c    s1("aaa"), s2("bbb"), s3("ccc"), s4("ddd");
    key_str_c    *key_str_array[] = {
        &s1, &s2, &s3, &s4
    };

    print_them(key_str_array,
               (sizeof(key_str_array) / sizeof(key_str_array[0])),
               "key_str");

    DO_TEST(base_int,     < , s1,         true );
    DO_TEST(base_char,    < , s1,         true );
    DO_TEST(base_str,     < , s1,         false);
    DO_TEST(base_str,     ==, s1,         true );
    DO_TEST(s1,           < , base_int,   false);
    DO_TEST(s1,           < , base_char,  false);
    DO_TEST(s1,           < , base_str,   false);
    DO_TEST(s1,           ==, base_str,   true);


    std::cout << "\n";
    DO_TEST(s1,            < , s1,      false);
    DO_TEST(s1,            ==, s1,      true );
    DO_TEST(s1,            < , s2,      true );
    DO_TEST(s1,            ==, s2,      false);



    std::cout << "\n\nNOW TESTING THE MAP\n\n";

    typedef std::multimap<multi_key_c, std::string>    multiKeyMap;

    multiKeyMap    myMap;


    multi_key_c    k1(&i1),  k2(&i2),  k3(&i3),  k4(&i4);
    multi_key_c    k5(&c1),  k6(&c2),  k7(&c3),  k8(&c4);
    multi_key_c    k9(&s1),  k10(&s2), k11(&s3), k12(&s4);


    myMap.insert(std::make_pair(k1, "one"));
    myMap.insert(std::make_pair(k2, "two"));
    myMap.insert(std::make_pair(k3, "three"));
    myMap.insert(std::make_pair(k4, "four"));
    myMap.insert(std::make_pair(k1, "one.2"));
    myMap.insert(std::make_pair(k4, "four.2"));

    myMap.insert(std::make_pair(k5, "c1"));
    myMap.insert(std::make_pair(k5, "c1.2"));
    myMap.insert(std::make_pair(k6, "c2"));
    myMap.insert(std::make_pair(k6, "c2.2"));
    myMap.insert(std::make_pair(k7, "c3"));
    myMap.insert(std::make_pair(k8, "c4"));


    myMap.insert(std::make_pair(k9,  "s1"));
    myMap.insert(std::make_pair(k10, "s2"));
    myMap.insert(std::make_pair(k11, "s3"));
    myMap.insert(std::make_pair(k12, "s4"));
    myMap.insert(std::make_pair(k12, "s4.2"));
    myMap.insert(std::make_pair(k11, "s3.2"));
    myMap.insert(std::make_pair(k10, "s2.2"));
    myMap.insert(std::make_pair(k9,  "s1.2"));

    multiKeyMap::iterator    pos;

    for (pos = myMap.begin(); pos != myMap.end(); ++pos) {
        std::cout << pos->first.strIdx() << " : " << pos->second
                  <<"\n";
    }


    return (0);
}

Output from this is:

BASE

4 keys for key_base_c
    0
    1
    2
    3
base_error < base_error: 0 = pass
base_error < base_int: 1 = pass
base_int < base_char: 1 = pass
base_char < base_str: 1 = pass

base_error == base_error: 1 = pass
base_int == base_int: 1 = pass
base_char == base_char: 1 = pass
base_str == base_str: 1 = pass

base_error == base_int: 0 = pass
base_int == base_char: 0 = pass
base_char == base_str: 0 = pass

4 keys for key_int_2
    1.1
    1.2
    1.3
    1.4
base_int < i1: 0 = pass
i1 < base_int: 0 = pass
i1 < base_char: 1 = pass
base_char < i1: 0 = pass
i1 == i1: 1 = pass
i1 == base_int: 1 = pass
base_int == i1: 1 = pass
i1 == base_error: 0 = pass
base_error == i1: 0 = pass

i1 < i2: 1 = pass
i1 < i3: 1 = pass
i1 < i4: 1 = pass

4 keys for key_char
    2.a
    2.b
    2.c
    2.d
base_int < c1: 1 = pass
base_int == c1: 0 = pass
base_char < c1: 0 = pass
base_char == c1: 1 = pass
base_str < c1: 0 = pass
base_str == c1: 0 = pass

c1 < c1: 0 = pass
c1 == c1: 1 = pass
c1 < c2: 1 = pass
c1 == c2: 0 = pass

c1 == i1: 0 = pass
i1 == c1: 0 = pass
c1 < i1: 0 = pass
i1 < c1: 1 = pass

4 keys for key_str
    3.aaa
    3.bbb
    3.ccc
    3.ddd
base_int < s1: 1 = pass
base_char < s1: 1 = pass
base_str < s1: 0 = pass
base_str == s1: 1 = pass
s1 < base_int: 0 = pass
s1 < base_char: 0 = pass
s1 < base_str: 0 = pass
s1 == base_str: 1 = pass

s1 < s1: 0 = pass
s1 == s1: 1 = pass
s1 < s2: 1 = pass
s1 == s2: 0 = pass


NOW TESTING THE MAP

1.1 : one
1.1 : one.2
1.2 : two
1.3 : three
1.4 : four
1.4 : four.2
2.a : c1
2.a : c1.2
2.b : c2
2.b : c2.2
2.c : c3
2.d : c4
3.aaa : s1
3.aaa : s1.2
3.bbb : s2
3.bbb : s2.2
3.ccc : s3
3.ccc : s3.2
3.ddd : s4
3.ddd : s4.2
John Rocha