views:

70

answers:

4

I'm using a type Id which is defined in another part of the code I'm using:

typedef int Id;

Now I am provided many objects, each of which comes with such an Id, and I would like to use the Id as index into a std::vector that stores these objects. It may look something like this:

std::vector<SomeObj*> vec(size);
std::pair<Id, SomeObj*> p = GetNext();
vec[p.first] = p.second;

The problem is that std::vector uses its own type to index its elements: std::vector::size_type (why isn't that templated?).

So strictly, it would be better to use std::map<Id, SomObj*>, but that would be less efficient, and an array is what I really need here (I know that the indices of all the objects are contiguous and start with 0). Another problem is that the typedef int Id might change to typedef long int Id or similar in the future ... (it is part of my own code though, so I control it, but ideally I should be allowed to change the typedef at some points; that's what a typedef is for).

How would you deal with this? Maybe use unordered_map<Id, SomeObj*>, where the hash function directly uses the Id as hash key? Would that be less memory-efficient? (I don't completely understand how unordered_map allocates its space given that the range of the hash function is unknown in advance?)

+3  A: 

You can pass any integral type as an index for std::vector. If it doesn't match std::vector<T>::size_type (which is typically unsigned long) then the value will be implicitly converted.

Anthony Williams
I am 90% certain that `vector<T>::size_type` is *not* an `unsigned long`. Instead, it will almost always be a `std::size_t` which is twice as large as a `long` on some computers.
Konrad Rudolph
Yes, `size_type` is probably `size_t`. However, that is typically `unsigned long`. It is on my 64-bit linux system with gcc, and on my 32-bit Windows system.
Anthony Williams
@Anthony: 64 bit Windows is atypical, since it's LLP64, whereas 64 bit linux is LP64.
Steve Jessop
@Steve: Good point; on 64-bit Windows, `size_t` is probably `unsigned __int64`
Anthony Williams
+2  A: 

Write your own container wrapper that takes Id as the index type. Use map or unordered_map internally to implement the container. Program against this wrapper. If it turns out that this implementation is too slow, switch to vector internally and convert you Id index to vector::size_type (also internally, of course).

This is the cleanest approach. But really, vector::size_type will be some very large unsigned integer type so a conversion from Id to vector::size_type will always be safe (but not the other way round!).

Konrad Rudolph
+1  A: 

why isn't that templated?

Because standard containers are implemented to use the largest unsigned type that they reasonably can. If size_type is unsigned int in your implementation then that's for a reason, and whatever the reason is that prevented the implementers using a larger type, would still exist if you asked for something else[*]. Also, for your particular example, size types have to be unsigned, and you want to use a signed type, so that's another change that would be needed to support what you want to do.

In practice, size_type for a standard container is (almost?) always size_t. So if you asked for a larger vector, you couldn't have one, because a vector is backed by contiguous storage. The vector would be unable to allocate an array bigger than size_t bytes.

To use your Id as a vector index, you can either rely on implicit conversions, or you can explicitly convert (and, perhaps, explicitly bounds check) in order to make it absolutely clear what you are doing. You could also use asserts to ensure that Id is no bigger than size_type. Something like this, although a static assert would probably be better:

assert(std::numeric_limits<Id>::max() <= std::numeric_limits<std::vector<SomeObj*>::size_type>::max());

A map<Id, SomeObj*> would be a good option if the Id values in use are sparse. If the only valid Ids are 1 and 400,000,000, then a vector would be rather wasteful of memory.

If it makes you feel any more comfortable, remember that the literal 0 has type int, not vector<SomeObj*>::size_type. Most people have no qualms about writing vec[0]: indeed it's used in the standard.

[*] Even if that reason is just, "the implementers think that 4 billion elements is enough for anyone".

Steve Jessop
A: 

The problem is, Id is an int (more accurately, signed int), can be negative. If the Id was an unsigned type, e.g. typedef unsigned int Id;, there is no problem.

If my understanding correct so far, then I don't get the idea why would someone want to use a negative number as an index to vector (or array). What am I missing?

ArunSaha