views:

929

answers:

9

Hi, I'm pretty sure I already saw this question somewhere (comp.lang.c++? Google doesn't seem to find it there either) but a quick search here doesn't seem to find it so here it is:

Why does the std::map operator[] create an object if the key doesn't exist? I don't know but for me this seems counter-intuitive if you compare to most other operator[] (like std::vector) where if you use it you must be sure that the index exists. I'm wondering what's the rationale for implementing this behavior in std::map. Like I said wouldn't it be more intuitive to act more like an index in a vector and crash (well undefined behavior I guess) when accessed with an invalid key?

Refining my question after seeing the answers:

Ok so far I got a lot of answers saying basically it's cheap so why not or things similar. I totally agree with that but why not use a dedicated function for that (I think one of the comment said that in java there is no operator[] and the function is called put)? My point is why doesn't map operator[] work like a vector? If I use operator[] on an out of range index on a vector I wouldn't like it to insert an element even if it was cheap because that probably mean an error in my code. My point is why isn't it the same thing with map. I mean, for me, using operator[] on a map would mean: i know this key already exist (for whatever reason, i just inserted it, I have redundancy somewhere, whatever). I think it would be more intuitive that way.

That said what are the advantage of doing the current behavior with operator[] (and only for that, I agree that a function with the current behavior should be there, just not operator[])? Maybe it give clearer code that way? I don't know.

Another answer was that it already existed that way so why not keep it but then, probably when they (the ones before stl) choose to implement it that way they found it provided an advantage or something? So my question is basically: why choose to implement it that way, meaning a somewhat lack of consistency with other operator[]. What benefit do it give?

Thanks

+4  A: 

Standard says (23.3.1.2/1) that operator[] returns (*((insert(make_pair(x, T()))).first)).second. That's the reason. It returns reference T&. There is no way to return invalid reference. And it returns reference because it is very convenient I guess, isn't it?

Kirill V. Lyadvinsky
Sorry, maybe my question wasn't clear enought but I'm asking the reason for the standard.
n1ck
Ok, it is convenient that `operator[]` returns reference. So it should insert something, because there is no way to return invalid reference.
Kirill V. Lyadvinsky
Look at Samuel's answer: http://stackoverflow.com/questions/1639544/1639573#1639573
sbi
Why don't they make a const version that throws an exception?
GMan
Because exception is for exceptional cases.
Kirill V. Lyadvinsky
@GMan: If the map's `operator[]` wouldn't create values, it would be a lookup function. And you don't want your lookup function to throw if it doesn't find what you're looking for, do you?
sbi
@sbi: unless you aldready know that it exist? But yeah I just realised that if it worked like I wanted it would basically be a find, thanks.
n1ck
Pretty much along nicks thought. Ever time I've ever used a map for look-up, I've had to end up checking if find returned end, then throw an exception, because it's the only appropriate thing to do.
GMan
@n1ck: If you already know that it exists, then what's wrong with `operator[]` as it is?
sbi
@GMan: Look at my comments to this question: http://stackoverflow.com/questions/1462341/
sbi
because it would assert in debug mode if it didn't instead of just creating an (possibly invalid since second element is possibly default constructed) element without telling me, meaning I have to catch the error later.
n1ck
@n1ck: In debug mode, an additional `assert(my_map.find(key)!=my_map.end());` would do.
sbi
I know but it's more code. I guess I could always add a wrapper class that provide the behavior I want if I really wanted to. What really caused me to ask is that I replaced a vector with a map, counting on the compiler to find me errors (didn't think about map's []) and instead of asserting right when I accessed an invalid index (my key was an int) it crashed later when I tried to use the (invalid) element it had created. It was not the first time I had problem with this behavior of map so I decided to ask the question. Seeing the answers I guess I'm the only one who do. ...
n1ck
Anyway thanks for the discussion it made me found some point I hadn't thought about. Still I stay by my point that this behavior of operator[] is unintuitive compared to vector's.
n1ck
+3  A: 

Its is for assignment purposes:


void test()
{
   std::map<std::string, int >myMap;
   myMap["hello"] = 5;
}
EToreo
If this was the main reason, then a constant map would return a constant reference which cannot be assigned to and `operator[]` for constant maps wouldn't have to create a value. But there isn't even an `operator[]` for constant maps, yo you must be wrong.
sbi
I'm not sure how that makes me wrong... it seems like if there is no <pre>operator[]</pre> on a constant map, it just proves my point.
EToreo
That there is no `operator[]` for constant maps seems to indicate that `operator[]` is there *purely* to modify maps. I somehow must be misunderstanding the point you're making. If the designers of the STL didn't think that an `operator[]` of any kind would make sense for constant maps, they obviously saw the whole meaning of `operator[]` in modifying the map, like with the assignment in the answer...
sth
Having `operator[]` have two vastly different semantics depending on whether the map is `const` or not sounds like a bad idea to me. The way it is, `[]` has certain well-defined semantics which is simply not supported for `const` map - hence it's not there.
Pavel Minaev
sbi
Sounds about right.
EToreo
+1  A: 

The difference here is that map stores the "index", i.e. the value stored in the map (in its underlying RB tree) is a std::pair, and not just "indexed" value. There's always map::find() that would tell you if pair with a given key exists.

Nikolai N Fetissov
+2  A: 

It allows insertion of new elements with operator[], like this:

std::map<std::string, int> m;
m["five"] = 5;

The 5 is assigned to the value returned by m["five"], which is a reference to a newly created element. If operator[] wouldn't insert new elements this couldn't work that way.

sth
see my comment at http://stackoverflow.com/questions/1639544/1639563#1639563 for why I believe this is wrong.
sbi
+8  A: 

Because operator[] returns a reference to the value itself and so the only way to indicate a problem would be to throw an exception (and in general, the STL rarely throws exceptions).

If you don't like this behavior, you can use map::find instead. It returns an iterator instead of the value. This allows it to return a special iterator when the value is not found (it returns map::end) but also requires you to dereference the iterator to get at the value.

R Samuel Klatchko
but doesn't vector return a reference too? what's the difference here? Sorry I'm not familiar with std::map's internal so maybe this is obvious from the internal.
n1ck
It is undefined behaviour to access a vector out of bounds with []. I guess map could invoke some undefined behaviour in this case as well, but would you really like that? With a vector it is easy to find out if an index is valid, with a map it isn't (you could call find first, and in this case it would be pointless to call[] as you should have already found the item you are looking for).
UncleBens
@n1ck: std::vector will return an invalid reference (becuase it is easy todo). std::map does not have an easy way to return a reference.
Martin York
@UncleBens: yes I guess what you say make sense. If operator[] was working like I would find more intuitive, it would basically only be a find. Thanks, I didn't realise that. Still I don't like the way it works. I know that I don't have to use it if I don't like it but I asked this question because I just replaced a vector with a map, counting on the compiler to find me the errors and I got a runtime error because of this behavior. I guess that's what I get for counting on the compiler to find my errors when I could have done better ;o)
n1ck
Traditionally, the interface for maps (in all languages, not just C++) allow you to write something like `map[key] = value;`, and the map would replace the existing value _or_ insert a new one as needed. It's a long standing convention, so it's no surprise that `std::map` follows it. However, since `operator[]` can't tell if the reference it has to return will be used on left side of `operator=` or not, it has to always assume the worst, and therefore insert a new element.
Pavel Minaev
@Pavel Minaev: Thanks I didn't know that other language before c++ implemented it that way. Any exemple?
n1ck
C#, VB, Delphi, Python, Ruby, Eiffel, JavaScript, Lua are among those which specifically have `[]` notation or a precise equivalent (though in all those languages, the map can distinguish between retrieval of value via `[]` and assignment to `[]`, so this issue is non-existent there). Java doesn't have `[]`, but `Map.put` has similar insert-or-replace semantics. In reality, the complete list would probably be several pages old - it's a very old convention. This particular wart with default-initialization is specific to C++ only because there's no good way to distinguish assignment.
Pavel Minaev
Most of these language are not older than c++ so the decision cannot come from doing like them. But it's ok I'm sure you're right I was just curious which language could have inspired c++ for that decision.
n1ck
@N1ck: They might be older that the std library, though. `:)`
sbi
+6  A: 

To answer your real question: there's no convincing explanation as to why it was done that way. "Just because".

Since std::map is an associative container, there's no clear pre-defined range of keys that must exist (or not exist) in the map (as opposed to the completely different situation with std::vector). That means that with std::map, you need both non-insering and inserting lookup functionality. One could overload [] in non-inserting way and provide a function for insertion. Or one could do the other way around: overload [] as an inserting operator and provide a function for non-inserting search. So, someone sometime decided to follow the latter approach. That's all there's to it.

If they did it the other way around, maybe today someone would be asking here the reverse version of your question.

AndreyT
I guess this is what I wanted to know, Thanks. Well a third way would have been not to provide an operator[] and provide lookup and insertion in two member function (no operator overloading).
n1ck
@n1ck: The [] notation is pretty well fixed in people's minds. It would have been theoretically possible to not use it, but I don't think that was very likely.
David Thornley
The reason given above by Pavel Minaev is pretty convincing.
Stuart
Which one? That this is a "long standing convention"? Might be true, but not convincing. That there's no way to know whether it is on the left-hand side of assignment? This is not convincing at all, since it could just throw an exception or claim undefined behavior. Remember, the original question was why it is different from `std::vector`.
AndreyT
+1  A: 

The answer is because they wanted an implementation that is both convenient and fast.

The underlying implementation of a vector is an array. So if there are 10 entries in the array and you want entry 5, the T& vector::operator[](5) function just returns headptr+5. If you ask for entry 5400 it returns headptr+5400.

The underlying implementation of a map is usually a tree. Each node is allocated dynamically, unlike the vector which the standard requires to be contiguous. So nodeptr+5 doesn't mean anything and map["some string"] doesn't mean rootptr+offset("some string").

Like find with maps, vector has getAt() if you want bounds checking. In the case of vectors, bounds checking was considered an unnecessary cost for those who did not want it. In the case of maps, the only way not to return a reference is to throw an exception and that was also considered an unnecessary cost for those who did not want it.

jmucchiello
Couldn't they return an invalid reference like if you dereferenced std::map' end()?
n1ck
@n1ck: No. There are no invalid references in C++. (That's one of the main things where they differ from pointers.)
sbi
n1ck, think about the logic flow. To insert, the program has to find what to put the entry in the tree. This processing has a cost. Since the most likely action after not finding the value is to insert it, why not just have \[\] do the insert automagically.
jmucchiello
What then if you dereference something like std::map's end? or if if dereference an invalid pointer? The same behavior could have been implemented: if it doesn't exist: undefined behavior.
n1ck
@jmucchiello: Good comments. I just find the use of operator[] unintuitive for that purpose but yeah your kinda right.
n1ck
@n1ck: The difference is that it's relatively easy and cheap to check whether an iterator equals `end()` or whether an index is in the right range. It's more expensive to check for the existence of an entry in a map. But that doesn't mean you cannot do this before invoking `operator[]`.
sbi
I somewhat totally agree. My point is that why operator[]. I'm not saying a function that do this behavior should not exist, but why operator[]? That's somewhat the only thing we disagree I think. A good answer was that it was a convention before so why not keep it. My question then: why did THEY choose it then?
n1ck
+1  A: 

I think it's mostly because in the case of map (unlike vector, for example) it's fairly cheap and easy to do -- you only have to create a single element. In the case of vector they could extend the vector to make a new subscript valid -- but if your new subscript is well beyond what's already there, adding all the elements up to that point may be fairly expensive. When you extend a vector you also normally specify the values of the new elements to be added (though often with a default value). In this case, there would be no way to specify the values of the elements in the space between the existing elements and the new one.

There's also a fundamental difference in how a map is typically used. With a vector, there's usually a clear delineation between things that add to a vector, and things that work with what's already in the vector. With a map, that's much less true -- it's much more common to see code that manipulates the item that's there if there is one, or adds a new item if it's not already there. The design of operator[] for each reflects that.

Jerry Coffin
Well I'm not sure if adding a new element in a vector when the index is out of range without intervention from the user would exactly be the behavior I want. If I access an out-of-bound array it is probably because I made an error somewhere in my code. I don't want the language to create one for me. I give you +1 for the second part although that's not really my personal experience with maps.
n1ck
A: 

map.insert(key, item); makes sure key is in the map but does not overwrite an existing value.

map.operator[key] = item; makes sure key is in the map and overwrites any existing value with item.

Both of these operations are important enough to warrant a single line of code. The designers probably picked which operation was more intuitive for operator[] and created a function call for the other.

me