ansaurus

Question

Why does std::map operator[] create an object if the key doesn't exist?

Answer 1

+4 A:

Standard says (23.3.1.2/1) that operator[] returns (*((insert(make_pair(x, T()))).first)).second. That's the reason. It returns reference T&. There is no way to return invalid reference. And it returns reference because it is very convenient I guess, isn't it?

Kirill V. Lyadvinsky 2009-10-28 19:34:11

Sorry, maybe my question wasn't clear enought but I'm asking the reason for the standard.

n1ck 2009-10-28 19:36:00

Ok, it is convenient that `operator[]` returns reference. So it should insert something, because there is no way to return invalid reference.

Kirill V. Lyadvinsky 2009-10-28 19:37:36

Look at Samuel's answer: http://stackoverflow.com/questions/1639544/1639573#1639573

sbi 2009-10-28 19:37:53

Why don't they make a const version that throws an exception?

GMan 2009-10-28 19:40:29

Because exception is for exceptional cases.

Kirill V. Lyadvinsky 2009-10-28 19:41:32

@GMan: If the map's `operator[]` wouldn't create values, it would be a lookup function. And you don't want your lookup function to throw if it doesn't find what you're looking for, do you?

sbi 2009-10-28 19:42:44

@sbi: unless you aldready know that it exist? But yeah I just realised that if it worked like I wanted it would basically be a find, thanks.

n1ck 2009-10-28 20:00:36

Pretty much along nicks thought. Ever time I've ever used a map for look-up, I've had to end up checking if find returned end, then throw an exception, because it's the only appropriate thing to do.

GMan 2009-10-28 20:02:01

@n1ck: If you already know that it exists, then what's wrong with `operator[]` as it is?

sbi 2009-10-28 21:06:31

@GMan: Look at my comments to this question: http://stackoverflow.com/questions/1462341/

sbi 2009-10-28 21:08:45

because it would assert in debug mode if it didn't instead of just creating an (possibly invalid since second element is possibly default constructed) element without telling me, meaning I have to catch the error later.

n1ck 2009-10-28 21:15:17

@n1ck: In debug mode, an additional `assert(my_map.find(key)!=my_map.end());` would do.

sbi 2009-10-29 09:37:42

I know but it's more code. I guess I could always add a wrapper class that provide the behavior I want if I really wanted to. What really caused me to ask is that I replaced a vector with a map, counting on the compiler to find me errors (didn't think about map's []) and instead of asserting right when I accessed an invalid index (my key was an int) it crashed later when I tried to use the (invalid) element it had created. It was not the first time I had problem with this behavior of map so I decided to ask the question. Seeing the answers I guess I'm the only one who do. ...

n1ck 2009-10-29 11:13:11

Anyway thanks for the discussion it made me found some point I hadn't thought about. Still I stay by my point that this behavior of operator[] is unintuitive compared to vector's.

n1ck 2009-10-29 11:18:34

Answer 2

+3 A:

Its is for assignment purposes:


void test()
{
   std::map<std::string, int >myMap;
   myMap["hello"] = 5;
}

EToreo 2009-10-28 19:34:50

If this was the main reason, then a constant map would return a constant reference which cannot be assigned to and `operator[]` for constant maps wouldn't have to create a value. But there isn't even an `operator[]` for constant maps, yo you must be wrong.

sbi 2009-10-28 19:40:10

I'm not sure how that makes me wrong... it seems like if there is no <pre>operator[]</pre> on a constant map, it just proves my point.

EToreo 2009-10-28 20:17:43

That there is no `operator[]` for constant maps seems to indicate that `operator[]` is there *purely* to modify maps. I somehow must be misunderstanding the point you're making. If the designers of the STL didn't think that an `operator[]` of any kind would make sense for constant maps, they obviously saw the whole meaning of `operator[]` in modifying the map, like with the assignment in the answer...

sth 2009-10-28 20:18:24

Having `operator[]` have two vastly different semantics depending on whether the map is `const` or not sounds like a bad idea to me. The way it is, `[]` has certain well-defined semantics which is simply not supported for `const` map - hence it's not there.

Pavel Minaev 2009-10-28 20:19:29

sbi 2009-10-28 21:04:06

Sounds about right.

EToreo 2009-10-28 21:46:01

Answer 3

+1 A:

The difference here is that map stores the "index", i.e. the value stored in the map (in its underlying RB tree) is a std::pair, and not just "indexed" value. There's always map::find() that would tell you if pair with a given key exists.

Nikolai N Fetissov 2009-10-28 19:34:53

Answer 4

+2 A:

It allows insertion of new elements with operator[], like this:

std::map<std::string, int> m;
m["five"] = 5;

The 5 is assigned to the value returned by m["five"], which is a reference to a newly created element. If operator[] wouldn't insert new elements this couldn't work that way.

sth 2009-10-28 19:35:14

see my comment at http://stackoverflow.com/questions/1639544/1639563#1639563 for why I believe this is wrong.

sbi 2009-10-28 19:40:47

Answer 5

+8 A:

Because operator[] returns a reference to the value itself and so the only way to indicate a problem would be to throw an exception (and in general, the STL rarely throws exceptions).

If you don't like this behavior, you can use map::find instead. It returns an iterator instead of the value. This allows it to return a special iterator when the value is not found (it returns map::end) but also requires you to dereference the iterator to get at the value.

R Samuel Klatchko 2009-10-28 19:36:27

but doesn't vector return a reference too? what's the difference here? Sorry I'm not familiar with std::map's internal so maybe this is obvious from the internal.

n1ck 2009-10-28 19:43:13

It is undefined behaviour to access a vector out of bounds with []. I guess map could invoke some undefined behaviour in this case as well, but would you really like that? With a vector it is easy to find out if an index is valid, with a map it isn't (you could call find first, and in this case it would be pointless to call[] as you should have already found the item you are looking for).

UncleBens 2009-10-28 19:47:25

@n1ck: std::vector will return an invalid reference (becuase it is easy todo). std::map does not have an easy way to return a reference.

Martin York 2009-10-28 19:48:37

@UncleBens: yes I guess what you say make sense. If operator[] was working like I would find more intuitive, it would basically only be a find. Thanks, I didn't realise that. Still I don't like the way it works. I know that I don't have to use it if I don't like it but I asked this question because I just replaced a vector with a map, counting on the compiler to find me the errors and I got a runtime error because of this behavior. I guess that's what I get for counting on the compiler to find my errors when I could have done better ;o)

n1ck 2009-10-28 19:58:20

Traditionally, the interface for maps (in all languages, not just C++) allow you to write something like `map[key] = value;`, and the map would replace the existing value _or_ insert a new one as needed. It's a long standing convention, so it's no surprise that `std::map` follows it. However, since `operator[]` can't tell if the reference it has to return will be used on left side of `operator=` or not, it has to always assume the worst, and therefore insert a new element.

Pavel Minaev 2009-10-28 20:18:02

@Pavel Minaev: Thanks I didn't know that other language before c++ implemented it that way. Any exemple?

n1ck 2009-10-28 20:27:57

C#, VB, Delphi, Python, Ruby, Eiffel, JavaScript, Lua are among those which specifically have `[]` notation or a precise equivalent (though in all those languages, the map can distinguish between retrieval of value via `[]` and assignment to `[]`, so this issue is non-existent there). Java doesn't have `[]`, but `Map.put` has similar insert-or-replace semantics. In reality, the complete list would probably be several pages old - it's a very old convention. This particular wart with default-initialization is specific to C++ only because there's no good way to distinguish assignment.

Pavel Minaev 2009-10-28 20:35:52

Most of these language are not older than c++ so the decision cannot come from doing like them. But it's ok I'm sure you're right I was just curious which language could have inspired c++ for that decision.

n1ck 2009-10-28 20:46:13

@N1ck: They might be older that the std library, though. `:)`

sbi 2009-10-28 20:48:00

Answer 6

+6 A:

To answer your real question: there's no convincing explanation as to why it was done that way. "Just because".

Since std::map is an associative container, there's no clear pre-defined range of keys that must exist (or not exist) in the map (as opposed to the completely different situation with std::vector). That means that with std::map, you need both non-insering and inserting lookup functionality. One could overload [] in non-inserting way and provide a function for insertion. Or one could do the other way around: overload [] as an inserting operator and provide a function for non-inserting search. So, someone sometime decided to follow the latter approach. That's all there's to it.

If they did it the other way around, maybe today someone would be asking here the reverse version of your question.

AndreyT 2009-10-28 20:12:13

I guess this is what I wanted to know, Thanks. Well a third way would have been not to provide an operator[] and provide lookup and insertion in two member function (no operator overloading).

n1ck 2009-10-28 20:17:05

@n1ck: The [] notation is pretty well fixed in people's minds. It would have been theoretically possible to not use it, but I don't think that was very likely.

David Thornley 2009-10-28 20:47:45

The reason given above by Pavel Minaev is pretty convincing.

Stuart 2009-10-29 11:48:16

Which one? That this is a "long standing convention"? Might be true, but not convincing. That there's no way to know whether it is on the left-hand side of assignment? This is not convincing at all, since it could just throw an exception or claim undefined behavior. Remember, the original question was why it is different from `std::vector`.

AndreyT 2009-10-29 13:47:16

Answer 7

+1 A:

The answer is because they wanted an implementation that is both convenient and fast.

The underlying implementation of a vector is an array. So if there are 10 entries in the array and you want entry 5, the T& vector::operator[](5) function just returns headptr+5. If you ask for entry 5400 it returns headptr+5400.

The underlying implementation of a map is usually a tree. Each node is allocated dynamically, unlike the vector which the standard requires to be contiguous. So nodeptr+5 doesn't mean anything and map["some string"] doesn't mean rootptr+offset("some string").

Like find with maps, vector has getAt() if you want bounds checking. In the case of vectors, bounds checking was considered an unnecessary cost for those who did not want it. In the case of maps, the only way not to return a reference is to throw an exception and that was also considered an unnecessary cost for those who did not want it.

jmucchiello 2009-10-28 20:29:23

Couldn't they return an invalid reference like if you dereferenced std::map' end()?

n1ck 2009-10-28 20:32:36

@n1ck: No. There are no invalid references in C++. (That's one of the main things where they differ from pointers.)

sbi 2009-10-28 20:50:17

n1ck, think about the logic flow. To insert, the program has to find what to put the entry in the tree. This processing has a cost. Since the most likely action after not finding the value is to insert it, why not just have \[\] do the insert automagically.

jmucchiello 2009-10-28 20:59:43

What then if you dereference something like std::map's end? or if if dereference an invalid pointer? The same behavior could have been implemented: if it doesn't exist: undefined behavior.

n1ck 2009-10-28 21:01:38

@jmucchiello: Good comments. I just find the use of operator[] unintuitive for that purpose but yeah your kinda right.

n1ck 2009-10-28 21:02:47

@n1ck: The difference is that it's relatively easy and cheap to check whether an iterator equals `end()` or whether an index is in the right range. It's more expensive to check for the existence of an entry in a map. But that doesn't mean you cannot do this before invoking `operator[]`.

sbi 2009-10-29 09:40:44

I somewhat totally agree. My point is that why operator[]. I'm not saying a function that do this behavior should not exist, but why operator[]? That's somewhat the only thing we disagree I think. A good answer was that it was a convention before so why not keep it. My question then: why did THEY choose it then?

n1ck 2009-10-29 11:21:47

Answer 8

+1 A:

I think it's mostly because in the case of map (unlike vector, for example) it's fairly cheap and easy to do -- you only have to create a single element. In the case of vector they could extend the vector to make a new subscript valid -- but if your new subscript is well beyond what's already there, adding all the elements up to that point may be fairly expensive. When you extend a vector you also normally specify the values of the new elements to be added (though often with a default value). In this case, there would be no way to specify the values of the elements in the space between the existing elements and the new one.

There's also a fundamental difference in how a map is typically used. With a vector, there's usually a clear delineation between things that add to a vector, and things that work with what's already in the vector. With a map, that's much less true -- it's much more common to see code that manipulates the item that's there if there is one, or adds a new item if it's not already there. The design of operator[] for each reflects that.

Jerry Coffin 2009-10-28 20:30:37

Well I'm not sure if adding a new element in a vector when the index is out of range without intervention from the user would exactly be the behavior I want. If I access an out-of-bound array it is probably because I made an error somewhere in my code. I don't want the language to create one for me. I give you +1 for the second part although that's not really my personal experience with maps.

n1ck 2009-10-28 20:48:51

Answer 9

A:

map.insert(key, item); makes sure key is in the map but does not overwrite an existing value.

map.operator[key] = item; makes sure key is in the map and overwrites any existing value with item.

Both of these operations are important enough to warrant a single line of code. The designers probably picked which operation was more intuitive for operator[] and created a function call for the other.

me 2009-12-18 15:26:41

ansaurus

tags:

views:

answers:

Why does std::map operator[] create an object if the key doesn't exist?

related questions