views:

239

answers:

2
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <vector>
#include <string>
#include <iostream>
#include <map>
#include <utility>
#include <algorithm>

void * GetMemory(size_t n) {
  void *ptr = malloc(n);
  printf("getMem n %d   ptr 0x%x\n", n, reinterpret_cast<unsigned int> (ptr));
  return ptr;
}

void FreeMemory(void *p) {
  free(p);
}

void* operator new (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void* operator new [] (size_t n) {
  void *p = GetMemory(n);
  return p;
}

void operator delete (void *p) {
  FreeMemory(p);
}

void operator delete [] (void *p) {
  FreeMemory(p);
}

typedef std::vector<int> vec;

int main(int argc, char *argv[]) {
  std::map<int, vec> z;
  vec x;
  z.insert(std::pair<int,vec>(1,x));
}

Compile with g++ -Wall -ansi test.cpp -o test

Run test.

Why are there three calls to GetMemory with n = 0?

+5  A: 

All 3 cases concerned with initialization of empty vector:

  1. to init root element of tree (internal implementation of std::map) that would contain empty vector.
  2. your own initialization of 'vec x'.
  3. copy constructor for std::pair for element 'second' that invokes copying empty set of variable 'x'
Dewfy
Each insert into map z seems to call GetMemory() thrice. Try adding the following code at the bottom:vec y;z.insert(std::pair<int, vec>(2,y);You get three more calls, all with n = 0. And even an empty vector needs 3 pointers: _M_start, _M_finish, and _M_end_of_storage.
@prasoon99: "And even an empty vector needs 3 pointers..." - don't mix pointers and vector instances! Pointers doesn't expects memory allocation, but instance of vector does. Empty vector exactly allocates empty buffer. In your sample it happened 3 times - see my answer above.
Dewfy
Check with gdb. All calls to GetMemory() are on insert statement(s). Vector initialization does not make a call to GetMemory().
It seems to happen on copy constructor calls: `vec a; vec b(a);`
visitor
I think so too. But why three times?
Two of them are freed immediately, and the other one isn't. So the one that isn't is the copy of the empty vector that's actually stored in the map. I think the other two are probably one to copy the empty vector into the pair (freed at the end of the statement that calls `insert`), and one to copy either the pair or just the value somewhere in `insert`. That last one does seem unnecessary.
Steve Jessop
D'oh, I figured it out. See my answer.
Steve Jessop
+7  A: 

Stick some tracing in FreeMemory and change main to this:

int main(int argc, char *argv[]) {
  printf("map\n");
  std::map<int, vec> z;
  printf("vec\n");
  vec x;
  printf("pair\n");
  std::pair<int,vec> y(1,x);
  printf("insert\n");
  z.insert(y);
  printf("inserted 1\n");
  y.first = 2;
  printf("insert\n");
  z.insert(y);
  printf("inserted 2\n");

}

Output:

$ make mapinsert CXXFLAGS=-O3 -B && ./mapinsert
g++ -O3    mapinsert.cpp   -o mapinsert
map
vec
pair
getMem n 0   ptr 0x6b0258
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b0278
getMem n 0   ptr 0x6b02a0
FreeMemory ptr 0x6b0268
inserted 1
insert
getMem n 0   ptr 0x6b0268
getMem n 32   ptr 0x6b02b0
getMem n 0   ptr 0x6b02d8
FreeMemory ptr 0x6b0268
inserted 2
FreeMemory ptr 0x6b0258
FreeMemory ptr 0x6b02d8
FreeMemory ptr 0x6b02b0
FreeMemory ptr 0x6b02a0
FreeMemory ptr 0x6b0278

So of your 3 0-sized allocations:

  • One is to copy the empty vector into the pair.
  • One is to store a copy of the empty vector in the map.

These two are clearly necessary. What I'm not sure about is this:

  • One is to copy the vector somewhere in the call to insert, and this is also freed in the call to insert.

It's as if insert (or something it calls internally) is taking its parameter by value instead of by reference, or insert is explicitly taking a copy into an automatic variable some time before it allocates the new map node. Firing up a debugger is effort for me at the moment, I'll leave it to someone else.

Edit: mystery solved. insert takes a std::pair<const int, vec>, not a std::pair<int, vec>. The extra copy of an empty vector is because the pair you construct has to be converted to a(nother) temporary, then a reference to that temporary is passed to insert. std::pair has a constructor template that lets you get away with almost anything. 20.2.2/4:

template<class U, class V> pair(const pair<U,V> &p);

Effects: initializes members from the corresponding members of the argument, performing implicit conversions as needed.

I also observe that in my implementation, vec x; doesn't call getMem, but vec x(0); does. So actually:

z[1] = vec();

Is less code and denies you the opportunity to make the extra copy (although it calls operator= instead). It does still make 2 0-sized allocations, at least for me.

The C++ standard defines operator[] to return the result of a specified expression involving a call to insert. I'm not certain whether this means the effects of operator[] are "as if" make_pair and insert were called (that is, the standard is as good as specifying what the source must be for operator[]), or just that the value returned is the same value as the specified expression would yield. If the latter then perhaps an implementation could do it with a single 0-sized allocation. But certainly map has no guaranteed way to create an entry without first creating a pair that contains the mapped type, so 2 allocations should be expected. Or more properly, 2 copies of the desired mapped value: the fact that copying a 0-sized vector makes a 0-sized allocation is implementation-dependent.

So, if you had a case where the value was really expensive to copy, but really cheap to default-construct (like a container with lots of elements), then the following might be useful:

std::map<int, vec> z;
vec x(1000);
z[1] = x;
// i.e. (*(z.insert(std::pair<const int, vec>(1,vec())).first)).second = x;

makes 2 allocations of size 4000 and 2 of size 0, whereas:

std::map<int, vec> z;
vec x(1000);
z.insert(std::pair<const int, vec>(2, x));

makes 3 of size 4000 and none of size 0. Eventually the size is big enough that the extra allocation in the first code is cheaper than the extra copying in the second code.

It's possible that move-constructors in C++0x will help with this, I'm not sure.

Steve Jessop
Thank you for all your effort. If I could give this answer a *, I would.
This is a good illustration of why the best practice when inserting things like vectors into a map is to do typedef vector<T> vec_t; typedef map<int,vec_t> map_t; vec_t dummy;map_t myvals; vec_t tvals(100000,3);/*the vallue*/ myvals.insert(map_t::value_type(1,dummy)).first->second.swap(tvals); This has the effect of only copying an empty vec_t to make the map node, and then moving the values into it once the node has been constructed (or replaced if it already existed). The equivalent C++0x way is myvals.insert(map_t::value_type(1,std::move(tvals)));
Lance Diduck