When I create a std::vector of objects, the constructor of these objects is not always called.

#include <iostream>
#include <vector>
using namespace std;

struct C {
    int id;
    static int n;
    C() { id = n++; }   // not called
//  C() { id = 3; }     // ok, called
};

int C::n = 0;


int main()
{
    vector<C> vc;

    vc.resize(10);

    cout << "C::n = " << C::n << endl;

    for(int i = 0; i < vc.size(); ++i)
        cout << i << ": " << vc[i].id << endl;  
}

This is the output I get:

C::n = 1
0: 0
1: 0
2: 0
...

This is what I would like:

C::n = 10
0: 0
1: 1
2: 2
...

In this example, am I forced to resize the vector and then initialise its elements "manually"?
Could the reason be that the elements of a vector are not initialised in an ordered way, from the first to the last, and so I cannot obtain a deterministic behaviour?

What I would like to do is easily count the number of objects created in a program, in different containers and at different points in the code, and give each of them a unique id.

Thanks!

+20  A: 

the constructor of these objects is not always called.

Yes, it is, but it's not the constructor you think. The member function resize() is actually declared like this:

void resize(size_type sz, T c = T());

The second parameter is the object to copy into each of the newly inserted elements of the vector. If you omit the second parameter, it default constructs an object of type T then copies that object into each of the new elements.

In your code, a temporary C is constructed and the default constructor is called; id is set to 0. The implicitly declared copy constructor is then called ten times (to insert ten elements into the vector), and all of the elements in the vector have the same id.

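[Note for those who are interested: in C++03, the second parameter of resize() (c) is taken by value; in C++0x it is taken by const lvalue reference (see LWG Defect 679). The final C++11 standard also splits resize() into two overloads, and the one-argument overload value-initialises each new element, so with a C++11 or later compiler vc.resize(10) calls the default constructor once per element instead of copying a prototype.]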

In this example, am I forced to resize the vector and then initialise its elements "manually"?

You can (and probably should) insert the elements into the vector individually, e.g.,

std::vector<C> vc;
for (unsigned i(0); i < 10; ++i)
    vc.push_back(C());
James McNellis
+3  A: 

The reason is that vector::resize inserts copies by calling the automatically provided copy constructor, rather than the constructors you have defined in your example.

In order to get the output you want, you can define the copy constructor explicitly:

struct C {
//....
C(const C& other) {
    id = n++;
    // copy other data members
}
//....
};

Because of the way vector::resize works though (it has a second optional argument used as a 'prototype' for the copies it creates, with a default value in your case of C()), this creates 11 objects in your example (the 'prototype' and 10 copies of it).

Edit (to include some of the good advice in the many comments):

There are several downsides to this solution worth noting, as well as some options and variants that are likely to yield more maintainable and sensible code.

  • This method does add maintenance costs, and an amount of risk. You have to remember to modify your copy constructor whenever you add or remove member variables of the class. You don't have to do that if you rely on the default copy constructor. One way to combat this problem is to encapsulate the counter in another class (like this), which is also arguably better OO design, but then of course you also have to keep in mind the many issues that can crop up with multiple inheritance.

  • It can make it harder for other people to understand, because a copy is no longer exactly what most people would expect. Similarly, other code that deals with your classes (including the standard containers) may misbehave. One way to combat this is to define an operator== method for your class (and it may be argued that this is a good idea when overriding the copy constructor even if you don't use the method), to keep it conceptually 'sound' and also as a kind of internal documentation. If your class gets much use, you will likely also end up providing an operator= so that you can maintain the separation of your automatically generated instance id from class member assignments that should take place under this operator. And so on ;)

  • It might disambiguate the whole issue of 'different id values for copies' if you have enough control over the program to use dynamically created instances (via new) and use pointers to those inside containers. This does mean you need to 'initialise elements "manually"' to some degree - but it's not a lot of work to write a function that gives you back a vector full of pointers to new, initialised instances. If you consistently deal with pointers when using standard containers, you won't have to worry about the standard containers creating any instances 'under the covers'.

If you're aware of all those issues, and believe you can cope with the consequences (which is of course highly dependent on your particular context), then overriding the copy constructor is a viable option. After all, the language feature is there for a reason. Obviously, it is not as simple as it looks, and you should be careful.

sje397
This won't do. The copy-constructor needs to copy; if it doesn't the class cannot be used in a standard container. I'd be wary of making a copy-constructor that doesn't actually copy anything.
GMan
GMan: let's say that the class has some data that need to be copied, and a member variable which has to be unique (the id). Isn't it correct, in this case, to write a copy constructor that copies just what it can, and assigns a new, unique value to the id? If this is a bad way to proceed, is there an alternative?
Pietro
@Pietro: Generally, after invoking the copy constructor, the copy should be the same as the original (i.e., `original == copy` should be true). If not, you can end up with bizarre problems that can be difficult to debug or your code will be difficult to understand. It is rare that each object of a type needs an absolutely unique identifier (and if it does, often its address can be used for such a purpose). What is your use case that requires this unique identifier?
James McNellis
@James: I added a separate answer to reply to your comment. Thanks.
Pietro
@GMan - the class can (and is in this example) be used with standard containers. I take your point that this isn't usually the best idea, but if he really wants to count the number of times a class has been instantiated, as he said in the question, then I think calling this a 'hack' is a little harsh.
sje397
@sje: No it cannot be used within a standard container. Read the standard, specifically 20.1.3. For C to be used within a standard container, `C(x)` has to be equal to `x`. That is, a C copy-constructed with `x` *must be identical to `x`*. Your code fails in this regard, and therefore it *cannot* be used in a standard container. Just because it happens to "work" on your implementation in this tiny test doesn't make it a viable solution, it's non-standard. This is a hack. To make it work, you'd need to override `operator=` and make sure `id` has nothing to do with two things being identical.
GMan
@GMan - Nobody else so far has provided the poor guy with any other way of counting *every instance* of his class, which is what he wants. Why don't you post the 'proper' answer?
sje397
@sje: We've been answering the title question. He should ask the question "how can I safely track instances of objects" to get an answer. :) If the OP illuminates his goal, I'll be happy to provide an answer.
GMan
@GMan: I reformulate my question: "how can I safely track instances of objects?" Thank you.
Pietro
@Pietro: The simplest solution is to make them noncopyable and use containers of pointers.
James McNellis
I've tried to ask that question [here](http://stackoverflow.com/questions/3277961/how-can-i-safely-and-easily-count-all-instances-of-a-class-within-my-program).
sje397
@GMan @James @Pietro: I've edited to try and summarise the main points in the comments.
sje397
@sje: +1 for asking questions to do research, and another +1 for adding a summary of those questions to your answer. Ergo I've removed my downvote and made it an upvote. I'm still with James on this one though. :)
GMan
@sje: +1 for effectively the same reasons as mentioned by GMan.
James McNellis
+3  A: 

The vector is using the copy constructor that C++ generates for you without asking. One "C" is instantiated, and the rest are copied from that prototype.

jdv
A: 

@James: Let's say that I have to be able to distinguish every object, even if more than one can (temporarily) have the same value. Its address is not something I would trust much, because of the vector's reallocations. Furthermore, different objects can be in different containers. Are the problems you mention just a matter of convention, or can there be real technical problems with such code? The test I did works well.
This is what I mean:

#include <iostream>
#include <vector>
#include <deque>
using namespace std;

struct C {
    int id;
    static int n;
    int data;

    C() {               // called only for the prototype, not per element
        id = n++;
        data = 123;
    }

    C(const C& other) {
        id = n++;
        data = other.data;
    }

    bool operator== (const C& other) const {
        return data == other.data;  // ignore id
    }
};

int C::n = 0;


int main()
{
    vector<C> vc;
    deque<C> dc;

    vc.resize(10);

    dc.resize(8);

    cout << "C::n = " << C::n << endl;

    for(int i = 0; i < vc.size(); ++i)
        cout << "[vector] " << i << ": " << vc[i].id << ";  data = " << vc[i].data << endl;

    for(int i = 0; i < dc.size(); ++i)
        cout << "[deque] " << i << ": " << dc[i].id << ";  data = " << dc[i].data << endl;
}

Output:

C::n = 20
[vector] 0: 1;  data = 123
[vector] 1: 2;  data = 123
[vector] 2: 3;  data = 123
[vector] 3: 4;  data = 123
[vector] 4: 5;  data = 123
[vector] 5: 6;  data = 123
[vector] 6: 7;  data = 123
[vector] 7: 8;  data = 123
[vector] 8: 9;  data = 123
[vector] 9: 10;  data = 123
[deque] 0: 12;  data = 123
[deque] 1: 13;  data = 123
[deque] 2: 14;  data = 123
[deque] 3: 15;  data = 123
[deque] 4: 16;  data = 123
[deque] 5: 17;  data = 123
[deque] 6: 18;  data = 123
[deque] 7: 19;  data = 123
Pietro
@Pietro - first, sorry for the drama my suggestion seems to have caused :) You're right in that the address isn't completely trustworthy, as if an instance is destroyed and another created, there is a chance that the second may have the same address as the first. It sounds to me like overriding the copy constructor is not an improper method for your situation. Usually, as noted, it's not the best idea, but there is an exception to every rule ;)
sje397
@sje: Stop giving advice, you're wrong. Overriding the copy constructor *is a hack*. It's bad design and only "works" in this tiny little test, breaking the practicality of the class in any other case. @Pietro: Do *not* do this, it's terrible code. sje's code makes your class unusable in standard containers.
GMan
_I have to be able to distinguish every object, even if more than one can (temporarily) have the same value._ You have to define your notion of temporary vs. persistent here. You can use an incrementing identifier to identify objects, but then either the objects should be noncopyable or the copies should have the same id. If a copy has a different id, then it's not a copy.
James McNellis
@GMan: No. I like giving advice (and, BTW, that first sentence in your last comment is grammatically incorrect). I'll trust in the process. Whether a design is 'bad' or not is subjective, and depends not only on the problem but also on the alternative designs. Also note that I have never suggested that he not override other operators to make his class conform to standard container requirements.
sje397
@James: He wants to count instances including those created as copies by standard containers. And, his 'id' is an 'instance id', so no two instances can have the same value for id. So saying 'either the objects should be noncopyable or the copies should have the same id' is just another way of saying 'there is no solution to your problem'.
sje397
@sje397: The solution is to make the class noncopyable; if a container of the type is needed, then a container of pointers to that type can be used.
James McNellis
@sje: But you also didn't actually do it, which isn't the same. Overriding the copy-constructor without making sure `T(x) == x` holds is a wrong answer, making the class unusable. And I'll discuss things from an ivory tower all day, but at the end of the day the class simply isn't usable in a standard container and suggesting such a non-usable class be used is "bad". And no, @James is right: The class needs to be noncopyable. It does *not* make conceptual sense to *distinguish every class*, whilst allowing them to be copied. If a single one is copied, then not every class is distinguished.
GMan
@GMan: I didn't do it because 1) there was no other data in his original class, and 2) there was no `operator==` either, so `T(x) == x` doesn't hold irrespective of the addition of the copy constructor override. I do agree with James that a container of pointers is probably the best way to disambiguate the whole 'instance identity' issue - but the OP asked 'am I forced to ... initialize its elements "manually"?' which, strictly, he is not. I think whether it makes conceptual sense depends on the concept that 'id' is representing. I'm thinking of it more as an 'aspect' outside instance state.
sje397
@sje397: You are right: the id is an 'aspect' outside instance state, i.e. it is a 'special' member variable. Two objects have the same value if the values of their member variables (except the id) are the same. If even the id is the same, it means we are comparing an object with itself. I modified my code to reflect this.
Pietro