views:

177

answers:

5

Consider the sample application below. It demonstrates what I would call a flawed class design.

#include <iostream>

using namespace std;

struct B
{
 B() : m_value(1) {}

 long m_value;
};

struct A
{
 const B& GetB() const { return m_B; }

 void Foo(const B &b)
 {
  // assert(this != &b);
  m_B.m_value += b.m_value;
  m_B.m_value += b.m_value;
 }

protected:
 B m_B;
};

int main(int argc, char* argv[])
{
 A a;

 cout << "Original value: " << a.GetB().m_value << endl;

 cout << "Expected value: 3" << endl;
 a.Foo(a.GetB());

 cout << "Actual value: " << a.GetB().m_value << endl;

 return 0;
}

Output:
Original value: 1
Expected value: 3
Actual value: 4

Obviously, the programmer is fooled by the constness of b. By mistake b points to this, which yields the undesired behavior.

My question: What const-rules should you follow when designing getters/setters?

My suggestion: Never return a reference to a member variable if it can be set by reference through a member function. Hence, either return by value or pass parameters by value. (Modern compilers will optimize away the extra copy anyway.)

A: 

My stupid answer, I leave it here just in case someone else comes up with the same bad idea:

The problem is I think that the object referred to is not const (B const & vs const B &), only the reference is const in your code.

jdehaan
Nope. The whole idea of my code sample is to illustrate that there is no guarantee that a const-referenced object is not altered when altering the this-pointer.
NOP slider
Alex B
`const` applies to whatever stands left from it. `const` on the right is a syntax sugar. iow `const B const` is the same as `const B` and `B const`
Dummy00001
@Alex by trying out I figured out it is exactly the same.
jdehaan
Moreover, a const reference means nothing since it cannot be "reset" to alias another object.
Luc Touraille
+17  A: 

Obviously, the programmer is fooled by the constness of b

As someone once said, You keep using that word. I do not think it means what you think it means.

Const means that you cannot change the value. It does not mean that the value cannot change.

If the programmer is fooled by the fact that some other code else can change something that they cannot, they need a better grounding in aliasing.

If the programmer is fooled by the fact that the token 'const' sounds a bit like 'constant' but means 'read only', they need a better grounding in the semantics of the programming language they are using.

So if you have a getter which returns a const reference, then it is an alias for an object you don't have the permission to change. That says nothing about whether its value is immutable.


Ultimately, this comes down to a lack of encapsulation, and not applying the Law of Demeter. In general, don't mutate the state of other objects. Send them a message to ask them to perform an operation, which may (depending on their own implementation details) mutate their state.

If you make B.m_value private, then you can't write the Foo you have. You either make Foo into:

void Foo(const B &b)
{
    m_B.increment_by(b);
    m_B.increment_by(b);
}

void B::increment_by (const B& b)
{
    // assert ( this != &b ) if you like 
    m_value += b.m_value;
}

or, if you want to ensure that the value is constant, use a temporary

void Foo(B b)
{
    m_B.increment_by(b);
    m_B.increment_by(b);
}

Now, incrementing a value by itself may or may not be reasonable, and is easily tested for within B::increment_by. You could also test whether &m_b==&b in A::Foo, though once you have a couple of levels of objects and objects with references to other objects rather than values (so &a1.b.c == &a2.b.c does not imply that &a1.b==&a2.b or &a1==&a2), then you really have to just be aware that any operation is potentially aliased.

Aliasing means that incrementing by an expression twice is not the same as incrementing by the value of the expression the first time you evaluated it; there's no real way around it, and in most systems the cost of copying the data isn't worth the risk of avoiding the alias.

Passing in arguments which have the least structure also works well. If Foo() took a long rather than an object which it has to get a long from, then it would not suffer aliasing, and you wouldn't need to write a different Foo() to increment m_b by the value of a C.

Pete Kirkham
+1 for translating `const` to `readonly`!
xtofl
By the way, Stroustrup originally wanted the keyword `readonly`, but the C (!) committee insisted on naming it `const`.
FredOverflow
That's what the sample illustrates, yes. Obviously, I wasn't clear enough. There must be some rules to follow (guidelines) to make sure the mistake in the code cannot happen.
NOP slider
If the behaviour is to be based on the values of the arguments at the point at which the function is called, then store those values in temporary variables. Some compilers support `restrict` to highlight aliasing. But there aren't any simple guidelines to avoid aliasing; assume that the values might be aliased unless told otherwise.
Pete Kirkham
+1  A: 

The real problem here is atomicity. The precondition of the Foo function is that it's argument doesn't change while in use.

If e.g. Foo had been specified with a value-argument i.s.o. reference argument, no problem would have shown.

xtofl
Exactly. Perhaps such preconditions should be considered when choosing to return a reference or a value?
NOP slider
@Kristoffer: this precondition is quite hard to rely on if you pass an argument by-reference. So either you build a system that doesn't allow changing arguments to be used (i.e. by locking the argument or the like), or you take a snapshot of the argument (if that can be done atomically).
xtofl
+1  A: 

Frankly, A::Foo() rubs me the wrong way more than your original problem. Anyhow I look at it, it must be B::Foo(). And inside B::Foo() check for this wouldn't be that outlandish.

Otherwise I do not see how one can specify a generic rule to cover that case. And keep teammates sane.

From past experience, I would treat that as a plain bug and would differentiate two cases: (1) B is small and (2) B is large. If B is small, then simply make A::getB() to return a copy. If B is large, then you have no choice but to handle the case that objects of B might be both rvalue and lvalue in the same expression.

If you have such problems constantly, I'd say simpler rule would be to always return a copy of an object instead of a reference. Because quite often, if object is large, then you have to handle it differently from the rest anyway.

Dummy00001
+1  A: 
Grant Peters