tags:

views:

4443

answers:

9

I know that calling a virtual method from a base class constructor can be dangerous since the child class might not be in a valid state. (at least in C#)

My question is what if the virtual method is the one who initializes the state of the object ? Is it good practice or should it be a two step process, first to create the object and then to load the state ?

First option: (using the constructor to initialize the state)

public class BaseObject {
    public BaseObject(XElement definition) {
        this.LoadState(definition);
    }

    protected abstract LoadState(XElement definition);
}

Second option: (using a two step process)

public class BaseObject {
    public void LoadState(XElement definition) {
        this.LoadStateCore(definition);
    }

    protected abstract LoadStateCore(XElement definition);
}

In the first method the consumer of the code can create and initialize the object with one statement:

// The base class will call the virtual method to load the state.
ChildObject o = new ChildObject(definition)

In the second method the consumer will have to create the object and then load the state:

ChildObject o = new ChildObject();
o.LoadState(definition);
+13  A: 

(This answer applies to C# and Java. I believe C++ works differently on this matter.)

Calling a virtual method in a constructor is indeed dangerous, but sometimes it can end up with the cleanest code.

I would try to avoid it where possible, but without bending the design hugely. (For instance, the "initialize later" option prohibits immutability.) If you do use a virtual method in the constructor, document it very strongly. So long as everyone involved is aware of what it's doing, it shouldn't cause too many problems. I would try to limit the visibility though, as you've done in your first example.

EDIT: One thing which is important here is that there's a difference between C# and Java in order of initialization. If you have a class such as:

public class Child : Parent
{
    private int foo = 10;

    protected override void ShowFoo()
    {
        Console.WriteLine(foo);
    }
}

where the Parent constructor calls ShowFoo, in C# it will display 10. The equivalent program in Java would display 0.

Jon Skeet
Keep in mind that FxCop will also flag this as an issue, but as Jon points out, sometimes it is the better option. This is one of those FxCop issues that it is important to not ignore unless you fully understand the implications behind in what is happening.
Scott Dorman
@Scott: Agreed. It can be a pretty subtle cause of bugs.
Jon Skeet
Downvoters: please add comments if you think I'm wrong! I want to learn too...
Jon Skeet
+7  A: 

In C++, calling a virtual method in a base class constructor will simply call the method as if the derived class doesn't exist yet (because it doesn't). So that means that the call is resolved at compile time to whatever method it should call in the base class (or classes it derived from).

Tested with GCC, it allows you to call a pure virtual function from a constructor, but it gives a warning, and results in a link time error. It appears that this behavior is undefined by the standard:

"Member functions can be called from a constructor (or destructor) of an abstract class; the effect of making a virtual call (class.virtual) to a pure virtual function directly or indirectly for the object being created (or destroyed) from such a constructor (or destructor) is undefined."

Greg Rogers
+4  A: 

With C++ the virtual methods are routed through the vtable for the class that is being constructed. So in your example it would generate a pure virtual method exception since whilst BaseObject is being constructed there simply is no LoadStateCore method to invoke.

If the function is not abstract, but simply does nothing then you will often get the programmer scratching their head for a while trying to remember why it is that the function doesn't actually get called.

For this reason you simply can't do it this way in C++ ...

Rob Walker
In other words, it's not really routed through the vtable at all. It's called as though it were just an ordinary non-virtual method.
Rob Kennedy
@Rob: No. It is toued through the v-table and fails. It will NOT call the expected method.
Martin York
It's undefined and you therefore can't argue either way. Calling pure virtuals can invoke nasal demons, for all C++ cares.
MSalters
Yes, but calling a virtual method (not a pure virtual method) is perfectly well-defined, and that's what the question was about.
David Thornley
Ah, I see. What I forgot was the case of the virtual function called indirectly, through another member function: If the constructor is still running, the current class's method is called, not the derived version. When called directly from the constructor, it's the same effect as a non-virtual call.
Rob Kennedy
+2  A: 

For C++, section 12.7, paragraph 3 of the Standard covers this case.

To summarize, this is legal. It will resolve to the correct function to the type of the constructor being run. Therefore, adapting your example to C++ syntax, you'd be calling BaseObject::LoadState(). You can't get to ChildObject::LoadState(), and trying to do so by specifying the class as well as the function results in undefined behavior.

Constructors of abstract classes are covered in section 10.4, paragraph 6. In brief, they may call member functions, but calling a pure virtual function in the constructor is undefined behavior. Don't do that.

David Thornley
+1  A: 

Usually you can get around these issues by having a greedier base constructor. In your example, you're passing an XElement to LoadState. If you allow the state to be directly set in your base constructor, then your child class can parse the XElement prior to calling your constructor.

public abstract class BaseObject {
   public BaseObject(int state1, string state2, /* blah, blah */) {
      this.State1 = state1;
      this.State2 = state2;
      /* blah, blah */
   }
}

public class ChildObject : BaseObject {
   public ChildObject(XElement definition) : 
      base(int.Parse(definition["state1"]), definition["state2"], /* blah, blah */) {
   }
}

If the child class needs to do a good bit of work, it can offload to a static method.

Mark Brackett
+3  A: 

For C++ the base constructor is called before the derived constructor, which means that the virtual table (which holds the addresses of the derived class's overridden virtual functions) does not yet exist. For this reason, it is considered a VERY dangerous thing to do (especially if the functions are pure virtual in the base class...this will cause a pure-virtual exception).

There are two ways around this:

  1. Do a two-step process of construction + initialization
  2. Move the virtual functions to an internal class that you can more closely control (can make use of the above approach, see example for details)

An example of (1) is:

class base
{
public:
    base()
    {
      // only initialize base's members
    }

    virtual ~base()
    {
      // only release base's members
    }

    virtual bool initialize(/* whatever goes here */) = 0;
};

class derived : public base
{
public:
    derived ()
    {
      // only initialize derived 's members
    }

    virtual ~derived ()
    {
      // only release derived 's members
    }

    virtual bool initialize(/* whatever goes here */)
    {
      // do your further initialization here
      // return success/failure
    }
};

An example of (2) is:

class accessible
{
private:
    class accessible_impl
    {
    protected:
        accessible_impl()
        {
          // only initialize accessible_impl's members
        }

    public:
        static accessible_impl* create_impl(/* params for this factory func */);

        virtual ~accessible_impl()
        {
          // only release accessible_impl's members
        }

        virtual bool initialize(/* whatever goes here */) = 0;
    };

    accessible_impl* m_impl;

public:
    accessible()
    {
        m_impl = accessible_impl::create_impl(/* params to determine the exact type needed */);

        if (m_impl)
        {
            m_impl->initialize(/* ... */);  // add any initialization checking you need
        }
    }

    virtual ~accessible()
    {
        if (m_impl)
        {
            delete m_impl;
        }
    }

    /* Other functionality of accessible, which may or may not use the impl class */
};

Approach (2) uses the Factory pattern to provide the appropriate implementation for the accessible class (which will provide the same interface as your base class). One of the main benefits here is that you get initialization during construction of accessible that is able to make use of virtual members of accessible_impl safely.

Brian
+2  A: 

If you have a class as shown in your post, which takes an XElement in the constructor, then the only place that XElement could have come from is the derived class. So why not just load the state in the derived class which already has the XElement.

Either your example is missing some fundamental information which changes the situation, or there's simply no need to chain back up to the derived class with the information from the base class, because it has just told you that exact information.

i.e.

public class BaseClass
{
    public BaseClass(XElement defintion)
    {
        // base class loads state here
    }
}

public class DerivedClass : BaseClass
{
    public DerivedClass (XElement defintion)
        : base(definition)
    {
        // derived class loads state here
    }
}

Then your code's really simple, and you don't have any of the virtual method call problems.

Greg Beech
A: 

In C++ it is perfectly safe to call virtual functions from within the base-class - as long as they are non-pure - with some restrictions. However, you shouldn't do it. Better initialize objects using non-virtual functions, which are explicitly marked as being such initialization functions using comments and an appropriate name (like initialize). If it is even declared as pure-virtual in the class calling it, the behavior is undefined.

The version that's called is the one of the class calling it from within the constructor, and not some overrider in some derived class. This hasn't got much to-do with virtual function tables, but more with the fact that the override of that function might belong to a class that's not yet initialized. So this is forbidden.

In C# and Java, that's not a problem, because there is no such thing as a default-initialization that's done just before entering the constructor's body. In C#, the only things that are done outside the body is calling base-class or sibling constructors i believe. In C++, however, initializations done to members of derived classes by the overrider of that function would be undone when constructing those members while processing the constructor initializer list just before entering the constructors body of the derived class.

Edit: Because of a comment, i think a bit of clarification is needed. Here's an (contrived) example, let's assume it would be allowed to call virtuals, and the call would result in an activation of the final overrider:

struct base {
    base() { init(); }
    virtual void init() = 0;
};

struct derived : base {
    derived() {
        // we would expect str to be "called it", but actually the
        // default constructor of it initialized it to an empty string
    }
    virtual void init() {
        // note, str not yet constructed, but we can't know this, because
        // we could have called from derived's constructors body too
        str = "called it";
    }
private:
    string str;
};

That problem can indeed be solved by changing the C++ Standard and allowing it - adjusting the definition of constructors, object lifetime and whatnot. Rules would have to be made to define what str = ...; means for a not-yet constructed object. And note how the effect of it then depends on who called init. The feature we get does not justify the problems we have to solve then. So C++ simply forbids dynamic dispatch while the object is being constructed.

Johannes Schaub - litb
Your last paragraph is incorrect. Any initializers (initialization defined outside the constructor body) is run in order from most derived to least derived, and then subsequently the constructor bodies are run in order of least derived to most derived.
Greg Beech
fortunately, i explained the order thing in another answer: http://stackoverflow.com/questions/375141/how-to-pass-method-result-as-parameter-to-base-class-constructor-in-c#375195 :)
Johannes Schaub - litb
so while your comment is partially right, it's incomplete. because initializer lists interleave. data-members on the one side and base-class subobjects on the other side. the base-classes are initialized in order from derived-to-base, but data-members are initialized in order from base-to-derived.
Johannes Schaub - litb
A: 

For C++, read Scott Meyer's corresponding article :

Never Call Virtual Functions during Construction or Destruction

ps: pay attention to this exception in the article:

The problem would almost certainly become apparent before runtime, because the logTransaction function is pure virtual in Transaction. Unless it had been defined (unlikely, but possible) the program wouldn't link: the linker would be unable to find the necessary implementation of Transaction::logTransaction.

Comptrol