views:

323

answers:

9

I have an abstract class in a library. I'm trying to make it as easy as possible to properly implement a derivation of this class. The trouble is that I need to initialize the object in a three-step process: grab a file, do a few intermediate steps, and then work with the file. The first and last step are particular to the derived class. Here's a stripped-down example.

abstract class Base
{
    // grabs a resource file specified by the implementing class
    protected abstract void InitilaizationStep1();

    // performs some simple-but-subtle boilerplate stuff
    private void InitilaizationStep2() { return; }

    // works with the resource file
    protected abstract void InitilaizationStep3();

    protected Base()
    {
        InitilaizationStep1();
        InitilaizationStep2();
        InitilaizationStep3();
    }
}

The trouble, of course, is the virtual method call in the constructor. I'm afraid that the consumer of the library will find themselves constrained when using the class if they can't count on the derived class being fully initialized.

I could pull the logic out of the constructor into a protected Initialize() method, but then the implementer might call Step1() and Step3() directly instead of calling Initialize(). The crux of the issue is that there would be no obvious error if Step2() is skipped; just terrible performance in certain situations.

I feel like either way there is a serious and non-obvious "gotcha" that future users of the library will have to work around. Is there some other design I should be using to achieve this kind of initialization?

I can provide more details if necessary; I was just trying to provide the simplest example that expressed the problem.

+3  A: 

That's way too much to place in the constructor of any class, much less of a base class. I suggest you factor that out into a separate Initialize method.

John Saunders
I agree. Another approach would be to service out the dependencies this object has during creation.
Nissan Fan
I like your solution, but the other possibility to consider might be to break the code in steps 1 and 3 out into a pair of interfaces that are passed in to the constructor. The constructor can call them at the right moment to do their part. In short, this may well be one of those places where composition beats out inheritance.
Steven Sudit
This does not address his concerns about calling steps out of sequence and a way to simplify create of derived classes.
eschneider
First, unless there's something about this class hierarchy that makes it inherent that init will always take three steps, in this sequence, then it should not be imposed by the base class. Second, if it _is_ inherent, then the base class Initialize method can call virtual Step1, Step2, Step3 initialization methods, in sequence: the Template Function Pattern.
John Saunders
The sequence is inherent. Basically, a file is loaded in Step1, and Step3 defines a binding from program data onto the file. Because the binding process is expensive, Step2 makes sure the binding is lazy and cached. My current implementation is basically a Template Method Pattern, but the safe, idiomatic TMP in C# requires an explicit call to the Initialize() method either in the base constructor or in the calling code. The former is preferable to the latter, but the consequences of failure in either case could be huge memory consumption occurring long after the code is send into the wild.
WCWedin
You've just stated that the implementation is inherent to the classes. The fact that a file is used is an implementation issue. The existing of a mapping is an implementation issue. These should not be part of the nature of these classes. I would recommend you factor this code out into a hierarchy of helper classes, called by the original classes.
John Saunders
A quick correction: When I said "C# requires an explicit call to the Initialize() method either in the base constructor or in the calling code," I of course meant to say the derived constructor. That aside, as I refactored the code to try out a few ideas, in pretty much all cases, all of loading and binding code naturally found its way into helper classes. In short, you're right that the base class had way too many responsibilities.
WCWedin
+1  A: 

Edit: I answered this for C++ for some reason. Sorry. For C# I recommend against a Create() method - use the constructor and make sure the objects stays in a valid state from the start. C# allows virtual calls from the constructor, and it's OK to use them if you carefully document their expected function and pre- and post-conditions. I inferred C++ the first time through because it doesn't allow virtual calls from the constructor.

Make the individual initialization functions private. The can be both private and virtual. Then offer a public, non-virtual Initialize() function that calls them in the correct order.

If you want to make sure everything happens as the object is created, make the constructor protected and use a static Create() function in your classes that calls Initialize() before returning the newly created object.

280Z28
+1 for the static Create method.
Fredrik Mörk
What's the point of a private virtual function - it cannot be overridden.
John Saunders
@John: Sorry. C# allows virtual calls from the constructor, and it's OK to use them if you carefully document their expected function and pre- and post-conditions. I inferred C++ the first time through because it doesn't allow virtual calls from the constructor.
280Z28
The Create() method, as I see it, is similar to a factory pattern. While it is quite a bit more limited, it has a much smaller footprint in terms of API size, and might meet my needs in this situation. You did recommend against it; is there some critical limitation I'm overlooking? It's a pattern I've used before (for much less important code) and it seemed to work alright. Convince me it's a bad idea, or I'll give it serious thought. :)
WCWedin
+1  A: 

In lots of cases, initialization stuff involves assigning some properties. It's possible to make those properties themselves abstract and have derived class override them and return some value instead of passing the value to the base constructor to set. Of course, whether this idea is applicable depends on the nature of your specific class. Anyway, having that much code in the constructor is smelly.

Mehrdad Afshari
An interesting idea, but unfortunately not applicable in my case. Thanks anyway!
WCWedin
+1  A: 

At first sight, I would suggest to move this kind of logic to the methods relying on this initialization. Something like

public class Base
{
   private void Initialize()
   {
      // do whatever necessary to initialize
   }

   public void UseMe()
   {
      if (!_initialized) Initialize();
      // do work
   }
}
jeroenh
You and ValdV had similar suggestions; arbitrarily, I responded to his: http://stackoverflow.com/questions/1155305/how-to-avoid-calling-viritual-methods-from-a-base-constructor/1155502#1155502 .
WCWedin
+4  A: 

I would consider creating an abstract factory that is responsible for instantiating and initializing instances of your derived classes using a template method for initialization.

As an example:

public abstract class Widget
{
    protected abstract void InitializeStep1();
    protected abstract void InitializeStep2();
    protected abstract void InitializeStep3();

    protected internal void Initialize()
    {
        InitializeStep1();
        InitializeStep2();
        InitializeStep3();
    }

    protected Widget() { }
}

public static class WidgetFactory
{
    public static CreateWidget<T>() where T : Widget, new()
    {
        T newWidget = new T();
        newWidget.Initialize();
        return newWidget;
    }
}

// consumer code...
var someWidget = WidgetFactory.CreateWidget<DerivedWidget>();

This factory code could be improved dramatically - especially if you are willing to use an IoC container to handle this responsibility...

If you don't have control over the derived classes, you may not be able to prevent them from offering a public constructor that can be called - but at least you can establish a usage pattern that consumers could adhere to.

It's not always possible to prevent users of you classes from shooting themselves in the foot - but, you can provide infrastructure to help consumers use your code correctly when they familiarize themselves with the design.

LBushkin
His question is on a clear way to implement a abstract class, not the creation of the derived versions.
eschneider
I disagree. He is looking for a pattern where derived classes will perform structured initialization - and he is concerned that those derivatives may not perform all the necessary initialization steps in the appropriate order. A factory that performs both construction and initialization can help resolve this problem. Particularly since the alternative of calling virtual methods in a constructor is (in general) an undesirable practice.
LBushkin
LBushkin is right about my needs. I'm not sure your suggestion is a literal implementation of an abstract factory, but I do like that generics take the burden off the implementer, in that there is no need for a DerivedFactory class. Asking the writer of Derived to make the constructor protected is pretty reasonably. My main concern would be the cost to the simplicity of the calling code. Moreover, for API consistency, I would probably want to move all object generation into factories. This might be beneficial overall, but it's something I'll have to consider.
WCWedin
quote "I'm trying to make it as easy as possible to properly implement a derivation of this class"
eschneider
+1  A: 

Since step 1 "grabs a file", it might be good to have Initialize(IBaseFile) and skip step 1. This way the consumer can get the file however they please - since it is abstract anyways. You can still offer a 'StepOneGetFile()' as abstract that returns the file, so they could implement it that way if they choose.

DerivedClass foo = DerivedClass();
foo.Initialize(StepOneGetFile('filepath'));
foo.DoWork();
s_hewitt
I like the idea of pushing separate objects into the initialization method; it removes the potential holes created by the Step1() and Step3() methods. It puts a little more burden on the derived class, but it makes it such that there's only one way to use the base class. The road is longer, but straighter. I think I would prefer to keep Initialize protected for my needs, but the principle is the same.
WCWedin
+1  A: 

You could employ the following trick to make sure that initialization is performed in the correct order. Presumably, you have some other methods (DoActualWork) implemented in the base class, that rely on the initialization.

abstract class Base
{
    private bool _initialized;

    protected abstract void InitilaizationStep1();
    private void InitilaizationStep2() { return; }
    protected abstract void InitilaizationStep3();

    protected Initialize()
    {
        // it is safe to call virtual methods here
        InitilaizationStep1();
        InitilaizationStep2();
        InitilaizationStep3();

        // mark the object as initialized correctly
        _initialized = true;
    }

    public void DoActualWork()
    {
        if (!_initialized) Initialize();
        Console.WriteLine("We are certainly initialized now");
    }
}
VladV
When I looked into using this pattern, I discovered that my class had too many entry points to make this approach clean and practical. In particular, there are a number of overloads for receiving binding information and retrieving that information in various formats. This in turn, suggested that I might want to refactor these responsibilities into a new class. Nice idea anyway, though perhaps a bit hacky. When multiple methods require proper initialization, doesn't factoring out the common code take you in the direction of a factory patten? Anyway, thanks for the advice and inspiration.
WCWedin
A: 

I would do something like this below:

public abstract class FileWorkProcessBase
{

    public FileWorkProcessBase()
    {
    }


    public void BeginProcessingFile()
    {

        GrabFile();

        InitalizeFile();

        ProcesFile();
    }

    protected abstract void GrabFile();

    private void InitalizeFile(){
        //Do common file initialization here...
    }

    protected abstract void ProcesFile();

}


public class ExampleFileWorkProcess : FileWorkProcessBase
{
    protected override void GrabFile()
    {
        throw new NotImplementedException();
    }

    protected override void ProcesFile()
    {
        throw new NotImplementedException();
    }
}
eschneider
Processing is moved out of the constructor (also you can easily identify when you start to process a file), sequence of events is controlled, simple for someone to create a derived version and do his own thing of step 1 (Grab file) and step 3 (Process File).
eschneider
Care to say why the down votes?
eschneider
I didn't downvote, but I wanted to ask you a few things: This doesn't look very different from John Saunders suggestion, which you disagreed with. From what I can tell, this still suffers from the possibility that the ExampleFileWorkProcess might call GrabFile() or ProcessFile() directly instead of InitializeFile(). Am I missing something? Also, is there some reason the implementations of the abstract methods throw exceptions, or is that just filler code?
WCWedin
John Saunders did not provide any code, he does have some good points, I also commented why. yes someone could call things out of order, but I would argue they would not do it by accidentally. The throws are the VS code generated by VS when to implement a abstract class to let the developer know what needs to be mplemented.
eschneider
A: 

I wouldn't do this. I generally find that doing any "real" work in a constructor ends up being a bad idea down the road.

At the minimum, have a separate method to load the data from a file. You could make an argument to take it a step further and have a separate object responsible for building one of your objects from file, separating the concerns of "loading from disk" and the in-memory operations on the object.

kyoryu