views:

1559

answers:

18

Should operations that could take some time be performed in a constructor, or should the object be constructed and then initialised later?

For example, when constructing an object that represents a directory structure, should the population of the object and its children be done in the constructor? Clearly, a directory can contain directories, which in turn can contain directories, and so on.

What is the elegant solution to this?

+12  A: 

You usually do not want the constructor to do any computation. Someone else using the code will not expect that it does more than basic set-up.

For a directory tree like you're talking about, the "elegant" solution is probably not to build a full tree when the object is constructed. Instead, build it on demand. Someone using your object may not really care what is in sub-directories, so start out by just having your constructor list the first level, and then if someone wants to descend into a specific directory, then build that portion of the tree when they request it.
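A minimal sketch of that on-demand idea, assuming C++17 and `std::filesystem`. The class name `LazyDirectory` and its members are made up for illustration, and error handling is reduced to "an unreadable directory looks empty":

```cpp
#include <cassert>
#include <filesystem>
#include <memory>
#include <optional>
#include <system_error>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical class: one node of a directory tree that reads its
// sub-directories only when they are first requested.
class LazyDirectory {
public:
    // Cheap constructor: remembers the path and touches no disk.
    explicit LazyDirectory(fs::path path) : path_(std::move(path)) {}

    const fs::path& path() const { return path_; }

    // True once this level has been read from disk.
    bool loaded() const { return children_.has_value(); }

    // The first call enumerates this one level; later calls reuse the result.
    const std::vector<std::unique_ptr<LazyDirectory>>& children() {
        if (!children_) {
            std::vector<std::unique_ptr<LazyDirectory>> found;
            std::error_code ec;  // a failed listing simply ends the loop
            for (fs::directory_iterator it(path_, ec), end; !ec && it != end;
                 it.increment(ec)) {
                std::error_code entry_ec;  // per-entry errors skip that entry
                if (it->is_directory(entry_ec)) {
                    found.push_back(std::make_unique<LazyDirectory>(it->path()));
                }
            }
            children_ = std::move(found);
        }
        return *children_;
    }

private:
    fs::path path_;
    std::optional<std::vector<std::unique_ptr<LazyDirectory>>> children_;
};
```

Constructing a `LazyDirectory` costs almost nothing; the disk is only read when `children()` is called on a node, and each child starts out unloaded in turn.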

SoapBox
+2  A: 

As for how much work should be done in the constructor, I'd say it should take into account how slow things are, how you are going to use the class, and generally how you feel about it personally.

On your directory structure object: I recently implemented a samba (windows shares) browser for my HTPC, and as that was incredibly slow I opted to only actually initialize a directory when it was touched. e.g. First the tree would consist of just a list of machines, then whenever you browse into a directory the system would automatically initialize the tree from that machine and get the directory listing one level deeper, and so on.

Ideally I think you could even take it as far as writing a worker thread that scans the directories breadth-first, and would give priority to the directory you're currently browsing, but generally that's just too much work for something simple ;)

Frans-Willem
+11  A: 

The time required should not be a reason not to put something into a constructor. You could put the code itself into a private function and call it from your constructor, just to keep the code in the constructor clear.

However, if the stuff you want to do is not required to give the object a defined condition, and you could do that stuff later on first use, this would be a reasonable argument to move it out and do it later. But don't make it dependent on the users of your class: these things (on-demand initialization) must be completely transparent to users of your class. Otherwise, important invariants of your object might easily break.

Johannes Schaub - litb
Litb, I don't believe you are suggesting that arbitrarily long things - like enumerating a directory tree - should go in a constructor... right?
Foredecker
Foredecker: It's about the semantics of the class: a MutexLocker could wait for an arbitrarily long time until a resource becomes available, since that is what constitutes the invariant of the MutexLocker (until the destructor is called = resource owned).
Johannes Schaub - litb
This is something I can implement. Right now, I have logic in my constructor which I wish to remove.
Thorpe Obazee
A: 

It really depends on the context, i.e. the problem that the class must solve. Should it for example always be able to show the current children inside itself? If the answer is yes, then the children should not be loaded in the constructor. On the other hand, if the class represents a snapshot of a directory structure then it can be loaded in the constructor.

Jonas Kongslund
+7  A: 

It depends (typical CS answer). If you are constructing objects at startup for a long-running program, then there is no problem with doing a lot of work in constructors. If this is part of a GUI where fast response is expected, it might not be appropriate. As always, the best answer is to try it the simplest way first, profile, and optimize from there.

For this specific case, you can do lazy construction of the sub-directory objects. Only create entries for the names of the top level directories. If they are accessed, then load the contents of that directory. Do this again as the user explores the directory structure.

KeithB
A: 

Try to have what you think is necessary there, and don't think about whether it will be slow or fast. Preoptimization is a time waster, so code it, profile it, and optimize it if needed.

pergoransson
There is a -huge- difference between "pre-optimization" and an architectural or design choice that is a performance bug. Performance is a feature - just like fancy UI, correctness, maintainability, and reliability. Architect and design for it.
Foredecker
Design for performance, but don't overdesign. Making your design overly complex because you think something may be slow can be even worse. The best advice is to keep your design flexible so you can fix problems when they occur. This also lets you meet changing requirements.
KeithB
+29  A: 

To summarize:

  • At a minimum, your constructor needs to get the object configured to the point that its invariants are true.

  • Your choice of invariants may affect your clients. (Does the object promise to be ready for access at all times? Or only in certain states?) A constructor that takes care of all of the set-up up-front may make life simpler for the class's clients.

  • Long-running constructors are not inherently bad, but may be bad in some contexts.

  • For systems involving user interaction, long-running methods of any type may lead to poor responsiveness, and should be avoided.

  • Delaying computation until after the constructor may be an effective optimization; it may turn out to be unnecessary to perform all the work. This depends on the application, and shouldn't be determined prematurely.

  • Overall, it depends.

Oddthinking
+3  A: 

For the sake of code maintenance, testing, and debugging I try to avoid putting any logic in constructors. If you prefer to have logic execute from a constructor then it is helpful to put the logic in a method such as init() and call init() from the constructor. If you plan on developing unit tests you should avoid putting any logic in a constructor since it may be difficult to test different cases. I think the previous comments already address this, but... if your application is interactive then you should avoid having a single call that leads to a noticeable performance hit. If your application is non-interactive (ex: nightly batch job) then a single performance hit isn't as big a deal.

rich
+4  A: 

The most important job of a constructor is to give the object an initial valid state. The most important expectation on the constructor, in my opinion, is that the constructor should have NO SIDE EFFECTS.

Hugo
"No Side Effects" is too strong. For example: The constructor for a semaphore may block other instances from completing their construction. That side-effect is its job. Another constructor might automatically assign the object a unique id from a (global or class-owned) incrementing counter. That's a legitimate side-effect, too.
Oddthinking
agree with @Oddthinking. Another example: if you have a constructor for something like `ofstream` do you still expect "NO SIDE EFFECTS"?
João Portela
A: 

As much as necessary and no more.

The constructor must put the object into a usable state, hence at a minimum your class variables ought to be initted. What initted means can have a broad interpretation. Here is a contrived example. Imagine you have a class that has the responsibility of providing N! to your calling application.

One way to implement it would be to have the constructor do nothing, with a member function with a loop that calculates the value needed and returns.

Another way to implement it would be to have a class variable which is an array. The constructor would set all the values to -1, to indicate that the value has not been calculated yet. The member function would do lazy evaluation: it looks at the array element, and if it is -1, it calculates the value, stores it, and returns it; otherwise it just returns the value from the array.
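As a rough sketch of this lazy variant (the class name `LazyFactorial` and the limit of 20 are my assumptions; 20! is the largest factorial that fits in a signed 64-bit integer):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical class: caches n! for 0 <= n <= 20.
class LazyFactorial {
public:
    static constexpr int kMax = 20;

    // The constructor only marks every slot as "not calculated yet".
    LazyFactorial() { cache_.fill(-1); }

    std::int64_t get(int n) {
        assert(n >= 0 && n <= kMax);
        if (cache_[n] == -1) {
            // Calculate once, store, and reuse on later calls.
            cache_[n] = (n == 0) ? 1 : n * get(n - 1);
        }
        return cache_[n];
    }

private:
    std::array<std::int64_t, kMax + 1> cache_;
};
```

The constructor stays cheap, and the cost of each value is paid at most once, on the first call that needs it.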

Another way to implement it would be just like the last one, only the constructor would precalculate the values and populate the array, so the method could just pull the value out of the array and return it.

Another way to implement it would be to keep the values in a text file, and use N as the basis for an offset into the file to pull the value from. In this case, the constructor would open the file, and the destructor would close the file, while the method would do some sort of fseek/fread and return the value.

Another way to implement it would be to precompute the values and store them as a static array that the class can reference. The constructor would have no work, and the method would reach into the array to get the value and return it. Multiple instances would share that array.
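The shared static table could look something like this, assuming C++17 so the table can be built at compile time. `make_factorials` and `TableFactorial` are illustrative names only:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds the table of 0! .. 20! at compile time.
constexpr std::array<std::int64_t, 21> make_factorials() {
    std::array<std::int64_t, 21> table{};
    table[0] = 1;
    for (int n = 1; n <= 20; ++n) {
        table[n] = table[n - 1] * n;
    }
    return table;
}

class TableFactorial {
public:
    TableFactorial() = default;  // no work at construction time

    std::int64_t get(int n) const {
        assert(n >= 0 && n <= 20);
        return kTable[n];
    }

private:
    // One table shared by every instance of the class.
    static constexpr std::array<std::int64_t, 21> kTable = make_factorials();
};
```

Here both the constructor and the method are trivial, which suits the "construct many, call many" case.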

That all being said, the thing to focus on is that generally you want to be able to call the constructor once, then use the other methods frequently. If doing more work in the constructor means that your methods have less work to do and run faster, then it is a good trade-off. If you are constructing/destructing a lot, like in a loop, then it is probably not a good idea to have a high cost for your constructor.

EvilTeach
A: 

Make sure the ctor does nothing that could throw an exception.

EricSchaefer
Could you please comment when downvoting? Why do you think my answer deserves it?
EricSchaefer
Using C++ with RAII model, it's expected that ctors will throw exceptions (eg, if a file doesn't exist). Usually, it is dtors that shouldn't throw exceptions. C++ defines constructor ordering so this is well defined.
Procedural Throwback
Even if C++ allows it, that does not mean it is good practice to have a constructor throw exceptions.
Aydya
So my answer is wrong for one pattern and still true for the rest. Throwing exceptions in ctors is usually not a good idea and considered bad practice.
EricSchaefer
MSalters
If you don't throw an exception in a constructor, how else do you tell the code that created the object that something went wrong?
KeithB
The ctor is for initializing your object. Alloc memory for your structures, set initial values and the like. There aren't very many reasons to throw an exception except for OOM. It is considered bad practice to put any logic in the ctor.
EricSchaefer
EricShaefer - could you provide your sources for your 'considered bad practice' comment?
Len Holgate
@Len Holgate: http://herbsutter.wordpress.com/2008/07/25/constructor-exceptions-in-c-c-and-java/ If your ctor does more than just simple initialization you might not be able to instantiate the class in isolation (think unit testing and reusability for instance).
EricSchaefer
+2  A: 

I would agree that long running constructors are not inherently bad. But I would argue that they are almost always the wrong thing to do. My advice is similar to that from Hugo, Rich, and Litb:

  1. Keep the work you do in constructors to a minimum - keep them focused on initializing state.
  2. Don't throw from constructors unless you cannot avoid it. I try to only throw std::bad_alloc.
  3. Don't call OS or Library APIs unless you know what they do - most can block. They will run quickly on your dev box and test machines, but in the field they can be blocked for long periods of time as the system is busy doing something else.
  4. Never, ever do I/O in a constructor - of any kind. I/O is commonly subject to all kinds of very long latencies (100's of milliseconds to seconds). I/O includes
    • Disk I/O
    • Anything that uses the network (even indirectly). Remember most resources can be off box.

I/O problem example: Many hard disks have a problem where they get into a state where they do not service reads or writes for 100's or even thousands of milliseconds. Early-generation solid state drives do this often. The user has no way of knowing that your program just hung for a bit - they just think it is your buggy software.

Of course, the evilness of a long running constructor is dependent on two things:

  1. What 'long' means
  2. How often, in a given period, objects with the 'long' constructors are constructed.

Now, if 'long' is simply a few 100 extra clock cycles of work, then it's not really long. But if a constructor is getting into the 100's of microseconds range, then I suggest it is pretty long. Of course, if you are only instantiating one of these, or instantiating them rarely (say one every few seconds), then you are not likely to see problems due to a duration in this range.

Frequency is an important factor: a 500 us ctor isn't a problem if you are only building a few of them, but creating a million of them (about 500 seconds of construction time in total) would pose a significant performance problem.

Let's talk about your example: populating a tree of directory objects inside a "class Directory" object. (Note: I'm going to assume this is a program with a graphical UI.) Here, your ctor duration isn't dependent on the code you write - it's dependent on the time it takes to enumerate an arbitrarily large directory tree. This is bad enough on a local hard drive. It's even more problematic on a remote (networked) resource.

Now, imagine doing this on your user interface thread - your UI will stop dead in its tracks for seconds, 10's of seconds or potentially even minutes. In Windows we call this a UI hang. They are bad bad bad (yes we have them... yes we work hard to eliminate them).

UI Hangs are something that can make people really hate your software.

The right thing to do here is simply initialize your directory objects. Then build your directory tree in a loop that can be canceled and that keeps your UI in a responsive state (the cancel button should always work).
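Here is one way such a cancelable loop might look, assuming C++17 and `std::filesystem`. The function name `scan_directories` and the `std::atomic<bool>` cancel flag are my own choices for the sketch; the idea is that it runs on a worker thread while the UI thread sets the flag when the user presses cancel:

```cpp
#include <atomic>
#include <cassert>
#include <deque>
#include <filesystem>
#include <system_error>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical function: breadth-first scan that checks a cancel flag
// before visiting each directory. Returns the directories visited so far.
std::vector<fs::path> scan_directories(const fs::path& root,
                                       const std::atomic<bool>& cancel) {
    std::vector<fs::path> found;
    std::deque<fs::path> pending{root};

    while (!pending.empty() && !cancel.load()) {
        fs::path current = std::move(pending.front());
        pending.pop_front();
        found.push_back(current);

        std::error_code ec;  // an unreadable directory is simply skipped
        for (fs::directory_iterator it(current, ec), end; !ec && it != end;
             it.increment(ec)) {
            std::error_code entry_ec;
            // Symlinks are not followed, to avoid cycles.
            if (it->is_directory(entry_ec) && !it->is_symlink(entry_ec)) {
                pending.push_back(it->path());
            }
        }
    }
    return found;
}
```

Because the flag is only checked between directories, a single very large or very slow directory can still delay cancellation; a real implementation would probably check inside the inner loop as well.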

Foredecker
This advice is completely appropriate when designing GUI apps, but not everything fits into this mold. I write large, long-running number-crunching code, and this advice is just wrong in that context.
KeithB
I don't disagree - I was pretty clear that my comments are primarily applicable to UI and code that needs to be responsive. Of course, not everything fits into one mold - there is rarely "the one true answer".
Foredecker
A: 

Arrays of objects will always use the default (no-arguments) constructor. There's no way around that.

There are "special" constructors: The copy constructor and operator=().

You can have a lot of constructors! Or wind up with a lot of constructors later on. Every now and then Bill out in la-la land wants a new constructor with floats rather than doubles to save those 4 lousy bytes. (Buy some RAM Bill!)

You can't call the constructor like you can an ordinary method to re-invoke that initialization logic.

You can't make the constructor logic virtual, and change it in a subclass. (Though if you are invoking an initialize() method from the constructor rather than manually, virtual methods won't work.)
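To illustrate the virtual-method caveat: in C++, while a base-class constructor is running, virtual calls resolve to the base-class version, because the derived part of the object does not exist yet. The `Widget` and `Button` names below are invented for the example:

```cpp
#include <cassert>
#include <string>

// Hypothetical base class whose constructor calls a virtual initialize().
class Widget {
public:
    Widget() { initialize(); }  // always runs Widget::initialize()
    virtual ~Widget() = default;

    virtual void initialize() { label_ = "Widget"; }
    const std::string& label() const { return label_; }

protected:
    std::string label_;
};

class Button : public Widget {
public:
    // Not reached from Widget's constructor, only from later calls.
    void initialize() override { label_ = "Button"; }
};
```

A freshly constructed `Button` therefore has the label "Widget"; only a later, manual call to `initialize()` dispatches to `Button::initialize()`.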

All these things create a lot of grief when significant logic exists in the constructor. (Or at least duplication of code.)

So I, as a design choice, prefer to have minimal constructors that (optionally, depending on their parameters and the situation) invoke an initialize() method.

Depending on the circumstances, initialize() may be private. Or it may be public and support multiple invocations (e.g. re-initializing).

Ultimately, the choice here varies based on the situation. We have to be flexible, and consider the tradeoffs. There is no one-size-fits-all.

The approach we'd use to implement a class with a single solitary instance that's using threads to talk to a piece of dedicated hardware and that has to be written in 1/2 an hour is not necessarily what we'd use to implement a class representing mathematics on variable-precision floating-point numbers written over many months.

Mr.Ree
+1  A: 

If something can be done outside a constructor, avoid doing it inside. Later, when you know your class is otherwise well-behaved, you might risk doing it inside.

Aydya
+1  A: 

RAII is the backbone of C++ resource management, so acquire the resources you need in the constructor, release them in the destructor.

This is when you establish your class invariants. If it takes time, it takes time. The fewer "if X exists do Y" constructs you have, the simpler the rest of the class will be to design. Later, if profiling shows this to be a problem, consider optimizations like lazy initialization (acquiring resources when you first need them).
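A small RAII sketch along these lines, using the C `FILE*` API. The `File` class is hypothetical; the point is that the constructor either establishes the invariant ("the handle is open") or throws, so no other member needs an "is it open?" check:

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper around a C FILE*.
class File {
public:
    // Acquire the resource; report failure by throwing.
    File(const std::string& path, const char* mode)
        : handle_(std::fopen(path.c_str(), mode)) {
        if (handle_ == nullptr) {
            throw std::runtime_error("cannot open " + path);
        }
    }

    // Release the resource; the destructor never throws.
    ~File() { std::fclose(handle_); }

    // Copying would close the same handle twice, so it is disabled.
    File(const File&) = delete;
    File& operator=(const File&) = delete;

    std::FILE* get() const {
        assert(handle_ != nullptr);  // invariant established by the constructor
        return handle_;
    }

private:
    std::FILE* handle_;
};
```

If opening the file is slow (a network share, say), the constructor is slow too; that is the cost of the simpler invariant, and lazy opening remains available as a later optimization.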

christopher_f
A: 

I vote for thin constructors, and adding an extra "uninitialized" state behavior to your object in that case.

The reason: if you do not, you force all your users either to have heavy constructors too, or to allocate your class dynamically. In both cases it may be seen as a hassle.

It may be hard to catch errors from such objects if they become static, because constructor then runs before main() and is more difficult for a debugger to trace.
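One possible shape for a thin constructor with an explicit "uninitialized" state, assuming C++17 for `std::optional`. The `Settings` class and its members are hypothetical:

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical class with a well-defined "uninitialized" state.
class Settings {
public:
    Settings() = default;  // thin: nothing here can be slow or fail

    bool initialized() const { return text_.has_value(); }

    // The potentially expensive step, run when the caller chooses.
    void initialize(std::string text) { text_ = std::move(text); }

    // Defined behaviour while uninitialized: an empty optional.
    std::optional<std::string> text() const { return text_; }

private:
    std::optional<std::string> text_;
};
```

An object like this can be a static or a plain member without dragging heavy work into its owner's constructor, at the price of every caller having to cope with the empty state.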

Pavel Radzivilovsky
A: 

Excellent question: the example you gave where a 'Directory' object has references to other 'Directory' objects is also a great example.

For this specific case I would move the code to build up subordinate objects out of the constructor (or maybe do the first level [immediate children] as another post here recommends), and have a separate 'initialize' or 'build' mechanism.

There is another potential issue otherwise - beyond just performance - that is memory-footprint: If you end up making very deep recursive calls, you will likely end up with memory problems as well [since the stack will be keeping copies of all the local variables until the recursion finishes].

monojohnny
A: 

Historically, I have coded my constructors so that the object is ready to use once the constructor method is complete. How much or how little code is involved depends on the requirements for the object.

For example, let's say I need to display the following Company class in a details view:

public class Company
{
    public int Company_ID { get; set; }
    public string CompanyName { get; set; }
    public Address MailingAddress { get; set; }
    public Phones CompanyPhones { get; set; }
    public Contact ContactPerson { get; set; }
}

Since I want to display all the information I have about the company in the details view, my constructor will contain all of the code necessary to populate every property. Given that each of these is a complex type, the Company constructor will trigger the execution of the Address, Phones and Contact constructors as well.

Now, if I am populating a directory listing view, where I may only need the CompanyName and the main phone number, I may have a second constructor on the class that only retrieves that information and leaves the remaining information empty, or I may just create a separate object that only holds that information. It really just depends on how the information is retrieved, and from where.

Regardless of the number of constructors on a class, my personal goal is to do whatever processing is necessary to prepare the object for whatever tasks may be imposed upon it.

Neil T.