views:

242

answers:

9

Many people have argued about function size. They say that functions in general should be pretty short. Opinions vary from something like 15 lines to "about one screen", which today is probably about 40-80 lines.
Also, functions should always fulfill one task only.

However, there is one kind of function that frequently fails in both criteria in my code: Initialization functions.

For example in an audio application, the audio hardware/API has to be set up, audio data has to be converted to a suitable format and the object state has to properly initialized. These are clearly three different tasks and depending on the API this can easily span more than 50 lines.

The thing with init-functions is that they are generally only called once, so there is no need to re-use any of the components. Would you still break them up into several smaller functions would you consider big initialization functions to be ok?

+5  A: 

If breaking into smaller parts makes code better structured and/or more readable - do it no matter what the function does. It not about the number of lines it's about code quality.

sharptooth
+2  A: 

In a situation like this I think it comes down to a matter of personal preference. I prefer to have functions do only one thing so I would split the initialization into separate functions, even if they are only called once. However, if someone wanted to do it all in a single function I wouldn't worry about it too much (as long as the code was clear). There are more important things to argue about (like whether curly braces belong on their own separate line).

TLiebe
+1  A: 

If you have a lot of components the need to be plugged into each other, it can certainly be reasonably natural to have a large method - even if the creation of each component is refactored into a separate method where feasible.

One alternative to this is to use a Dependency Injection framework (e.g. Spring, Castle Windsor, Guice etc). That has definite pros and cons... while working your way through one big method can be quite painful, you at least have a good idea of where everything is initialized, and there's no need to worry about what "magic" might be going on. Then again, the initialization can't be changed after deployment (as it can with an XML file for Spring, for example).

I think it makes sense to design the main body of your code so that it can be injected - but whether that injection is via a framework or just a hard-coded (and potentially long) list of initialization calls is a choice which may well change for different projects. In both cases the results are hard to test other than by just running the application.

Jon Skeet
+12  A: 

I would still break the function up by task, and then call each of the lower level functions from within my public-facing initialize function:

void _init_hardware() { }
void _convert_format() { }
void _setup_state() { }

void initialize_audio() {
    _init_hardware();
    _convert_format();
    _setup_state();
}

Writing succinct functions is as much about isolating fault and change as keeping things readable. If you know the failure is in _convert_format(), you can track down the ~40 lines responsible for a bug quite a bit faster. The same thing applies if you commit changes that only touch one function.

A final point, I make use of assert() quite frequently so I can "fail often and fail early", and the beginning of a function is the best place for a couple of sanity-checking asserts. Keeping the function short allows you to test the function more thoroughly based on its more narrow set of duties. It's very hard to unit-test a 400 line function that does 10 different things.

meagar
+1 for `assert()` alone.
ndim
+1: "there is no need to re-use any of the components." Re-use isn't the issue. Writing something that can be understand and maintained by other people is far, far more important.
S.Lott
I remember a piece of advise about leaving identifiers starting with underscores to the C compiler's internal use and avoiding them in programs. Also, you should mark those three once-off init functions as `static`. For once, they are not going to be used outside the current source file. And as an additional benefit, a smart compiler will then see that they are only called once, and just inline the code (just in case you are worried about that calling overhead).
ndim
@ndim This was actually my attempt to write pseudocode :p
meagar
+3  A: 

I would still try to break up the functions into logical units. They should be as long or as short as makes sense. For example:

SetupAudioHardware();
ConvertAudioData();
SetupState();

Assigning them clear names makes everything more intuitive and readable. Also, breaking them apart makes it easier for future changes and/or other programs to reuse them.

Paul Williams
+1  A: 

First, a factory should be used instead of an initialization function. That is, rather than have initialize_audio(), you have a new AudioObjectFactory (you can think of a better name here). This maintains separation of concerns.

However, be careful also not to abstract too early. Clearly you do have two concerns already: 1) audio initialization and 2) using that audio. Until, for example, you abstract the audio device to be initialized, or the way a given device may be configured during initialization, your factory method (audioObjectFactory.Create() or whatever), should really be kept to just one big method. Early abstraction serves only to obfuscate design.

Note that audioObjectFactory.Create() is not something that can be unit-tested. Testing it is an integration test, and until there are parts of it that can be abstracted, it will remain an integration test. Later on, you may find that the you have multiple different factories for different configurations; at that point, it might be beneficial to abstract the hardware calls into an interface, so you that you can create unit tests to ensure the various factories configure the hardware in a proper way.

Sam Pearson
Actually, that is pretty much what I am doing. The class itself is supposed to be an audio player object that abstracts the handling of the mess that is CoreAudio. Really, audio data preparation and audio hardware initialization are very tightly connected since different audio data demands a different configuration and vice versa.
BastiBechtold
+1  A: 

I think it's the wrong approach to try and count the number of lines and determine functions based on that. For something like initialization code I often have a separate function for it, but mostly so that the Load or Init or New functions aren't cluttered and confusing. If you can separate it into a few tasks like others have suggested, then you can name it something useful and help organize. Even if you are calling it just once, it's not a bad habit, and often you find that there are other times when you may want to re-init things and can use that function again.

Thyamine
+1  A: 

Just thought I'd throw this out there, since it hasn't been mentioned yet - the Facade Pattern is sometimes cited as an interface to a complex subsystem. I haven't done much with it myself, but the metaphors are usually something like turning on a computer (requires several steps), or turning on a home theater system (turn on TV, turn on receiver, turn down lights, etc...)

Depending on the code structure, might be something worth considering to abstract away your large initialization functions. I still agree with meagar's point though that breaking down functions into _init_X(), _init_Y(), etc. is a good way to go. Even if you aren't going to reuse comments in this code, on your next project, when you say to yourself, "How did I initialize that X-component?", it'll be much easier to go back and pick it out of the smaller _init_X() function than it would be to pick it out of a larger function, especially if the X-initialization is scattered throughout it.

awshepard
+1  A: 

Function length is, as you tagged, a very subjective matter. However, a standard best-practice is to isolate code that is often repeated and/or can function as its own entity. For instance, if your initialization function is loading library files or objects that will be used by a specific library, that block of code should be modularized.

With that said, it's not bad to have an initialization method that's long, as long as it's not long because of lots of repeated code or other snippets that can be abstracted away.

Hope that helps,
Carlos Nunez

Carlos Nunez