views:

256

answers:

13

This is a cross language question on coding style.

I have to work with a lot of code that has very long code blocks, sometimes hundreds of lines, inside if statements or for loops. The code is procedural.

Code like the following

if(condition){
    //hundreds of lines of code
}else if{
    //hundreds of lines of code
} else {
    //hundreds of lines of code
}

I have trouble navigating this code if I haven't seen it in a while because I constantly have to scroll back and forth to check which branch of the statement I'm in or whether I'm in a loop, or what the loop iterator is called.

My hunch is to put pieces of the long lines of code inside functions and call them from within the branches of the statements or inside the loops, so the loops and if trees are shorter and therefore more readable. I'd create functions that are sensible code-silos ofcourse, not just cut up the current code haphazardly.

But, and here's my question: is that a good idea? Or is it not a bad coding style to have hundreds of lines of code inside an if statement or a for loop? I'm not a really experienced programmer but I do appreciate clean code :)

Thanks!

Added: The hundreds of lines of code are not repetitive, most of the time. I understand and try to adhere to the DRY principle of course.

+6  A: 

From what you say, I'd refactor the contents of those blocks into separate procedures (giving them useful names and passing arguments as required).

if(condition){
    DoA();
}else if{
    DoB();
} else {
    DoC();
}

procedure DoA() {
    //hundreds of lines of code
}

procedure DoB() {
    //hundreds of lines of code
}

procedure DoC() {
    //hundreds of lines of code
}

Of course, it may be useful to do this inside of the DoX-procedures as well.

Lucero
Agree totally. It sounds like you haven't broken down common tasks into an appropriate granularity.
Thomas Owens
disagree. Putting the hundreds of lines in a single function is probably not the best way to go, depending on the nature of the code. It may be best to have instead of a single call to a hundreds of lines function, one or two or three function calls with small amount of lines each (if it makes sense, which usually does)
Vinko Vrsalovic
The if-blocks should only call a single function each. However, the DoX functions should again be split into more functions. You should never have hundreds of lines of code inside a single procedure - ever. Split them into all logical parts such that the code gets readable without having to think too much..
stiank81
As Vinko said automatically splitting into functions isn't always best answer - good indicator is when you can't find names for these new functions. First check repetitive parts of code that can be broken into meaningful functions(you can even use tool like ccfinder to find similar code).
MaR
@bambuska: So you are advocating that every if (condition) { ... } should have a single function call inside? I obviously disagree, you should have as many function calls as it makes sense.
Vinko Vrsalovic
@Vinko: Not necessarily always, but I often find that being best practice. This mostly enhances increased readability. Again - not always, but for the case described here it sounds like he needs to do "a lot" and this "a lot" can probably be described by a name.
stiank81
@bambuska: It could make sense, but we don't have enough information to be sure. The general case is to split the lines in as many functions as required, and if the whole has a **single purpose** you could then group all the smaller functions in a single function called by the if(). Just wrap it in a name that has 5 purposes does no good.
Vinko Vrsalovic
Actually, I'm not describing a concrete case, more a coding style that is used in the legacy code I work with. And I tend to agree with Vinko Vrsalovic more in that I'd rather have 5 readable function calls per branch of the if statement than 1 function call that's 'artificially' put together.
Niels Bom
+4  A: 

It is awful coding style to have hundreds of lines in a single function. Have small functions, hopefully that fit in a single screen and that serve a single purpose, so that your code ends looking up like

if (condition) {
    func1(); //One purpose
    func2(); //Another purpose
    func3();
} else if (another condition) {
    func4();
} else {
    func5();
    func6();
    func3(); //wow I can reuse some code!
}

Don't go on the easy route to put the whole hundreds of lines of each case in a single function either. Try to isolate purposes (this function create a temporary file, for example) and assign functions to them. You will probably be surprised to see your total line count to go down, as there is very likely repeated functionality in those hundreds of lines.

Vinko Vrsalovic
This answer is very similar to ChrisF's (imho) and equally good, too bad I can't give two accepted answers.
Niels Bom
+9  A: 

It's generally a good idea to be able to see the whole of a method on one screen. If you have to scroll up and down too much (some would argue at all) then you run the risk of missing information.

So in general what you're proposing is a good idea. Though you needn't go as far as putting all the code in one branch into one method. There might be repeated sections of code so you could break it up as follows:

if(condition){
    CommonMethod();
    SpecificMethodA();
}else if{
    CommonMethod();
    SpecificMethodB();
} else {
    CommonMethod();
    SpecificMethodC();
}

as an example. Though the exact nature of the refactoring will depend totally on your code.

ChrisF
+1 This is what I would have said. I would look at the hundreds of lines of code to see if I could reduce the amount of duplicate code
SwDevMan81
A: 

Procedures/Function are used to replace repetative code, where you only require minor details/values to change. This reduces the redundancy, and maintinance of the code. But yes you can also use it (with meaningfull names) to reduce such big code blocks.

astander
A: 

In addtion to the other answers: If you want to know more about this, try reading "Clean Code" by Robert C. Martin.

nfechner
I have that book, it's right behind a couple of other books on my reading list :)
Niels Bom
+1  A: 

In my opinion as a long time coder and code reviewer, the answer (as usual) is sometimes. Here are reasons I would split the code into separate functions:

  1. The code would be more understandable if assigned function names. I find that comments are still rare, and creating a function name to describe a block of code is one way to document what the block of code does.
  2. The parent function (as you note) would be more manageable if it weren't so long.
  3. If the function can be designed intelligently, it might be useful elsewhere, or even help you re-factor the existing code into a smaller amount of code.

But there are also reasons to keep the code in the initial function:

  1. It might be more of a pain to navigate back and forth to the child functions and back if the internal code is highly interactive with the parent function code.
  2. The code could be less readable and maintainable if the internal code is highly interactive with the parent function code, having to pass many parameters to maintain the state of what the child function needs to process.
  3. The code might not be unreadable or unmanageable in the first place if the internal code is relatively repetitive and simple. But keep in mind that if it's repetitive and simple, that might also make it a good candidate for re-factoring for other reasons.
BlueMonkMN
1.- If it is a pain to navigate, you are not using the right tools 2.- If you have that many parameters in the parent, you already have that problem 3.- why would you NOT refactor code into a function if it is repetitive?? That's what functions are for!
Vinko Vrsalovic
Passing around a lot of parameters is one of the possible disadvantages I see too. I don't really understand the relevance of the 3rd disadvantage you list. Repetitive code can be put into a function, DRY principle, clearly. But if a code block has a huge number of small and simple steps, it's still hard to navigate, right?
Niels Bom
Vinko, 1. Since this question is not specific to a language, it's possible it would apply to some languages where nice tools or environments don't exist. But generally I agree. 2. I'm talking about local variables used by the inner code. Every local variable referenced by the inner code would have to be turned into a parameter. 3. Example: if (x.P1==1) x.p="P1"; if (x.P2==1) x.p="P2"; ...; if (x.P49==1) x.p="49";. It might not make much sense to re-factor that repetitive code into a function. NOTICE, I *DID* say that simple/repetitive code should sometimes be re-factored for other reasons.
BlueMonkMN
Niels, I seem to have stirred up some confusion in my comment about not refactoring simple, repetitive code. In many cases you do want to re-factor it, but in some cases it can't be collapsed into something simpler without complicating the code. I think I would rather see this: if (x.P1==1) x.p="P1"; if (x.P2==1) x.p="P2"; ...; if (x.P49==1) x.p="49"; instead of some function that performs reflection. In this example, this should probably be a function of x, but I imagine other cases might not lend themselves to OO refactoring so easily and would be better left in-line.
BlueMonkMN
+1  A: 

It's not a good coding style.
Breaking the code into smaller procedures makes it much more readable and maintanable.

Having said that. Don't immediately go and start cutting these up into smaller procedures.
Only change what you are reading regularily, or which you need to update.

Remember, every time you change code you need to test your change, so don't change needlessly, as every single time you change code, there is a potential to introduce a new bug.

Bravax
A: 

You should really consider decomposition of such a blocks into classes. OOP provides you with all techniques required. Template method pattern for example solves a problem with presence of CommonAction among condition-specific actions. This is just the tip of OOP iceberg.

archimed7592
Are you aware of effort required to convert mammoth procedural code into OOP equivalent? If you factor that in, is there real benefit?
MaR
If the code is going to be modified heavily a lot of times, it may be worth it in a longer term.
ya23
I'm aware of what you are calling an effort. This effort won't be required if you would have preferred writing the code in OOP style instead of mammoth-like style. If so there wouldn't be any need in converting such a mammoth constructions in something usable.
archimed7592
Doing things better from the start is an obvious choice. But the problem lies in the big amounts of procedural code. If I put in months and months of work to make the code OO but the functionality itself does not change, who is going to pay for my work? If there was sufficient time and money I would probably rewrite it all in OO, but that is not the case.
Niels Bom
+2  A: 

Have this in mind: You - or anyone else reading the code - should understand each function easily when reading it. If the function contains hundreds of lines of code it will never be easily understandable. If you find yourself making notes when reading to understand what the code does it needs refactoring.

The example you're showing is definitely of the kind that should be refactored. Each block of the if-clause should ideally call one function only.

if (condition) 
{ 
    doSomething(); 
} 
else if (anotherCondition) 
{   
    doThisOtherThing();
}
else
{   
    doThisThirdThing();
}

You see how easily understandable the code is? There is really nothing to understand there - except parhaps the condition, but if this is complex it should be factored out as a separate function too.

Now, the doSomething(), doThisOtherThing() and doThisThirdThing() function should not be the plain chunks of hundres lines of code. No function should ever be that long. It will always be difficult to understand that long functions. Try finding logical parts of your functions that can be factored out as separate functions and do so. If there are common parts that should be done in all of the cases a common function should be created and called from all of the doX-functions.

Note: You don't need code reuse to create a function. You can factor out something in a new function just to make the code more readable.

Good luck refactoring! :-)

stiank81
+1  A: 

I once worked in a project where there was an official limit of 20 lines of code per function. It was hell!

I might be working with masks, moving data in or out of GUI elements, and then I'd be forced to write

transferFirst10Elements()
transferElements11Thru20()
transferElements21Thru30()
transferElements31Thru40()

and railed against our policy as an example of dogma taken too far.

Just to clarify, I think that initial function should indeed be broken up, but I don't agree with some other answerers that the breakdown should continue until there are no more fragments bigger than a screenful. There comes a point where managing zillions of tiny functions is more painful than managing somewhat fewer larger ones.

Carl Smotricz
Agreed, there's a balance in the use of any "rule".
Niels Bom
Yes, you should **aim to** have one function per screen, but of course that shouldn't be an enforceable rule. I'm horrified to know that people has actually done that.
Vinko Vrsalovic
+1  A: 

The fact that you are experiencing problems whenever you revisit the code is clear enough indication that there is a problem.

Contrary to most of the the other suggestions here, I'd take a slightly different approach.

Trying to follow some of the suggested rules about where and how much to split may work, but it can also lead to situations where the breakdown itself is unmanageable. Instead of scrolling up and down to familiarise yourself again, you'd just be navigating to a bunch of smaller routines scattered throughout your source files.

Instead...

I suggest you try to focus on splitting your code into a number of methods where each method performs one task which is clearly indicated by the name of the method. Code blocks like these are usually large because they are doing multiple things. By splitting them in this way, you'll find that your code will naturally evolve into a more manageable style with less risk of the problem mentioned before.

It may also help to work in the other direction. Take one of your larger code blocks, try to (while ignoring the actual code) define what it does in a number of simpler steps, then code it accordingly. Remember, it must do one thing; but may require multiple steps to get the job done. This will provide natural breakdown of the large code-block.

NOTE: I would never advocate imposing line limits on routines that perform a single task; for example a routine to 'Populate Edit Controls' may be many lines because of the number of the number of controls, but the mistake some people make is to do 'simple calculations' in addition to populating the controls.

An additional benefit to this approach is that you're more likely to have routines that can be reused because they don't have 'excess baggage'.

Craig Young
In short, write simple clearly-named functions that do only one thing, right? A function that does "Populate Edit Controls" might be hundreds of lines itself, but if it can't logically be divided further, I don't think it should be, just for the sake of having short functions.
Niels Bom
@Niels Exactly! **clearly-named** and **one thing** are the key objectives, and there are a number of techniques with which one can try to achieve them.
Craig Young
A: 

I'd second the template/pattern suggestion by archimed (sorry I'd vote you up, but my rep isn't that high yet apparently). I was thinking of a Strategy pattern myself, but Template would work out just as well.

I'm actually really surprised that it got voted down. Having done it both ways (via moving the code into methods versus implementing a pattern) and I've found that the amount of work to implement a pattern for those chunks of code is really not that much different. And as ya23 said, the long term benefits of having the pattern implemented are pretty substantial.

Arthur C
A: 

Your idea of extracting the long code fragments inside of the "if" blocks is on the right track; however, I would start much smaller than that. Look for smaller bits of repeated code - I'd say 3-10 lines - that do something very understandable, and extract that to a method (which should be very easy to name). If possible, do the extracted using automated refactoring tools (to avoid errors) and add unit tests for the extracted method. Now, repeat this a couple of times, and you should see patterns and structures begin to emerge. You'll likely end up with a structure similar to that of a previous answer:

if (condition) {
    func1(); 
    func2(); 
    func3();
} else if (another condition) {
    func4();
} else {
    func5();
    func6();
    func3();
}
glaxaco