views: 356
answers: 2

I was doing a little exploring of a legacy system I maintain with NDepend (great tool, check it out) the other day. My findings almost made me spray a mouthful of coffee all over my screen. The top 3 functions in this system, ranked by descending cyclomatic complexity, are:

  1. SomeAspNetGridControl.CreateChildControls (CC of 171!!!)
  2. SomeFormControl.AddForm (CC of 94)
  3. SomeSearchControl.SplitCriteria (CC of 85)

I mean 171, wow!!! Shouldn't it be below 20 or something? So this made me wonder. What is the most complex function you maintain or have refactored? And how would you go about refactoring such a method?

Note: The CC I measured is over the code, not the IL.

+3  A: 

This is kid stuff compared to some 1970s-vintage COBOL I worked on years ago. We used the original McCabe tool to graphically display the CC for some of the code. The printout was pure black because the lines showing the functional paths were so densely packed and spaghetti-like. I don't have a figure, but it had to be way higher than 171.

What to do

Per Code Complete (first edition):

If the score is:

  • 0-5 - the routine is probably fine
  • 6-10 - start to think about ways to simplify the routine
  • 10+ - break part of the routine into a second routine and call it from the first routine

Might be a good idea to write unit tests as you break up the original routine.
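For example, here is a minimal sketch of that "break part of the routine into a second routine" step (hypothetical code and names, in C++ purely for illustration): extracting part of the decision tree into a second routine moves those branches out of the original, so its CC drops, and the new routine becomes a small piece you can unit test on its own.

    // Hypothetical example: validation logic extracted from a long routine.
    // The extracted helper takes part of the original decision tree with it,
    // so the caller's cyclomatic complexity drops and the helper can be
    // unit-tested in isolation.
    bool isValidQuantity(int quantity) {
        if (quantity < 0)   return false;   // these decisions are counted here...
        if (quantity > 999) return false;
        return true;
    }

    void processOrder(int quantity) {
        if (!isValidQuantity(quantity))      // ...instead of in the caller
            return;
        // ... rest of the original routine ...
    }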

Andrew Cowenhoven
Good answer! Also consider picking up a copy of Refactoring: http://www.amazon.com/Refactoring-Improving-Design-Existing-Code/dp/0201485672. See also http://www.refactoring.com/
TrueWill
I read both Refactoring and Code Complete and have them right here by my side along with The Pragmatic Programmer. I was just curious as to how the community tackles this problem.
JohannesH
@Andrew: Hey, no COBOL, that's cheating... ;)
JohannesH
I remember a talk by some guy promoting a coverage tool. He ran it on the JDK, and there were methods with either a cyclomatic complexity or an NPath complexity (I forget which) of 56 decimal digits. IOW: in order to even remotely test such a method, you would need a number of unit tests approaching the number of particles in the known universe, estimated to be somewhere between 10^70 and 10^90.
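To put a number like that in perspective, here is a back-of-the-envelope sketch (my own illustration, not from the talk; C++ used only for the example):

    // NPath complexity multiplies across sequential, independent decisions,
    // so each extra 'if' doubles the number of distinct execution paths.
    void process(bool a, bool b, bool c) {
        if (a) { /* ... */ }  // 2 paths so far
        if (b) { /* ... */ }  // 2 * 2 = 4 paths
        if (c) { /* ... */ }  // 2 * 2 * 2 = 8 paths
    }
    // Roughly 186 such decisions in one method already give 2^186,
    // which is about 10^56 -- a 56-digit number of paths to cover.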
Jörg W Mittag
+2  A: 

This is for C/C++ code currently shipping in a product:

these are the highest CC values I could reliably identify (i.e. where I don't suspect the tool of erroneously adding complexity from unrelated instances of main(...)):

  • an image processing function: 184
  • a database item loader with verification: 159

There is also a test subroutine with CC = 339 but that is not strictly part of the shipping product. Makes me wonder though how one could actually verify the test case(s) implemented in there...

and yes, the function names have been suppressed to protect the guilty :)

How to change it:

There is already an effort in place to remedy this. The problems mostly stem from two root causes:

  1. spaghetti code (no encapsulation, lots of copy-paste)
  2. code provided to the product group by some scientists with no real software construction/engineering/carpentry training.

The main approach is to identify cohesive pieces of the spaghetti (pull on a thread :) ) and break the looooong functions up into shorter ones. Often there are mappings or transformations that can be extracted into a function or a helper class/object. Switching to the STL instead of hand-built containers and iterators can cut a lot of code too. Using std::string instead of C-strings helps a lot.
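As a rough sketch of those last two points (hypothetical function and names, not from the actual codebase): replacing a hand-rolled C-string/array scan with std::vector, std::string and an STL algorithm removes the manual index and memory bookkeeping that inflates both line count and complexity.

    #include <algorithm>
    #include <string>
    #include <vector>

    // Hypothetical before/after: find a record whose name matches a search term.

    // Before: hand-built container, C-strings, manual loop and index handling.
    // bool findRecord(const char** names, int count, const char* term, int* indexOut);

    // After: std::vector, std::string and std::find_if do the bookkeeping.
    bool findRecord(const std::vector<std::string>& names,
                    const std::string& term,
                    std::size_t& indexOut) {
        auto it = std::find_if(names.begin(), names.end(),
                               [&](const std::string& n) { return n == term; });
        if (it == names.end())
            return false;
        indexOut = static_cast<std::size_t>(it - names.begin());
        return true;
    }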

LaszloG