views:

206

answers:

4

I was looking at some code length metrics other than Lines of Code. Something that Source Monitor reports is statements. This seemed like a valuable thing to know, but the way Source Monitor counted some things seemed unintuitive. For example, a for statement is one statement, even though it contains a variable definition, a condition, and an increment statement. And if a method call is nested in an argument list to another method, the whole thing is considered one statement.

Is there a standard way that statements are counted and are their rules governing such a thing?

+6  A: 

The first rule of metrics is "be careful what you measure". You ask for a count of statements, that's what you're going to get. As you note, that figure is perhaps not actually relevant.

If you're interested in other measures, like how "complex" code is, consider looking into other code metrics, like cyclometric complexity.

http://en.wikipedia.org/wiki/Cyclomatic_complexity

UPDATE: Re: your comment

I agree that "doing too much" is an interesting metric. My rule of thumb is that one statement should have one side effect (usually a "local" side effect like mutating a local variable, but sometimes a visible side effect, like writing to a file) and therefore "number of statements" should be roughly correlated with how much the method is "doing" in terms of its number of side effects.

In practice, of course no one's code, my own included, actually meets that bar all the time. You might consider a metric for "how much the method is doing" to count not just statements but also, say, method calls.

To actually answer your question: I'm not aware of any industry standard that regulates what "number of statements" is. The C# specification certainly defines what a "statement" is lexically, but then of course you have to do some interpretation to do a count. For example:

  void M()
  {
    try
    {
      if (blah)
      {
        Frob();
        Blob();
      }
    }
    catch(Exception ex)
    { /* eat it */ }
    finally
    {
      Grob();
    }
  }

How many statements are there in M? Well, the body of M consists of one statement, a try-catch-finally. So is the answer one? The body of the try contains one statement, an "if" statement. The consequence of the "if" contains one statement -- remember, a block is a statement. The block contains two statements. The finally contains one statement. The catch block contains no statements -- a catch block is not a statement, lexically -- but it certainly is highly relevant to the operation of the method!

So how many statements is that altogether? One could make a reasonable case for any number from one to six, depending on whether you count blocks as "real" statements, whether you consider child statements as in addition to their parent statement or not, and so on. There is no standards body which regulates the answer to this question that I'm aware of.

Eric Lippert
I do use Cyclomatic complexity. The reason that I was looking for a code length counter is that complexity counts won't identify methods that are doing way too much but only have 2 or 3 paths.
epotter
+4  A: 

The closest you might get to a formal definition of "what is a statement" would be the C# specification itself. Good luck working out whether a particular tool's measurement agrees with your reading of the specification.

Given that metrics are best used as a guide to better/worse code, and not a strict formula, does the exact definition used by the tool make much difference?

If I have three methods, with "statement lengths" of 2500, 1500 and 150, I know which method I'll be examining first; that another tool might report 2480, 1620 and 174 isn't too important.

One of the best tools I've seen for measuring metrics is NDepend, though again I'm not 100% sure what definitions it is using. According to the website, NDepend has 82 separate metrics, including Number of instructions and Cyclomatic Complexity.

Bevan
+1  A: 

One simple metric is to just count the punctuation marks (;, ,, .) between tokens (so as to avoid those in strings, comments, or numbers). Thus, for (x = 0, y = 1; x < foo.Count; x++, y++) bar[y] = foo[x]; would count as 6.

Gabe
+2  A: 

The C# Metrics Tool defines the things being counted ("statements", "operands"), etc. by using a precise C# BNF language definition. (In fact, it precisely parses the code according a full C# grammar and then computes structural metrics by walking over the parse tree; SLOC count it gets by countline lines as you'd expect).

You might still argue that such a definition it unintuitive (grammars rarely are), but they are precise. I agree with other posters here, however, that the precise measure isn't as important as the relative value that one block of code has with respect to another. A value of "173.92" complexity just isn't very helpful by itself; compard to another complexity value of "81.02", we can say there's a good indication that the first one is more complex than the second, and that's enough to provide a focus of attention.

I think that metrics are also useful in trending; if last week, this code was "81.02" complex, ad this week it is "173.92", I should wonder why is all that happening inthis part of the code?

You might also consider a ratio of a structural metric (e.g., Cyclomatic) to SLOC as an indication of "doing too much", or at least an indication of writing code that is way too dense to understand

Ira Baxter