Most established languages have solid test coverage tools available for them, but the depth of functionality differs significantly from one to another.

Also, all the various VMs and compilers have such heterogeneous structure that writing a code coverage tool must be a very different job in C than in Lisp, for example.

  • Python has sys.settrace to tell you directly which lines are executing
  • Clover (for Java) uses its own compiler and adds debug metadata (last time I used it, anyway)
  • Emma (for Java) has a ClassLoader which re-writes bytecode on the fly
  • COVER (for Lisp) has an annotation pass to instrument the code
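As a rough illustration of the first bullet, here is a minimal sketch of line-level (C0) tracking built on `sys.settrace`. The helper `trace_lines` and the sample function `classify` are hypothetical names made up for this example; real tools built on the same hook do far more bookkeeping.

```python
import sys


def trace_lines(func, *args, **kwargs):
    """Run func and return (result, executed line offsets inside func).

    Offsets are relative to the `def` line of func (hypothetical helper).
    """
    code = func.__code__
    executed = set()

    def local_tracer(frame, event, arg):
        # "line" fires just before each new source line is executed
        if event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return local_tracer

    def global_tracer(frame, event, arg):
        # Called for every new frame; only trace the target function
        if event == "call" and frame.f_code is code:
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the earlier tracer
    return result, executed


def classify(n):
    if n < 0:
        return "negative"      # offset 2 from the def line
    return "non-negative"      # offset 3 from the def line


result, lines = trace_lines(classify, 5)
print(result, sorted(lines))
```

Calling `trace_lines(classify, 5)` should record the `if` line and the final `return`, but not the `return "negative"` line, which is exactly the gap a C0 report would highlight.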

I'm interested in the implementation of code coverage for different languages:

  1. What are the main approaches to get to C0 coverage, where you can track which lines of code have been executed? I mention native VM introspection and static and dynamic code instrumentation above - are there other methods?

  2. Getting to more enlightened coverage data, like C1 or C2, seems like a language-agnostic task compared with C0. It smacks of big Karnaugh map manipulation to me; are there best practices on how to actually do it? Do more modern logic techniques like fuzziness play a role?

  3. A much overlooked aspect of test coverage is displaying the results back to the programmer, which gets increasingly hard with C1 and C2 data. Frankly, although they get the job done for C0, I'm underwhelmed by most test coverage interfaces; what novel and intuitive interfaces have you seen for coverage data?

A: 

In .NET, the preferred way is to use the .NET Profiling API, which basically offers a bunch of join points (hooks) in the CLR itself.

Romain Verdier
+1  A: 

Essentially all code coverage tools instrument the code in order to check which parts of the code were executed.

As defined in the link you provided, C0 and C1 are pretty similar from the point of view of the person writing the instrumentation. The only difference is where you place the code. I'll go further to speculate that C1 is even easier than C0, because the instrumentation happens on the, let's say, abstract syntax level, where line ends do not count very much.

Another reason I'm saying C1 is easier is because it deals with syntactic entities as opposed to lexical entities: how would you instrument:

if
c > 1 && c
< 10
then
blabla
end

Well, just a thought.
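To make the syntax-level argument concrete, here is a minimal sketch of branch (C1) instrumentation done on the abstract syntax tree, using Python's standard `ast` module. The class `BranchInstrumenter`, the probe function `_hit`, and the sample `in_range` are hypothetical names invented for this example. Note that the condition spans two lines, which does not matter at this level.

```python
import ast

SOURCE = """
def in_range(c):
    if (c > 1 and c
            < 10):
        return True
    else:
        return False
"""


class BranchInstrumenter(ast.NodeTransformer):
    """Insert a probe call at the start of each branch of every `if`."""

    def __init__(self):
        self.probes = []  # index is the probe id, value is a description

    def _probe(self, label, lineno):
        probe_id = len(self.probes)
        self.probes.append(f"{label} of if at line {lineno}")
        call = ast.Call(func=ast.Name(id="_hit", ctx=ast.Load()),
                        args=[ast.Constant(value=probe_id)], keywords=[])
        return ast.Expr(value=call)

    def visit_If(self, node):
        self.generic_visit(node)  # instrument nested ifs first
        node.body.insert(0, self._probe("then-branch", node.lineno))
        # A missing else is still a branch, so it gets a probe as well
        node.orelse.insert(0, self._probe("else-branch", node.lineno))
        return node


def run_with_branch_coverage(source, func_name, inputs):
    """Instrument source, call func_name on each input, report probe hits."""
    instrumenter = BranchInstrumenter()
    tree = ast.fix_missing_locations(instrumenter.visit(ast.parse(source)))
    hits = set()
    namespace = {"_hit": hits.add}  # each probe records its id in `hits`
    exec(compile(tree, "<instrumented>", "exec"), namespace)
    for value in inputs:
        namespace[func_name](value)
    return [(desc, i in hits) for i, desc in enumerate(instrumenter.probes)]


for description, was_hit in run_with_branch_coverage(SOURCE, "in_range", [5]):
    print(description, "hit" if was_hit else "MISSED")
```

With the single input `5`, only the then-branch probe fires; adding an input such as `20` covers the else-branch too. The probes are attached to syntactic branches, so how the condition is split across lines never comes up.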

As for C2, I have never seen it done in practice. The reason is that you can get an exponential blowup:

if c1 then * else * end
if c2 then * else * end
...
if cn then * else * end

For n such conditionals in sequence, you would need 2^n tests to cover every path. Also, what do you do for loops? Typically, you abstract them away as simple if statements (i.e. for each loop you test that its body was executed 0 times in one test and at least once in another test).
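The blowup is easy to see by enumerating outcomes. The sketch below assumes the n conditions are independent of each other (an assumption for illustration; real code often has correlated conditions), and the function names are hypothetical.

```python
from itertools import product


def all_paths(n):
    """Every path through n sequential if/else statements,
    written as a tuple of branch outcomes (True = then, False = else)."""
    return list(product((True, False), repeat=n))


def branches_covered(tests):
    """Set of (condition index, outcome) pairs exercised by the tests."""
    return {(i, outcome) for test in tests for i, outcome in enumerate(test)}


n = 10
paths = all_paths(n)
two_tests = [(True,) * n, (False,) * n]

print("paths to cover:", len(paths))
print("branches covered by two tests:", len(branches_covered(two_tests)),
      "of", 2 * n)
```

Under that independence assumption, two tests (all conditions true, all conditions false) already exercise every branch, while covering every path needs one test per element of `paths`, which doubles with each added conditional.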

I believe sampling the PC is a particularly terrible way to do code coverage because you may miss some statements that executed too fast :D The same goes for fuzzy logic, which is used to reason about approximations; typically you want your code coverage to be deterministic.

Karnaugh maps are used for minimizing boolean functions, and I do not see any useful link with code coverage tools.

Also, your question is not very clear at times: do you want techniques to achieve better code coverage, or is it just the implementation of code coverage tools that interests you?

Thanks; interesting point about C0/C1. The reason I mention Karnaugh maps is that if you're attempting C2, they seemed the most obvious way of expressing all the possible paths through conditionals - obviously the minterm generation aspect of K-maps is irrelevant here.
Alabaster Codify
A: 

One method that works with virtually every language is to insert instrumentation using a program transformation system.

A technical paper found here: http://www.semdesigns.com/Company/Publications/TestCoverage.pdf explains how this can be done in general.

My company, Semantic Designs, offers a large set of test coverage tools that provide what is called C1 coverage above (i.e., "branch coverage"), so yes, it is commonly done. The tools cover different languages (C, C++, C#, Java, COBOL, PHP, all in multiple dialects). See www.semdesigns.com/Products/TestCoverage/index.html

Ira Baxter
A: 

Just a small question: what exactly do you mean by instrumenting code here? Sorry for the newbie question.

Jigar Shah