Currently, one of our production systems is handled by over 3000 programs written between 1986 and now. The code base is written in a non-standard language that unfortunately lacks modern testing tools.

In a bid to improve our code quality moving forward, I have been working to incorporate processes and build tools that will improve development and testing. I have just completed a line coverage tool to help us identify dead code and untested code during development.

Now, I would like to start work on adding Path Coverage to the tool.

How should I go about this?

Given that:

1) The line coverage tool acts as a pre-processor that injects code.
2) I can already gather the statistics recorded by that injected code.

What data should I be recording as the program executes, and how do I interpret it?

How can I represent the results via HTML?

I have already read the question How to get started “writing” a code coverage tool?, which was about Java. Neither it nor the paper it mentions, "Branch Coverage for Arbitrary Languages Made Easy", helped.

Thanks in advance for any guidance given!

+1  A: 

Path coverage measurement is a tricky subject. First you have to define what you mean by a path in the first place. Is a loop executed three times a different path than a loop executed four times? If so, you have an infinite number of paths. If not, there are test cases missed even if all paths are covered.
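To make the two definitions concrete, here is a minimal sketch. It assumes the injected code records a trace of block labels; the labels, the `path_signature` helper, and the `max_visits` cutoff are all hypothetical choices for illustration. Treating every iteration count as distinct gives an unbounded set of paths, while capping the visits per block makes the set finite at the cost of merging runs that differ only in iteration count.

```python
from collections import Counter

def path_signature(trace, max_visits=2):
    """Collapse a trace by ignoring visits to a block beyond max_visits.

    max_visits is an arbitrary cutoff chosen for this sketch.
    """
    visits = Counter()
    signature = []
    for block in trace:
        visits[block] += 1
        if visits[block] <= max_visits:
            signature.append(block)
    return tuple(signature)

# Hypothetical traces of one loop: "head" is the loop test, "body" the loop body.
one_iteration = ["entry", "head", "body", "head", "exit"]
three_iterations = ["entry"] + ["head", "body"] * 3 + ["head", "exit"]
four_iterations = ["entry"] + ["head", "body"] * 4 + ["head", "exit"]

# As raw traces, three and four iterations are different paths.
raw_differ = tuple(three_iterations) != tuple(four_iterations)
# Under the capped definition they collapse to the same path.
capped_same = path_signature(three_iterations) == path_signature(four_iterations)
# A single iteration still counts as its own path.
one_differs = path_signature(one_iteration) != path_signature(three_iterations)
```

With this cutoff, a test suite that runs the loop three times is indistinguishable from one that runs it four times, which is exactly the kind of missed case described above.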

It might be that the better next step is branch coverage: measuring whether every conditional has evaluated to both true and false. This can be accomplished by recording the sequence of line numbers executed and checking which line follows each conditional.
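A rough sketch of how recorded line sequences could be turned into branch coverage follows. It assumes the pre-processor can also emit a table mapping each conditional's line to the line reached when it is true and the line reached when it is false; that table, the line numbers, and the function names are hypothetical.

```python
def observed_transitions(trace):
    """Return the set of (from_line, to_line) pairs seen in one trace."""
    return set(zip(trace, trace[1:]))

def branch_report(branches, traces):
    """Report, per conditional, whether each outcome was ever taken.

    branches maps a conditional's line number to
    (line_reached_if_true, line_reached_if_false).
    """
    seen = set()
    for trace in traces:
        seen |= observed_transitions(trace)
    return {
        cond: {"true": (cond, t) in seen, "false": (cond, f) in seen}
        for cond, (t, f) in branches.items()
    }

# Hypothetical program: line 10 is an IF; true continues at 11, false jumps to 14.
branches = {10: (11, 14)}

partial = branch_report(branches, [[9, 10, 11, 12, 15]])
full = branch_report(branches, [[9, 10, 11, 12, 15], [9, 10, 14, 15]])
```

Here `partial` shows the false outcome of line 10 as never taken, and `full` shows both outcomes covered once a second run takes the other branch. For an HTML report, conditionals with an outcome still marked `False` could be highlighted.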

Ned Batchelder
+1  A: 

To do path coverage, you need to get at the program control flow somehow. An obvious method is to construct a real control flow graph, and then traverse segments of it to pick out "path fragments" (e.g., basis paths) to be used in your path coverage analysis. (You can attempt to do this by string hacking the source code, but you'll likely fail; parsing and flow analysis are too complex for that.)
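As an illustration only, once a control flow graph exists, the bookkeeping might look like the sketch below. The graph, node names, and helper functions are hypothetical; it enumerates loop-free entry-to-exit paths and computes the cyclomatic number E - N + 2, which gives the size of a basis path set for a single-entry, single-exit graph.

```python
def acyclic_paths(cfg, entry, exit_node):
    """Enumerate entry-to-exit paths that never revisit a node."""
    paths = []
    stack = [(entry, [entry])]
    while stack:
        node, path = stack.pop()
        if node == exit_node:
            paths.append(tuple(path))
            continue
        for succ in cfg.get(node, []):
            if succ not in path:  # cut loops off instead of unrolling them
                stack.append((succ, path + [succ]))
    return sorted(paths)

def cyclomatic_complexity(cfg):
    """E - N + 2 for a connected single-entry, single-exit graph."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    edges = sum(len(succs) for succs in cfg.values())
    return edges - len(nodes) + 2

# Hypothetical graph: two IF statements in sequence (A and D are the tests).
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"],
       "D": ["E", "F"], "E": ["G"], "F": ["G"]}

all_paths = acyclic_paths(cfg, "A", "G")
basis_size = cyclomatic_complexity(cfg)

# Paths actually observed at run time (hypothetical recorded data).
executed = {("A", "B", "D", "E", "G"), ("A", "C", "D", "F", "G")}
uncovered = [p for p in all_paths if p not in executed]
```

For this graph there are four loop-free paths but a basis set needs only three, which is the saving that basis paths offer. The `uncovered` list is the kind of result an HTML report could display per program.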

See http://stackoverflow.com/questions/703570/whats-the-point-of-basis-path-coverage for a good stackoverflow discussion on basis paths.

To implement the required path coverage tool, you will likely need to parse the full legacy language. For 3000 programs and a strong requirement for testing, using an industrial-strength parser and infrastructure to do this would make sense.

Our DMS Software Reengineering Toolkit can be used to construct not only the parser, but also the control flow analysis and the instrumentation required to collect path coverage data. (The "Branch Coverage for Arbitrary Languages" paper made this point for the case where all you want is branch coverage data, but there's more to DMS than just parsing.) DMS also has support for constructing control (and data) flow graphs if you need them, as you apparently do in this case; see DMS constructed control flow graphs.

DMS has been used to build control and data flow analyzers for C, Java and COBOL, and has been used to build parsers for some 30+ languages. It can handle your legacy language if you are serious about this.

Ira Baxter