We have noticed that even though we have a lot of doctests in our Python code, when we trace test execution using the method described here:

traceit

we find that certain lines of code are never executed. We currently sift through the traceit logs to identify blocks of code that never run, and then try to come up with test cases that exercise those blocks. As you can imagine, this is very time-consuming, and I was wondering whether we are going about this the wrong way. Do you all have other advice or suggestions for dealing with this problem? I'm sure it must be common as software becomes sufficiently complex.
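
To make the setup concrete, here is a rough sketch of the kind of line tracing traceit does, built on the standard library's sys.settrace hook (the tracer and the toy function below are illustrative, not the actual traceit code):

    import sys

    executed = set()

    def tracer(frame, event, arg):
        # Record every (filename, line number) pair the interpreter runs.
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        return tracer

    def branchy(x):
        if x > 0:
            return "positive"
        return "non-positive"   # never reached by the call below

    sys.settrace(tracer)
    branchy(1)
    sys.settrace(None)

    print(sorted(executed))     # the "non-positive" line is missing

Comparing the recorded pairs against the full set of source lines is exactly the sifting step that takes us so long.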

Any advice appreciated.

+5  A: 

coverage.py is a very handy tool. Among other things, it provides branch coverage.
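
For instance, here is a minimal sketch of measuring doctest coverage through coverage.py's Python API (mymodule is a placeholder for one of your own modules; the API names are as in recent coverage.py releases, where older versions spell the class coverage.coverage):

    import coverage
    import doctest

    cov = coverage.Coverage(branch=True)   # branch coverage, not just line coverage
    cov.start()

    import mymodule                        # import under measurement so module-level code counts
    doctest.testmod(mymodule)

    cov.stop()
    cov.save()
    cov.report(show_missing=True)          # lists the line numbers that never ran

The "missing" column of that report replaces the manual log-sifting you describe.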

Hank Gay
+3  A: 

Do you have a mandate from management to be dogmatic about obtaining 100% code coverage with your test cases? If not, do you believe touching every line of code is the most effective way to find bugs? Assuming you don't have unlimited time and people, you should probably focus on reasonably testing all of your non-trivial code, with emphasis on the parts the developers know were tricky to write or error-prone.

Code coverage is useful, because you certainly can't call a piece of code tested until it has been executed; but executing a piece of code is not the same as testing it. I'm not against code coverage, it's just too easy to fall into using coverage as the metric for deciding when testing is complete. I think that would be a mistake.

Brad Barker
This is an excellent comment. In my situation, we have scientists, not programmers, writing the Python code. Even though the scientists are very smart, the code is poorly architected, so final integration and testing are a nightmare, and we have to work too hard to uncover serious problems during that phase. I'm trying to get each of them to write better test cases for the code they are responsible for, and I plan to use code coverage as a way of qualifying the testing they hand over for integration. I understand that not 100% of the code needs to be touched, but it would help.
reckoner