Most modern version control tools have a command to find the change that introduced a bug by binary search (bisecting) of the history. Such a command might be built-in, or it might be provided as an extension or plugin. Examples include git-bisect in Git, "hg bisect" in Mercurial (earlier available as the hbisect extension), and the bzr-bisect plugin for Bazaar.

The challenge is to do this in an automatic or semi-automatic way even in the presence of nonlinear history (branching points and merges). The goal, in general, is to find the "bad" revision in a minimal number of steps; in more detail, at each step, to find a commit to test which, if possible, divides the graph of commits still to be tested (a DAG of commits) in half. This problem is, I think, solved very well.
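To make the "divide in half" criterion concrete, here is a minimal Python sketch (illustrative only; the graph representation and function names are made up, and this is not any particular tool's actual implementation) of one way to pick the next commit to test in a DAG, by counting how many candidate commits each possible answer would eliminate:

    def ancestors(dag, commit):
        """All commits reachable from `commit` (inclusive) via parent links;
        `dag` maps commit -> list of parent commits."""
        seen, stack = set(), [commit]
        while stack:
            c = stack.pop()
            if c not in seen:
                seen.add(c)
                stack.extend(dag.get(c, []))
        return seen

    def best_commit_to_test(dag, candidates):
        """From the set `candidates` (commits that might be the first bad one),
        pick the commit whose answer eliminates as close to half of the
        candidates as possible."""
        total = len(candidates)
        def eliminated_in_worst_case(c):
            # a "good" answer eliminates c and its candidate ancestors,
            # a "bad" answer eliminates all the other candidates
            reachable = len(ancestors(dag, c) & candidates)
            return min(reachable, total - reachable)
        return max(candidates, key=eliminated_in_worst_case)

On a linear history this degenerates to picking the middle revision, i.e. the familiar binary search.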

But there is a problem with untestable commits, e.g. if for some revision the code doesn't even compile, or, if it compiles, it doesn't start/run (or you run into a bug unrelated to the one you are searching for). This means that instead of simply marking a commit as "good" or "bad", you now have three possible states:

  • good - the bug is not present
  • bad - buggy behavior
  • unknown (untestable) - it is not known whether the bug is present

Some version control systems (SCMs) allow you to "skip" such commits, usually moving to the parent revision as the one to test next.
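As a baseline, here is a naive Python sketch of such three-state bisection on a linear history, where a skipped midpoint makes the search walk outwards to the nearest testable neighbour (much like issuing "skip" repeatedly); test() is a hypothetical callback returning one of the three states:

    GOOD, BAD, SKIP = "good", "bad", "skip"

    def bisect_linear_with_skip(revisions, test):
        """revisions[0] is known good, revisions[-1] is known bad, and
        test(rev) returns GOOD, BAD, or SKIP.  Returns (good, bad): the
        change happened in revisions[good+1 .. bad]; if bad - good > 1,
        everything in between turned out to be untestable."""
        lo, hi = 0, len(revisions) - 1
        known_skips = set()              # untestable revisions seen so far
        while hi - lo > 1:
            mid = (lo + hi) // 2
            chosen = None
            # try mid first, then its neighbours, until something is testable
            for offset in range(hi - lo):
                for candidate in (mid - offset, mid + offset):
                    if lo < candidate < hi and candidate not in known_skips:
                        result = test(revisions[candidate])
                        if result == SKIP:
                            known_skips.add(candidate)
                        else:
                            chosen = candidate
                            break
                if chosen is not None:
                    break
            if chosen is None:
                break                    # everything between lo and hi is untestable
            if result == GOOD:
                lo = chosen
            else:
                hi = chosen
        return lo, hi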


Questions are:

  • If you have dealt with such a situation, meaning you used bisection and stumbled upon non-testable revisions, what is, in your experience, the distribution of such non-testable commits? Do they occur in isolation (a single untestable commit), or do they appear in ranges (revisions a..b are untestable)? Did you find yourself in a situation where you had to skip commit after commit?

  • Is there some mathematical model (like there is for simple bisecting of a list / linear history, and even for bisecting an arbitrary DAG of revisions) or algorithm (perhaps a heuristic) which allows you to optimize the skipping of untestable commits? The goal is again to minimize (on average) the number of versions to test in the presence of untestable commits (or of an unrelated bug).

  • Do you use a version control system, or some add-on / extension / plugin for a revision control system, or some third-party tool, which implements such an algorithm beyond simply allowing you to "skip" untestable commits by going to a neighbouring revision? What is this VCS or tool? What algorithm does it use (if you know it)?

Hopefully this will lead to even easier (semi-)automated finding of bugs...


Added 06-06-2009:
When using advanced features of Git, there is one situation where you can have a whole branch of untestable (or at least hard to test) commits, namely when you use a "subtree" merge to join the histories of two separate projects (e.g. the full Linux kernel with some driver developed separately and joined using a "subtree" merge). This is something to consider when coming up with an algorithm to deal with untestable commits: with nonlinear history there can be a whole branch of untestable commits, and the algorithm has to take the topology into account (at least somewhat).

A: 

You could redefine the individual elements of your bisection/binary search algorithm to be ranges of revisions with adjacent "unknown" states instead of single revisions.

In other words, if during your binary search you find an unknown state, you start doing a sub-search forwards and backwards to find the boundaries of the unknown range, so that you get definite answers just outside it. This would probably be a linear search in both directions, so it would be a bit slower; you would have to assume that most revisions are not untestable.

This will then ultimately output a range of revisions where the bug appeared, for example somewhere between revisions (58, 63). You would then have to search this range manually.
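A rough Python sketch of this idea on a linear history, with a hypothetical test() callback returning "good", "bad", or "unknown":

    GOOD, BAD, UNKNOWN = "good", "bad", "unknown"

    def bisect_over_ranges(revisions, test):
        """revisions[0] is known good, revisions[-1] is known bad.  A block of
        adjacent UNKNOWN revisions is treated as a single element: when the
        probed revision is untestable, scan linearly down and up to the edges
        of the untestable block and use the testable revisions just outside
        it.  Returns (good, bad): the change happened in
        revisions[good+1 .. bad]; if bad - good > 1, the revisions in between
        are untestable and have to be examined by hand."""
        lo, hi = 0, len(revisions) - 1
        while hi - lo > 1:
            mid = (lo + hi) // 2
            result = test(revisions[mid])
            if result == GOOD:
                lo = mid
                continue
            if result == BAD:
                hi = mid
                continue
            # UNKNOWN: scan downwards for the nearest testable revision
            below, below_result = mid - 1, UNKNOWN
            while below > lo and (below_result := test(revisions[below])) == UNKNOWN:
                below -= 1
            if below_result == BAD:
                hi = below    # the bug already appears below the untestable block
                continue
            # the lower edge behaves good (tested GOOD, or it is the known-good lo)
            above, above_result = mid + 1, UNKNOWN
            while above < hi and (above_result := test(revisions[above])) == UNKNOWN:
                above += 1
            if above_result == GOOD:
                lo = above    # the bug appears only after the untestable block
                continue
            # good just below the block, bad just above it: the change is inside it
            return below, above
        return lo, hi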

Chris Harris
+1  A: 

What has already been checked in, obviously, cannot be helped. The large code bases I've worked on required all check-ins to actually build. This was done by having developers submit their changes to a check-in server that maintained a queue of changes waiting to go in. Each change was built for all target platforms in the order it was submitted. If the build failed, the check-in was rejected. If it succeeded, a suite of automated regression / unit tests was run. If any test failed, the check-in was rejected. If they all passed, the check-in was committed to the repository.

If you have such a system in place, it dramatically reduces the number of unbuildable / untestable revisions. Bad builds are limited to depot administrators doing wacky things outside the check-in server.
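As a toy illustration of that gating, here is a Python sketch of such a check-in queue; build(), run_tests(), commit_to_repo() and reject() are placeholders for whatever infrastructure is actually in place:

    from collections import deque

    def gated_checkin_loop(pending, target_platforms, build, run_tests,
                           commit_to_repo, reject):
        """Toy model of a check-in server that gates submissions: changes are
        processed in submission order and reach the repository only if they
        build on every target platform and pass the regression/unit test
        suite, which keeps unbuildable or untestable revisions out of the
        recorded history."""
        while pending:
            change = pending.popleft()
            if not all(build(change, platform) for platform in target_platforms):
                reject(change, reason="build failure")
            elif not run_tests(change):
                reject(change, reason="regression/unit test failure")
            else:
                commit_to_repo(change)

    # example wiring (all callbacks hypothetical):
    # gated_checkin_loop(deque(submitted_changes), ["linux", "windows"],
    #                    build, run_tests, commit_to_repo, reject)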

In environments where such an option isn't present, I have no hard statistical analysis, but I've found anecdotally that unbuildable revisions occur in pockets. One check-in will mess up a ton of stuff, and then there is a series of small check-ins attempting to correct the mess. Then things are generally fine for a while.

jeffamaphone
So your advice is to _avoid_ introducing untestable commits... good advice, but shit happens (see also comments about subtree merge).
Jakub Narębski
It would be nice to know the distribution of the lengths of such pockets of bad commits.
Jakub Narębski
A: 

Just an idea: I know one of the other issues you're thinking about is testing in the presence of noise. One way to think of skips would be to treat them as randomly responding good/bad, and use a bisection algorithm that is robust to such errors, e.g. http://www.disp.uniroma2.it/users/grandoni/FGItcs.pdf "Optimal resilient sorting and searching in the presence of memory faults", I. Finocchi, F. Grandoni, and G. F. Italiano.

The effect of applying that algorithm to git-bisect would be to bisect away from skip points, and re-run the search when it is discovered that the wrong branch has been followed. Unlike in the paper above, you know which points were unreliable, so you could just backtrack.
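A rough Python sketch of that backtracking idea on a linear history (not the paper's algorithm itself): a skipped midpoint is treated as a branch point, one answer is assumed first, and the search falls back to the other answer when the final boundary turns out not to rest on real test results. The test() callback and all names are made up for illustration:

    GOOD, BAD, SKIP = "good", "bad", "skip"

    def find_first_bad_with_backtracking(revisions, test):
        """revisions[0] is known good, revisions[-1] is known bad, and
        test(rev) may return SKIP.  Returns the index of the first bad
        revision, or None if it cannot be pinned down because the boundary
        falls on an untestable revision."""
        cache = {}                        # real test results only

        def real_test(i):
            if i not in cache:
                cache[i] = test(revisions[i])
            return cache[i]

        def search(lo, hi, assumed):
            # invariant: lo behaves good and hi behaves bad (possibly assumed)
            if hi - lo == 1:
                hi_ok = hi == len(revisions) - 1 or real_test(hi) == BAD
                lo_ok = lo == 0 or real_test(lo) == GOOD
                # None means the wrong branch was followed: backtrack
                return hi if (hi_ok and lo_ok) else None
            mid = (lo + hi) // 2
            answer = assumed.get(mid) or real_test(mid)
            if answer == GOOD:
                return search(mid, hi, assumed)
            if answer == BAD:
                return search(lo, mid, assumed)
            # untestable: try both possibilities, reusing cached real results
            return (search(mid, hi, {**assumed, mid: GOOD})
                    or search(lo, mid, {**assumed, mid: BAD}))

        return search(0, len(revisions) - 1, {})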

This looks to me like trying to solve the more difficult problem of bisecting in the presence of intermittent failures, and might be overkill. But thanks for the pointer anyway.
Jakub Narębski
I've only scanned that paper, but as far as I can tell it is restricted to the case of a total order, not a partial order, and so would only be useful on a linear history, not a DAG. I have a bisect implementation which works in the presence of noise (http://github.com/Ealdwulf/bbchop/tree/master). It was easy to extend it to the skip case by assuming a probability distribution of skips given the known skips, and weighting the expected information gain by the probability of not skipping. But I've only implemented a couple of fairly naive probability distributions, though it is easy to plug in more.
Ealdwulf