views:

177

answers:

4

"Programming Pearls" in the column 2 ("AHA! Algorithm") talks about how binary search aids in various processes like sorting, tree traversals. But it mentions that binary seach can be used in "program debugging". Can somebody please explain how this is done?

+3  A: 

If you don't know which line in a 100 line program is buggy, you'd try to run the first 50 lines and skip the rest. If the problem show up, you'd know this first segment contains the bug. You'd next try to split this and run the first 25 lines and see if the problem is there and so on until you have figured out a short enough piece to look at.

The idea behind binary-search is to identify/isolate a small region which is buggy. However, as with all methods, this is not applicable in every situation. E.g: a recursive function will be terribly unwieldy to such a tool. When there are far too many execution paths, segmenting your code to be run may become difficult.

dirkgently
oh so binary search here doesnt mean you are searching for elements but simply dividing the program and looking for a problem. thanks.
kunjaan
Exactly! You apply the algorithm manually.
dirkgently
A: 

Let's say you have a bug, but you don't know where it is. You could place break points randomly or single step through the code, verifying the data at each stop. A better strategy, however, would be to pick a spot in the middle of the code block that you're looking at. If the problem exists there, then pick a spot mid-way between the start and the current spot and try it again. If the problem doesn't exist, then pick a spot midway between the current spot and the end, and try again. Keep going this way until you've narrowed the amount of code down to a block big enough to single step through more efficiently than stopping/restarting. That's basically doing binary search on your code.

tvanfosson
+2  A: 

Binary search may help debugging in the following ways:

  1. Suppose control has to reach a certain point and you suspect it doesn't. Put print statements in the first and last code lines. Suppose you see the result of the first but not second statement. Put a print statement in the middle and try again. This way you use binary search over the space of lines of code to zero in on the bug.
  2. Suppose you use a version control system. Version 10 passed all tests. Version 70, about to be released, fails some tests. Check out version 40 and run the tests on it. If it works fine, try version 55. If version 40 fails, try version 25. This way you use binary search over program version space in order to zero in on the first version where a bug entered the program.
Yuval F
+3  A: 

Another possibility is that you have a bug, and you know it wasn't there in your February release, but it was in your April release (or rather, your April release candidate -- you would never actually ship a bug to your users, right?).

You can do a manual binary search through your revision-control history to narrow down when the bug was introduced. First check out the code halfway between the two releases, build it, and see if the bug is there. Keep partitioning until you find out when it was introduced. If you don't know where to start looking for the bug, this can be very effective, especially if you do fairly small commits.

This works very well with Subversion because it has repository-wide revision numbers. If your February release was rev 533 and your April release was rev 701, then you update to rev 617, test it, and go from there. (Actually, I usually round to 600 so I don't have to do so much math in my head.) Once I start to narrow it down, I start looking at the commit comments and making educated guesses ("I really don't think this commit would have broken it"), so I usually don't need to do all log2(n) checkouts.

I've never used Git, but they take this one step further with the built-in "bisect" command. You give it a starting point (when was it known to work?) and ending point (when did you notice it was broken?), and it will automatically get the code for the halfway point in the binary search. Then after you've built and tested, you tell it whether this rev passed or failed; then it gets the code for the next halfway point. You can even tell it to run a command for each rev and use the command's exit code to determine whether the rev is a pass or fail, at which point it can run on full automatic.

Joe White