I am currently working on a senior project in software engineering: implementing a defect prediction mechanism for software projects that use a version control system.

Therefore, I want to ask the community about their commit message conventions.

Which words in a commit message imply that a bug was fixed, so that I can infer that the files modified in that revision were in a buggy state?

+2  A: 

May I suggest that, rather than brainstorm a list, you set up a Bayesian filter that allows you to flag a bunch of commits as bug fixes or not, and train the system to recognize the characteristics that bug fixes have in common. You are sure to come up with some characteristics that none of us could have predicted.
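To make that concrete, here is a minimal sketch of such a filter, assuming you have hand-labeled a set of commit messages as bug fixes or not. The class and method names are made up for illustration, and the tokenizer is deliberately crude; a real project would likely use an existing library instead.

```python
import math
import re
from collections import Counter


def tokenize(message):
    """Lowercase the message and split it into alphabetic words."""
    return re.findall(r"[a-z']+", message.lower())


class CommitBayesFilter:
    """Hypothetical naive Bayes classifier for commit messages."""

    def __init__(self):
        # Word and message counts, kept separately per label
        # (True = bug fix, False = anything else).
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, message, is_bug_fix):
        self.doc_counts[is_bug_fix] += 1
        self.word_counts[is_bug_fix].update(tokenize(message))

    def _log_score(self, tokens, label):
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_words = sum(self.word_counts[label].values())
        total_docs = self.doc_counts[True] + self.doc_counts[False]
        # Log prior of the label, plus log likelihood of each known word
        # with add-one (Laplace) smoothing. Unknown words are skipped.
        score = math.log(self.doc_counts[label] / total_docs)
        for token in tokens:
            if token in vocab:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_bug_fix(self, message):
        if min(self.doc_counts.values()) == 0:
            raise ValueError("train on at least one message of each kind first")
        tokens = tokenize(message)
        return self._log_score(tokens, True) > self._log_score(tokens, False)
```

You would call `train(message, True)` or `train(message, False)` for each commit you have flagged by hand, then call `is_bug_fix(message)` on the rest. How well this works depends entirely on how many commits you label and how consistent your team's messages are.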

However, I'll propose a few categories of action words to look for:

One is words that indicate what you did, such as "corrected", "fixed", "tweaked". Another is words that indicate what was wrong, such as "problem", "bug", "issue". Another is words that indicate what your fix accomplished, such as "prevent", "ensure", "stop", "allow".
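As a rough starting point, those three categories could be checked with a simple keyword match. This is only a sketch: the function names and the rule that at least two categories must match are my own assumptions, and exact-word matching will miss variants such as "fixes" or "bugs" unless you add them to the lists.

```python
import re

# Word lists copied from the three categories described above.
CATEGORIES = {
    "action": {"corrected", "fixed", "tweaked"},
    "problem": {"problem", "bug", "issue"},
    "outcome": {"prevent", "ensure", "stop", "allow"},
}


def matched_categories(message):
    """Return the names of the categories that have a keyword in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return {name for name, keywords in CATEGORIES.items() if words & keywords}


def looks_like_bug_fix(message, min_categories=2):
    """Guess that a commit is a bug fix if enough categories match.

    The default threshold of 2 is an arbitrary assumption; tune it
    against commits you have labeled by hand.
    """
    return len(matched_categories(message)) >= min_categories
```

Requiring more than one category reduces false positives from messages like "allow users to export data", which contains an outcome word but describes a feature.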

"Corrected an issue with string sorting. This will prevent strings that start with spaces from incorrectly appearing at the top of the list."

JacobM
Thanks for the advice and clear example.
mkafkas
A: 

Most of my commit messages start with a reference to the issue in the bug tracker. For example: "Bug 2453 - Fixed memory leak". It can be a bit confusing, because I have often seen little distinction drawn between bugs and features, so you end up with commit messages that describe features but still contain the word "Bug".

Since a good issue tracker lets you differentiate between bugs and features, I suppose it would be possible to follow the issue number in the commit message back to the issue tracker.
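A sketch of that lookup, assuming messages follow the "Bug 2453 - ..." pattern from my example. The `issue_types` dictionary stands in for whatever your issue tracker actually exposes; it and the function names are hypothetical.

```python
import re

# Matches references such as "Bug 2453" or "bug #2453" anywhere in the message.
ISSUE_REF = re.compile(r"\bbug\s+#?(\d+)", re.IGNORECASE)


def issue_id(message):
    """Return the referenced issue number, or None if there is none."""
    match = ISSUE_REF.search(message)
    return int(match.group(1)) if match else None


def is_bug_fix_commit(message, issue_types):
    """Check the tracker's own classification instead of trusting the word "Bug".

    issue_types is a hypothetical mapping from issue number to a type
    string such as "bug" or "feature", exported from the tracker.
    """
    ref = issue_id(message)
    return ref is not None and issue_types.get(ref) == "bug"
```

This sidesteps the problem of feature commits that carry the word "Bug" in their message, because the decision comes from the tracker rather than from the wording.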

If just looking at the commit message, I suppose I would most often use "fixed blah" for bugs, and "implemented blah" for features.

small_duck
mkafkas
A: 

If I may complain about the premise of your question, you're asking for an AI-level solution to infer that someone is fixing a bug. Trac, for example, has a post-commit hook that automagically updates bug statuses. The key point is that it does so by requiring the dev to mention which ticket is being addressed. No AI necessary!

All changes in code should probably do one or more of:

  • create a feature
  • fix a bug

(There are other cases, like "enhance readability", which I'd say is fixing a bug!)

To this end, any file touched was probably in a buggy state, or it wouldn't have been altered.

Your spec of "implementing a defect prediction mechanism in software projects which use version control system" sounds very underspecified and imprecise.

Gregg Lind
We are using an AI mechanism for prediction, and that is out of scope for this question. Here, I just need to find the names of the files that are in a buggy state. Therefore, I am looking at the commit messages. After finding the buggy files, we use test and training files to run the prediction algorithms.
mkafkas
I'm curious about what file changes you think *aren't* part of bugfixes, which is my larger point.
Gregg Lind