I just read the git-blame manual page once more and noticed this part:

A particularly useful way is to see if an added file has lines created by copy-and-paste from existing files. Sometimes this indicates that the developer was being sloppy and did not refactor the code properly. You can first find the commit that introduced the file with:

git log --diff-filter=A --pretty=short -- foo

and then annotate the change between the commit and its parents, using commit^! notation:

git blame -C -C -f $commit^! -- foo

This sounds quite interesting, but I don't quite grok how it works, and why. I wonder whether it can be used in a git hook to detect copy & pasted code.

Can some git expert explain the effect of using the above git commands together, and whether it's possible to use something like that to make git show whether there's code duplication (maybe by using the 'similarity index' which git seems to compute when renaming files)?

+7  A: 

You can break the commands down individually.

$ git log --diff-filter=A --pretty=short -- foo

displays the log for the file "foo". The --diff-filter=A option restricts the output to commits in which the file was added ("A"), and --pretty=short shows each commit in a condensed format. (The -- is the standard way of saying "nothing that follows is an option"; everything after it is treated as a list of paths to which the log is limited.)

Then:

$ git blame -C -C -f $commit^! -- foo

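git blame annotates each line of a file with the commit that last changed it. A single -C also detects lines moved or copied from other files that were modified in the same commit; the double -C -C additionally looks for copies from other files in the commit that creates the file, which is exactly the situation here. The -f option shows the file name in which each line was found (which means if a line was copied from another file, you see the name of the file it was copied from). $commit^! is revision notation for $commit itself with all of its parents excluded, so blame only examines the change between $commit and its parents; lines that already existed in a parent are attributed to that parent.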
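To see what the ^! suffix selects on its own, here is a small sketch. It assumes a throwaway repository created with mktemp; the commit messages and the variable names (repo, head, selected) are made up for illustration.

```shell
#!/bin/sh
# Sketch: show that "<rev>^!" names <rev> alone, with its parents excluded.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Two empty commits are enough to give HEAD a parent.
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

head=$(git rev-parse HEAD)

# HEAD^! expands to "HEAD, but none of HEAD's parents",
# so rev-list lists exactly one commit: HEAD itself.
selected=$(git rev-list "HEAD^!")
printf '%s\n' "$selected"
```

When git blame is given such a range, it stops at the parents instead of walking further back through history.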

So basically, the first command (git log) finds the commit that introduced the file; the second (git blame) then shows, for that commit, which lines of the new file were copied from files that already existed, and which files they came from.
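If you want to try this out, the two commands can be combined roughly as follows. This is only a sketch under assumptions: it builds a throwaway repository, the files bar and foo and their contents are invented, and --pretty=format:%H is used instead of --pretty=short so that only the commit hash is captured. Note that copy detection ignores very small matches, so the pasted block is deliberately a few lines long.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with made-up files "bar" and "foo".
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

# Commit 1: an existing file with a few distinctive lines.
cat > bar <<'EOF'
int clamp_to_range(int value, int lower_bound, int upper_bound)
{
    if (value < lower_bound) return lower_bound;
    if (value > upper_bound) return upper_bound;
    return value;
}
EOF
git add bar
git commit -q -m "Add bar"

# Commit 2: a new file that pastes bar's lines and adds one line of its own.
{ cat bar; echo 'int unrelated_new_function(void) { return 42; }'; } > foo
git add foo
git commit -q -m "Add foo"

# Step 1: find the commit that added foo (%H prints only the hash).
commit=$(git log --diff-filter=A --pretty=format:%H -- foo)

# Step 2: blame only that commit against its parents, with copy detection.
blame_out=$(git blame -C -C -f "$commit^!" -- foo)
printf '%s\n' "$blame_out"
```

In the output, the pasted lines should carry the file name bar and the parent commit, while the line written for foo carries the file name foo and the commit found in step 1.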

mipadi
Thanks for the explanations! The double '-C -C' flag to git blame looks really interesting
Frerich Raabe