Is there a way to include git commit hashes inside a file every time I commit? I can only find out how to do this during archiving, not for every commit.

I'm doing scientific programming with git as revision control, so this kind of functionality would be very helpful for reproducibility reasons (i.e., have the git hash automatically included in all result files and figures).

+1  A: 

Including the commit hash inside files included in the commit would necessarily change the hash. In order to provide repository integrity through the SHA1 hash mechanism, Git doesn't (and cannot) support such a feature.

Greg Hewgill
Is there a workaround for this, then? The goal is just to be able to refer to the code used to generate computed results using the SHA1.
Tim Lin
@Tim Lin: One approach might be to build in the hash as part of your compile process (rather than actually checking it in to Git). See http://stackoverflow.com/questions/1704907/how-can-i-get-my-c-code-to-automatically-print-out-its-git-version-hash for a number of good tips.
Greg Hewgill
A: 

have the git hash automatically included in all result files and figures.

You can pass the hash as an input to the program somehow (e.g. as an environment variable).

This alone doesn't guarantee that you're passing the right hash though.

Maybe you can write a script that checks out a specific commit (by hash or ref) to a special (or temporary) directory, does an automated build, then runs the program and passes the commit hash as an input to the program.

This way you'll have more confidence that you're getting the right hash.

But still, someone can totally pass any bogus hash and create misleading figures.
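
The scripted approach described above could look roughly like the following. This is only a sketch: `run_at_commit` and the `GIT_COMMIT` variable are hypothetical names, and it assumes the program can run directly from an exported snapshot (add your own build step if it needs one). It uses `git archive` to export the commit into a temporary directory instead of touching the working tree.

```shell
# Hypothetical helper: run_at_commit <ref> <command...>
# Exports a clean snapshot of <ref> into a temporary directory and runs
# the command there with the full commit hash in $GIT_COMMIT.
run_at_commit() {
    ref=$1
    shift
    # Resolve the ref to a full commit hash; fail if it does not exist.
    hash=$(git rev-parse --verify "$ref^{commit}") || return 1
    workdir=$(mktemp -d) || return 1
    # Unpack exactly the files recorded in that commit.
    git archive "$hash" | tar -xf - -C "$workdir" || { rm -rf "$workdir"; return 1; }
    # Run the command inside the snapshot, passing the hash via the environment.
    ( cd "$workdir" && GIT_COMMIT=$hash "$@" )
    status=$?
    rm -rf "$workdir"
    return $status
}
```

For example, `run_at_commit HEAD python simulate.py` (with `simulate.py` standing in for your own program) would let the program read `GIT_COMMIT` from its environment and write it into result files.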

hasen j
+1  A: 

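Greg explained in his answer why this is impossible. The closest thing Git offers is the ident attribute, which embeds the blob hash rather than the commit hash, as the gitattributes documentation describes: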

ident

When the attribute ident is set for a path, git replaces $Id$ in the blob object with $Id:, followed by the 40-character hexadecimal blob object name, followed by a dollar sign $ upon checkout.
Any byte sequence that begins with $Id: and ends with $ in the worktree file is replaced with $Id$ upon check-in.

That means the usual workaround is to use some kind of build process to put the information you need in a separate, versioned file.
In your case, that would be a file listing all the other files and their SHA1 values.
Such a file might be generated at each commit, for instance by amending the commit that just took place.
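
One way to produce the file list mentioned above is `git ls-tree`, which prints the mode, type, SHA-1, and path of every file recorded in a commit. A minimal sketch, where `write_manifest` and `MANIFEST.txt` are hypothetical names:

```shell
# Hypothetical helper: write_manifest [ref] [output-file]
# Writes one line per tracked file: "<mode> blob <sha1><TAB><path>".
write_manifest() {
    git ls-tree -r "${1:-HEAD}" > "${2:-MANIFEST.txt}"
}
```

Whether to commit the manifest or keep it only next to the results is a separate choice; this sketch just writes the file.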


As an example of a separate file, Jefromi points out the VERSION file of Git itself, built by this script (GIT-VERSION-GEN):

elif test -d .git -o -f .git &&
         VN=$(git describe --match "v[0-9]*" --abbrev=4 HEAD 2>/dev/null) &&
         case "$VN" in
         *$LF*) (exit 1) ;;
         v[0-9]*)
                 git update-index -q --refresh
                 test -z "$(git diff-index --name-only HEAD --)" ||
                 VN="$VN-dirty" ;;
         esac
then
VonC
Git itself provides an example of this: if you build from its git repository, it'll include the abbreviated commit hash in the version number. Good proof it's (one of the) right way(s) to go. Here's the (tracked) script which generates the version number: http://git.kernel.org/?p=git/git.git;a=blob;f=GIT-VERSION-GEN;h=e45513dee938dde3a8428a833fb43023b04ca95b;hb=HEAD
Jefromi
+2  A: 

You can easily embed the SHA-1 of a file (to be more exact, the SHA-1 of the blob, i.e. of the file's contents) by using the $Id$ keyword and the ident gitattribute.
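
A small sketch of how that might look in practice. The function name `ident_demo` and the file `analysis.py` are made up for illustration, and the function is meant to be run inside an existing repository. Note that the keyword is expanded on checkout, not on commit, so the sketch forces a fresh checkout of the file.

```shell
# Hypothetical demo: enable ident expansion for one file and show the result.
ident_demo() {
    # Ask Git to expand $Id$ in this file.
    echo 'analysis.py ident' >> .gitattributes
    # The file contains the bare keyword when it is committed.
    printf '# $Id$\nprint("hello")\n' > analysis.py
    git add .gitattributes analysis.py
    git commit -q -m 'Track analysis.py with ident expansion'
    # Expansion happens on checkout, so check the file out again.
    rm analysis.py
    git checkout -- analysis.py
    # First line now reads "# $Id: <40-character blob hash> $".
    head -n 1 analysis.py
}
```

The hash shown is the blob hash of `analysis.py` (what `git rev-parse HEAD:analysis.py` prints), not the commit hash.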

If you want to embed the SHA-1 of the commit, there is no out-of-the-box solution, but you can use the clean and smudge commands of the filter gitattribute. Note that this would badly affect performance, because after each commit every file would have to be modified to reflect the new commit.


That said, as other answers to this question point out, you would do better to embed the version number in generated files at build time, as the Linux kernel and the Git project itself do.
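
A rough sketch of that build-time approach, loosely following the dirty-tree check in Git's own version script. The name `write_version_file` and the default output file `VERSION` are assumptions for illustration; the generated file is meant to stay untracked (for example, listed in `.gitignore`) and to be read by your program or build system.

```shell
# Hypothetical helper: write_version_file [output-file]
# Records the full commit hash of HEAD, with "-dirty" appended when
# tracked files have uncommitted changes.
write_version_file() {
    out=${1:-VERSION}
    hash=$(git rev-parse --verify HEAD) || return 1
    # Refresh cached stat information so the check below is accurate.
    git update-index -q --refresh || :
    # diff-index --quiet exits non-zero when tracked files differ from HEAD.
    if ! git diff-index --quiet HEAD --; then
        hash="$hash-dirty"
    fi
    printf '%s\n' "$hash" > "$out"
}
```

Results stamped with a `-dirty` hash cannot be reproduced from the commit alone, so you may want your run script to refuse to proceed in that case.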

Jakub Narębski
Good points, I missed the `ident` attribute. +1
VonC