Coming from CVS, we have a policy that commit messages should be tagged with a bug number (simple suffix "... [9999]"). A CVS script checks this during commits and rejects the commit if the message does not conform.

The git hook commit-msg does this on the developer side but we find it helpful to have automated systems check and remind us of this.

During a git push, commit-msg isn't run. Is there another hook during push that could check commit messages?

How do we verify commit messages during a git push?

+5  A: 

Using the update hook

You already know about hooks, so please read the documentation about them! The hook you probably want is update, which runs once per ref. (The pre-receive hook runs once for the entire push.) There are plenty of questions and answers about these hooks on SO already; depending on what you want to do, you can probably find guidance on writing the hook if you need it.

To emphasize that this really is possible, a quote from the docs:

This hook can be used to prevent forced update on certain refs by making sure that the object name is a commit object that is a descendant of the commit object named by the old object name. That is, to enforce a "fast-forward only" policy.

It could also be used to log the old..new status.

And the specifics:

The hook executes once for each ref to be updated, and takes three parameters:

  • the name of the ref being updated,
  • the old object name stored in the ref,
  • and the new object name to be stored in the ref.

So, for example, if you want to make sure that none of the commit subjects are longer than 80 characters, a very rudimentary implementation would be:

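#!/bin/bash
# update hook arguments: $1 = ref name, $2 = old object name, $3 = new object name
# Note: for a newly created ref, $2 is all zeros and this range is not valid;
# a real hook has to handle that case separately.
long_subject=$(git log --pretty=%s "$2..$3" | grep -E -m 1 '.{81}')
if [ -n "$long_subject" ]; then
    # Messages written by the hook are relayed to the user who is pushing.
    echo "error: commit subject over 80 characters:" >&2
    echo "    $long_subject" >&2
    exit 1
fi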

Of course, that's a toy example; in the general case, you'd use a log output containing the full commit message, split it up per-commit, and call your verification code on each individual commit message.
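As a rough sketch of that general case, the following update hook walks every commit in the pushed range and checks its full message. It assumes the policy from the question (a bug number in square brackets somewhere in the message); the function names has_bug_number and check_range are made up for illustration. Ref creation and deletion, where one object name is all zeros, are not handled here: the range is invalid in that case, so this sketch simply rejects the update.

```shell
#!/bin/bash
# Sketch of an update hook. Arguments: $1 = ref name, $2 = old object name, $3 = new object name.

# Hypothetical policy check: succeed if the message contains a bug number like "[9999]".
has_bug_number() {
    printf '%s\n' "$1" | grep -Eq '\[[0-9]+\]'
}

# Check the full message of every commit in the revision range $1.
check_range() {
    local commits sha status=0
    commits=$(git rev-list "$1") || {
        echo "error: cannot list commits in $1" >&2
        return 1
    }
    for sha in $commits; do
        # %B prints the raw commit message (subject and body) of a single commit.
        if ! has_bug_number "$(git log -1 --pretty=%B "$sha")"; then
            echo "error: no bug number in commit $sha" >&2
            status=1
        fi
    done
    return "$status"
}

# Run the check only when called with the three update-hook arguments.
if [ "$#" -eq 3 ]; then
    check_range "$2..$3" || exit 1
fi
```

Because the check runs per ref, a bad message on one branch only blocks that branch; other branches in the same push are judged on their own commits.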

Why you want the update hook

This has been discussed/clarified in the comments; here's a summary.

The update hook runs once per ref. A ref is a pointer to an object; in this case, we're talking about branches and tags, and generally just branches (people don't push tags often, since they're usually just for marking versions).

Now, if a user is pushing updates to two branches, master and experimental:

o - o - o (origin/master) - o - X - o - o (master)
 \
  o - o (origin/experimental) - o - o (experimental)

Suppose that X is the "bad" commit, i.e. the one which would fail the commit-msg hook. Clearly we don't want to accept the push to master. So, the update hook rejects that. But there's nothing wrong with the commits on experimental! The update hook accepts that one. Therefore, origin/master stays unchanged, but origin/experimental gets updated:

o - o - o (origin/master) - o - X - o - o (master)
 \
  o - o - o - o (origin/experimental, experimental)

The pre-receive hook runs only once, just before beginning to update refs (before the first time the update hook is run). If you used it, you'd have to cause the whole push to fail, thus saying that because there was a bad commit message on master, you somehow no longer trust that the commits on experimental are good even though their messages are fine!

Jefromi
I think the hook the OP is looking for is pre-receive, since s/he wants to reject the entire push depending on the commit message. However, AFAIK, neither pre-receive nor update receives the commit message as input. So using commit-msg will probably be the best solution.
Can Berk Güder
@Can: I'm pretty sure the OP wants update, not pre-receive. "The whole push" means the push for all branches. If the user attempts to push updates to three branches, and only one contains invalid commit messages, the other two should still be accepted!
Jefromi
@Can: And no, the commit message is not part of the input, but the old and new object (commit) names (SHA1s) are. Note that the update hook is executed just before the refs are updated (after the commit objects have been received). The hook can therefore use git log to inspect whatever it wants to about the commits between old and new, including their commit messages.
Jefromi
@Jefromi » I'm not sure I agree, but I think this part is subjective. IMO I'd treat it as a transaction: if any part of something you did is bad, stop the whole thing so you can correct the mistakes.
John Feminella
@John: That would be the most straightforward and desirable. The whole thing should fail if any one part is invalid.
Darvan Shovas
@John: Well, you can make your own judgment call. Here's my general thought, though. It's consistent with the general philosophy of branches in git to treat each one as a transaction. You do stop the push of that individual branch if it has one bad commit, even if it has 500 new commits on it. But two different branches are two different things - different topics, different features. If you work on two things and make a mistake on one, it shouldn't affect the other.
Jefromi
@shovas: But what does an invalid commit message X on branch A have to do with branch B? B can't contain X, or it'd fail the hook too. So B is a series of commits, all of which are perfectly fine. Why should pushing B be refused simply because the developer also did something wrong on A? If they pushed branches individually (git push A, git push B) A would fail and B would succeed.
Jefromi
@shovas: I do agree that push of "the whole" should fail if "one part" is invalid, but the proper definitions in this context are that "the whole" is the branch, and the one part is a single commit. Content on one branch *cannot* invalidate content on another branch.
Jefromi
@Jefromi: You make a good point. I guess my uncertainty is about the "once per ref" idea in the docs. Does that mean all commits on a branch are one ref?
Darvan Shovas
@shovas: A ref is a pointer to an object, generally a commit. Examples of refs are tags and branches. So when the docs say once per ref updated in a push, they mean once per branch/tag. If you've made 50 commits on a branch since you pushed, git uploads all 50, then updates the ref. The old position of the ref is just before the first commit pushed; the new is the last commit pushed. The hook runs just before updating the ref; you can examine all 50 commits. If the hook fails, the ref won't move at all.
Jefromi
A: 

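You need to write a script for your pre-receive hook.

In this script you receive the old and new revisions of each ref being pushed (one "old new ref" line per ref on standard input). You can check every commit between them and exit with a non-zero status if any of them is bad.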
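A rough shell sketch of such a script, assuming the policy from the question (a bug number like "[9999]" somewhere in each message). The function names is_zero, range_for and check_push are hypothetical. An all-zero object name marks a ref being created or deleted, so those two cases are treated separately.

```shell
#!/bin/bash
# Sketch of a pre-receive hook. Git writes one "<old> <new> <ref>" line per pushed ref to stdin.

# Succeed if the object name is all zeros, which marks a ref creation or deletion.
is_zero() {
    case "$1" in
        ''|*[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Print the rev-list arguments selecting the commits introduced by an update from $1 to $2.
range_for() {
    if is_zero "$2"; then
        return 0                   # ref deleted: nothing to check, print nothing
    elif is_zero "$1"; then
        echo "$2 --not --all"      # ref created: commits not reachable from existing refs
    else
        echo "$1..$2"
    fi
}

# Read the ref updates from stdin and fail if any new commit lacks a bug number.
check_push() {
    local old new ref range sha status=0
    while read -r old new ref; do
        range=$(range_for "$old" "$new")
        [ -n "$range" ] || continue
        # $range is left unquoted on purpose so it splits into separate arguments.
        for sha in $(git rev-list $range); do
            if ! git log -1 --pretty=%B "$sha" | grep -Eq '\[[0-9]+\]'; then
                echo "error: no bug number in commit $sha ($ref)" >&2
                status=1
            fi
        done
    done
    return "$status"
}

# Run only when installed under the name pre-receive, so the functions can be tried separately.
case "$0" in
    *pre-receive) check_push || exit 1 ;;
esac
```

Since pre-receive runs once for the whole push, a single failing commit on any ref causes every ref in that push to be rejected.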

shingara
I ended up using the pre-receive hook. Thanks!
Darvan Shovas
+1  A: 

You could do it with the following pre-receive hook. As the other answers have noted, this is a conservative, all-or-nothing approach. Note that it protects only the master branch and places no constraints on commit messages on topic branches.

#! /usr/bin/perl

use strict;
use warnings;

my $errors = 0;
while (<>) {
  chomp;
  next unless my($old,$new) =
    m[ ^ ([0-9a-f]+) \s+   # old SHA-1
         ([0-9a-f]+) \s+   # new SHA-1
         refs/heads/master # ref
       \s* $ ]x;

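  next if $new =~ /^0+$/;  # master is being deleted: no commits to check

  # An all-zero old SHA-1 means master is being created, so "$old..$new"
  # is not a valid range; list commits not reachable from any existing ref.
  my $range = $old =~ /^0+$/ ? "$new --not --all" : "$old..$new";
  chomp(my @commits = `git rev-list $range`);
  if ($?) {
    warn "git rev-list $range failed\n";
    ++$errors, next;
  }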

  foreach my $sha1 (@commits) {
    my $msg = `git cat-file commit $sha1`;
    if ($?) {
      warn "git cat-file commit $sha1 failed\n";
      ++$errors, next;
    }

    $msg =~ s/\A.+? ^$ \s+//smx;
    unless ($msg =~ /\[\d+\]/) {
      warn "No bug number in $sha1:\n\n" . $msg . "\n";
      ++$errors, next;
    }
  }
}

exit $errors == 0 ? 0 : 1;

It requires all commits in a push to have a bug number somewhere in their respective commit messages, not just the tip. For example:

$ git log --pretty=oneline origin/master..HEAD
354d783efd7b99ad8666db45d33e30930e4c8bb7 second [123]
aeb73d00456fc73f5e33129fb0dcb16718536489 no bug number

$ git push origin master
Counting objects: 6, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (5/5), 489 bytes, done.
Total 5 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (5/5), done.
No bug number in aeb73d00456fc73f5e33129fb0dcb16718536489:

no bug number

To file:///tmp/bare.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'file:///tmp/bare.git'

Say we fix the problem by squashing the two commits together and pushing the result:

$ git rebase -i origin/master
[...]

$ git log --pretty=oneline origin/master..HEAD
74980036dbac95c97f5c6bfd64a1faa4c01dd754 second [123]

$ git push origin master
Counting objects: 4, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 279 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
To file:///tmp/bare.git
   8388e88..7498003  master -> master
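
The Perl hook in this answer accepts a bug number anywhere in the message. If the suffix convention described in the question ("... [9999]") should be enforced more strictly, the check could be narrowed to the end of the subject line. A small shell sketch of that test; the function name subject_has_bug_suffix is hypothetical, and the exact pattern (a space, then digits in square brackets, at the very end of the first line) is an assumption about the policy.

```shell
#!/bin/bash
# Hypothetical stricter check: the subject (first line of the message)
# must end in a space followed by "[<digits>]".
subject_has_bug_suffix() {
    local subject
    subject=$(printf '%s\n' "$1" | head -n 1)
    printf '%s\n' "$subject" | grep -Eq ' \[[0-9]+\]$'
}

# Example: report whether a message given on the command line conforms.
if [ "$#" -eq 1 ]; then
    if subject_has_bug_suffix "$1"; then
        echo "ok"
    else
        echo "missing bug number suffix" >&2
        exit 1
    fi
fi
```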
Greg Bacon
Many thanks for the example scripts. I went the way of the pre-receive hook.
Darvan Shovas
You're welcome! I'm glad it helped.
Greg Bacon