I recently asked about keyword expansion in Git, and I'm willing to accept the design decision not to really support this idea in Git.

For better or worse, the project I'm working on at the moment requires SVN keyword expansion like this

svn propset svn:keywords "Id" expl3.dtx

to keep this string up-to-date:

$Id: expl3.dtx 803 2008-09-11 14:01:58Z will $

But I would quite like to use Git to do my version control. Unfortunately, git-svn doesn't support this, according to the docs:

"We ignore all SVN properties except svn:executable"

It doesn't seem too tricky, though, to emulate this keyword stuff with a couple of pre/post-commit hooks. Am I the first person to want this? Does anyone have some code to do this?

+1  A: 

You could set the ident attribute on your files, but that would produce strings like

$Id: deadbeefdeadbeefdeadbeefdeadbeefdeadbeef$

where deadbeef... is the sha1 of the blob corresponding to that file. If you really need that keyword expansion, and you need it in the git repo (as opposed to an exported archive), I think you're going to have to go with the ident gitattribute with a custom script that does the expansion for you. The problem with just using a hook is that the file in the working tree then wouldn't match the index, and git would think it's been modified.

Kevin Ballard
A: 

Thanks Eridius. I hadn't considered the problem with not matching the index. That would be pretty annoying.

The problem I'm facing is that the actual SVN Id attribute is parsed as a sort-of "version number" in the compiled documentation. So Git's default sha1 isn't really useful at all.

But I'll take a look at customising how Git's ident is expanded; I hadn't realised this could be done with a custom script, which might well do the trick.


Er, regarding

I think you're going to have to go with the ident gitattribute with a custom script that does the expansion for you

Is this actually possible? The gitattributes man page is not exactly enlightening on this topic.

Will Robertson
+12  A: 

What's going on here: Git is optimized to switch between branches as quickly as possible. In particular, git checkout is designed to not touch any files that are identical in both branches.

Unfortunately, RCS keyword substitution breaks this. For example, using $Date$ would require git checkout to touch every file in the tree when switching branches. For a repository the size of the Linux kernel, this would bring everything to a screeching halt.

In general, your best bet is to tag at least one version:

$ git tag v0.5.whatever

...and then call the following command from your Makefile:

$ git describe --tags
v0.5.15.1-6-g61cde1d

Here, git is telling me that I'm working on an anonymous version 6 commits past v0.5.15.1, with an SHA1 hash beginning with 61cde1d (the leading g just stands for "git"). If you stick the output of this command into a *.h file somewhere, you're in business, and will have no problem linking the released software back to the source code. This is the preferred way of doing things.
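As a rough sketch of the "stick the output into a *.h file" step, something like the following could be called from a Makefile rule. The file name version.h, the macro name VERSION_STRING, and the two helper functions are all made up for illustration; only git describe --tags comes from the answer above.

```shell
#!/bin/sh
# Sketch only: version.h and VERSION_STRING are illustrative names.

# Print a one-line C header defining the version string passed as $1.
make_version_header() {
    printf '#define VERSION_STRING "%s"\n' "$1"
}

# Ask git for the version; fall back to "unknown" when git is missing,
# we are outside a repository, or no tag exists yet.
current_version() {
    git describe --tags 2>/dev/null || echo unknown
}

# Regenerate the header in the current directory.
make_version_header "$(current_version)" > version.h
```

The fallback to "unknown" is a design choice so that a build from an exported tarball still compiles; you may prefer to fail the build instead.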

If you can't possibly avoid using RCS keywords, you may want to start with this explanation by Lars Hjemli. Basically, $Id$ is pretty easy, and if you're using git archive, you can also use $Format:...$ placeholders (enabled by the export-subst attribute).

But, if you absolutely cannot avoid RCS keywords, the following should get you started:

git config filter.rcs-keyword.clean 'perl -pe "s/\\\$Date[^\\\$]*\\\$/\\\$Date\\\$/"'
git config filter.rcs-keyword.smudge 'perl -pe "s/\\\$Date[^\\\$]*\\\$/\\\$Date: `date`\\\$/"'

echo '$Date$' > test.html
echo 'test.html filter=rcs-keyword' >> .gitattributes
git add test.html .gitattributes
git commit -m "Experimental RCS keyword support for git"

rm test.html
git checkout test.html
cat test.html

On my system, I get:

$Date: Tue Sep 16 10:15:02 EDT 2008$

If you have trouble getting the shell escapes in the smudge and clean commands to work, just write your own Perl scripts for expanding and removing RCS keywords, respectively, and use those scripts as your filter.

Note that you really don't want to do this for more files than absolutely necessary, or git will lose most of its speed.
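If the escaping gets out of hand, here is a minimal sketch of a standalone filter script. It uses sed in plain sh rather than the Perl one-liners above, and the script name rcs-keyword.sh and the function names are hypothetical. It assumes the date string contains no |, & or backslash characters, which holds for typical date output.

```shell
#!/bin/sh
# rcs-keyword.sh (hypothetical name). Usage: rcs-keyword.sh clean|smudge
# Reads the file on stdin and writes the filtered file to stdout.

# Collapse any expanded "$Date: ...$" back to the bare "$Date$" keyword.
rcs_clean() {
    sed 's/\$Date[^$]*\$/$Date$/g'
}

# Expand "$Date$" (or a stale expansion) using the stamp given as $1.
rcs_smudge() {
    sed 's|\$Date[^$]*\$|$Date: '"$1"'$|g'
}

# Dispatch on the first argument; do nothing if it is absent.
case "${1:-}" in
    clean)  rcs_clean ;;
    smudge) rcs_smudge "$(date)" ;;
esac
```

With the script saved somewhere on your PATH and made executable, the two git config lines above would become filter.rcs-keyword.clean 'rcs-keyword.sh clean' and filter.rcs-keyword.smudge 'rcs-keyword.sh smudge'.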

emk
+5  A: 

Unfortunately, RCS keyword substitution breaks this. For example, using $Date$ would require git checkout to touch every file in the tree when switching branches.

That is not true. $Date$ etc. expand to the value which holds at checkin time. That is much more useful anyway. So it doesn't change on other revisions or branches, unless the file is actually re-checked-in. From the RCS manual:

   $Date$ The  date  and  time the revision was checked in.  With -zzone a
          numeric time zone offset is appended;  otherwise,  the  date  is
          UTC.

This also means that the suggested answer above, with the rcs-keyword.smudge filter, is incorrect. It inserts the time/date of the checkout, or whatever it is that causes it to run.
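A smudge filter closer to the RCS meaning might look up the date of the last commit that touched the file instead of calling date. The following is only a sketch under assumptions: the function names are invented, the filter would have to be configured with %f so git passes the path as the first argument, and while switching branches the filter may run before HEAD has moved, so the date it finds can be stale.

```shell
#!/bin/sh
# Sketch: expand $Date$ to the date of the last commit touching a path.
# Intended to be configured as: <this script> %f

# Committer date of the most recent commit touching path $1
# (empty if there is none or git is unavailable).
last_commit_date() {
    git log -1 --format=%cd -- "$1" 2>/dev/null
}

# Expand "$Date$" on stdin using the stamp in $1; with an empty stamp,
# pass the input through unchanged rather than writing a bogus date.
expand_date() {
    if [ -n "$1" ]; then
        sed 's|\$Date[^$]*\$|$Date: '"$1"'$|g'
    else
        cat
    fi
}

# Act as a filter only when a path argument was supplied.
if [ -n "${1:-}" ]; then
    expand_date "$(last_commit_date "$1")"
fi
```

Running git log once per filtered file is slow, which is one more reason to apply this to as few files as possible.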

Rhialto