I am currently working with a subversion repository but I am using git to work locally on my machine. It makes work much easier, but it also makes some of the bad behavior going on in the subversion repo quite glaring and that creates problems for me.

There is a somewhat complex local build process after pulling down the code, and it creates (and unfortunately modifies) a number of files. Obviously these changes are not meant to be committed back to the repository. The build process is actually modifying some tracked files (yes, most likely because someone mistakenly committed these build artifacts to the subversion repository at some point). Since these are modifications to tracked files, adding them to my ignore file does nothing for me.

I can avoid checking these changes back in: I simply don't stage or commit them. But having unstaged local changes means I can't rebase without first cleaning them up.

What I would like to know is whether there is any way to ignore future changes to a set of tracked files. Alternatively, is there another way to handle the problem I am having, or will I just have to tell whoever checked in these files to clean them up?

+1  A: 

Unless there's some serious political brain damage going on, removing the artifacts from source control is the correct step. (Or rather, the "most expedient" step; it's always the correct one.)

I am not aware of a way to tell git to ignore changes to tracked files.

Nathan Kidd
+2  A: 

As Nathan said, cleaning up those files (un-tracking them) is the smart move.

But if you must ignore tracked files (which is not the native Git way when it comes to ignoring files: Git only ignores non-tracked files), you can set up a process that copies the content of the files you want to ignore and restores it on commit.

I initially believed that a smudge/clean process, that is, a gitattributes filter driver, could do the trick, where:

  • the smudge process will make a copy of those files (when updating the working tree)
  • some modifications take place during the build
  • the clean step (during commit) will overwrite the files' content with the copy made in the first step.

BUT, as stated in this post, that would mean abusing this stateless file content transformation by adding a stateful context (i.e. the full path name of the file being smudged/cleaned).
And that is explicitly forbidden by J.C. Hamano:

Although I initially considered interpolating "%P" with pathname, I ended up deciding against it, to discourage people from abusing the filter for stateful conversion that changes the results depending on time, pathname, commit, branch and stuff.

and even Linus Torvalds had some reservations at the time about the whole mechanism:

I have to say, I'm obviously not a huge fan of playing games, but the diffs are very clean.

Are they actually useful? I dunno. I'm a bit nervous about what this means for any actual user of the feature, but I have to admit to being charmed by a clean implementation.

I suspect that this gets some complaining off our back, but I also suspect that people will actually end up really screwing themselves with something like this and then blaming us and causing a huge pain down the line when we've supported this and people want "extended semantics" that are no longer clean.

But I'm not sure how valid an argument that really is. I do happen to believe in the "give them rope" philosophy. I think you can probably screw yourself royally with this, but hey, anybody who does that only has himself to blame


So the right place to add some kind of save/restore mechanism (and so effectively ignore any changes to a set of tracked files in Git) would be in hooks:

  • post-checkout: invoked after a git checkout has updated the worktree. There you can run a script that collects all the files to ignore and saves them somewhere.

  • pre-commit: there you can run a second script that restores the content of those files, before the proposed commit log message is obtained and the commit is made.
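
To make the two hooks above more concrete, here is a minimal sketch of the save/restore pair in Python. It is only an illustration under my own assumptions: the file list `PROTECTED`, the backup location `BACKUP_DIR`, and the function names `save_files` and `restore_files` are all hypothetical, not something Git provides.

```python
import shutil
from pathlib import Path

# Hypothetical list of tracked files that the build rewrites (repo-relative).
PROTECTED = ["build/version.txt", "config/generated.xml"]

# Hypothetical backup location, relative to the repository root.
# Files under .git/ are never tracked, so the copies cannot be committed.
BACKUP_DIR = Path(".git") / "protected-backup"


def save_files(root, paths=PROTECTED, backup_dir=BACKUP_DIR):
    """Copy each protected file into the backup directory (post-checkout step).

    Returns the list of relative paths that were actually saved.
    """
    root = Path(root)
    saved = []
    for rel in paths:
        src = root / rel
        if not src.is_file():
            continue  # the file may not exist on this branch
        dst = root / backup_dir / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        saved.append(rel)
    return saved


def restore_files(root, paths=PROTECTED, backup_dir=BACKUP_DIR):
    """Overwrite each protected file with its saved copy (pre-commit step).

    Returns the list of relative paths that were actually restored.
    """
    root = Path(root)
    restored = []
    for rel in paths:
        backup = root / backup_dir / rel
        if not backup.is_file():
            continue  # nothing was saved for this file
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(backup, target)
        restored.append(rel)
    return restored
```

A post-checkout hook would then call `save_files(".")` and a pre-commit hook would call `restore_files(".")`, assuming both hooks run from the top of the working tree. Note that this only rewrites the files in the working tree; it does not touch anything that has already been staged.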

VonC
I agree, I just wanted a backup plan in case that wasn't going to happen. I have to work with the repository, but the files are not mine and I am using git to work with it.
Chris Nicola
I can see how I would copy the files, though I'm not sure how to reference the file name in the clean filter. Is there a variable for that?
Chris Nicola
@Chris: after reading http://lists.zerezo.com/git/msg422632.html, I am not so sure that this filter driver is the right tool, since J.C. Hamano wanted it stateless.
VonC
Yeah that is the way it seems to me too. I suppose a sophisticated enough parser for the blob might work, but that's too much work. I'll just have to make them clean it up ;P.
Chris Nicola
@Chris: I have updated my answer with the "right" solution
VonC
Nice!! Very well written and researched answer.
Surya
Yes thanks that is awesome. Again I want to stress that I do agree with Nathan as well about having the team clean SVN, but the reality is often users of SVN don't follow the same "clean" habits as git users. So having a fall-back solution like this is still very useful to a git user working with a large shared SVN repository.
Chris Nicola