I have a recurring problem with my Git repositories. I develop on Windows and my production site runs on Linux. Several times, Git has shown every tracked file as modified. I thought this was caused by a configuration issue or a conflict between Windows and Linux, but this morning, when I checked the Linux repo, it too was showing all files as modified.

To add insult to injury, both Linux repos I use (one for prod, one for test) were showing the same thing. I had no choice but to commit all the files, since neither a hard reset nor a checkout changed anything in the working directory (yup, I pretty much suck at this). This is the result of the commit:

Created commit #######: Git, you are so mean...
1521 files changed, 302856 insertions(+), 302856 deletions(-)

Any ideas on how to sort this out next time it happens?

+6  A: 

Sounds like a line-ending issue. Check man git-config for core.autocrlf.

Bombe
Is this only important for Windows or is it worth doing on Linux as well?
Steve Folly
I have no idea, I have only used Git under Linux so far and I have never needed to fiddle around with those settings.
Bombe
If you ever receive files originating from Windows, you might want to consider `core.autocrlf=input`, which does a one-way Windows->Unix conversion if applicable.
Charles Bailey
+8  A: 

As Bombe says, this sounds like a line-ending problem. The simplest discussion of this that I've seen is this guide at GitHub.

You want to set core.autocrlf on your Windows system so that Git automatically converts line endings to CRLF when you check out from the repo into the working directory and, conversely, converts all CRLFs to LF when you commit files to the repo:

$ git config --global core.autocrlf true

Then you can ensure that your repo has consistent line endings either by cloning from your Linux repo or by running git reset, which checks out a fresh working copy and applies the new autocrlf setting to all the LFs (note that a hard reset discards any uncommitted changes):

$ git reset --hard HEAD
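To see the conversion described above in action, here is a minimal sketch that runs in a throwaway repository. The directory and the file name sample.txt are made up for the illustration, the setting is applied per-repository rather than with --global, and git ls-files --eol needs Git 2.8 or later.

```shell
# Throwaway repository; the directory and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config core.autocrlf true   # repo-local here, so the global config is untouched

# A small text file with Windows (CRLF) line endings.
printf 'one\r\ntwo\r\n' > sample.txt

# Staging applies the CRLF -> LF conversion (Git may print a warning about it).
git add sample.txt 2>/dev/null

# "i/" reports the line endings stored in the index, "w/" those in the working tree.
eol_info=$(git ls-files --eol sample.txt)
echo "$eol_info"
```

The i/ and w/ columns should differ for this file: LF in the index, CRLF in the working tree. If a file in a real repo shows CRLF in the index, it was committed without conversion.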

EDIT -- If Git doesn't recognize a file as binary, autocrlf can corrupt it by converting bytes that it takes to be CRLFs or LFs. If your repo includes unusual binary file types, declare their conversion type explicitly in .gitattributes, as described in the manpage.
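A minimal sketch of such a declaration, again in a throwaway repository. The patterns *.caf and *.ai and the path names are only examples, not something taken from a real project; adjust them to the file types you actually have.

```shell
# Throwaway repository so the experiment cannot touch a real one.
repo=$(mktemp -d)
cd "$repo"
git init -q

# "binary" is a built-in macro attribute that turns off text conversion
# and textual diffs. The two patterns are examples only.
cat > .gitattributes <<'EOF'
*.caf binary
*.ai  binary
EOF

# Ask Git how it will treat some hypothetical paths (they need not exist).
caf_attr=$(git check-attr text -- sound.caf)
txt_attr=$(git check-attr text -- notes.txt)
echo "$caf_attr"
echo "$txt_attr"
```

A text attribute reported as unset means Git will not do any line-ending conversion on that path, whatever core.autocrlf says.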

Paul
git has fairly comprehensive heuristics for detecting whether a file is binary or not. It is exceedingly rare for it to corrupt audio or image files. .gitattributes is the place to specify that certain files or globs are binary. Putting them in .gitignore will stop git tracking them completely.
Charles Bailey
Thanks, Charles. My bad experience with digital files (CAF files and Illustrator files, I think it was) was last summer -- I'm glad to know it's out-of-date.
Paul
We needed to set autocrlf to false on the Windows machine, not true, to get this to work (using msysgit).
Sean Clark Hess
A: 

I think the real question you need to answer is how the files differ, and whether that difference is what you expect to see.

The traditional default is that Git does not alter file contents at all on a git add to the repository. More recent Windows Git installers enable core.autocrlf, which translates Unix line endings to Windows line endings on checkout and does the reverse on addition to the repository.

For this reason, if you have more unstaged changes than you expect, it is often a good idea to git add all the pending files (e.g. via git add -u).

At this stage any clean/smudge filters will have been applied, and git diff --cached should give a reasonable diff.

If you have staged files that git thinks are different, but the difference is not visible, you may want to have a look at the raw bytes to see if there are any differences in invisible characters.

You might use a tool such as hexdump for this.

Suppose that myfile.txt has differences that are not visible. You might want to try something like this:

# Extract raw versions of the differing files and hexdump to some temporary files
git cat-file blob :myfile.txt | hexdump -C >myfile-stagetmp.bytes
git cat-file blob HEAD:myfile.txt | hexdump -C >myfile-headtmp.bytes

# Diff them. (Yes, you don't have to use git diff!)
git diff --no-index myfile-stagetmp.bytes myfile-headtmp.bytes
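
If the hexdump diff shows 0d 0a on one side where the other has only 0a, the invisible difference is a carriage return. As a rough sketch of a quicker check, the following counts CRLF-terminated lines with plain POSIX tools; sample.txt is a made-up file created just for the demonstration.

```shell
# Hypothetical sample file with Windows (CRLF) line endings.
tmpdir=$(mktemp -d)
printf 'first line\r\nsecond line\r\n' > "$tmpdir/sample.txt"

# Hold a literal carriage return in a variable (works in any POSIX shell).
cr=$(printf '\r')

# Count the lines that end in CR, i.e. the CRLF-terminated lines.
crlf_lines=$(grep -c "${cr}\$" "$tmpdir/sample.txt")
echo "CRLF-terminated lines: $crlf_lines"

# Strip the carriage returns and count again.
# grep -c exits non-zero when it finds nothing, hence the "|| true".
tr -d '\r' < "$tmpdir/sample.txt" > "$tmpdir/sample-lf.txt"
lf_crlf_lines=$(grep -c "${cr}\$" "$tmpdir/sample-lf.txt" || true)
echo "After stripping CRs: $lf_crlf_lines"
```

A non-zero count on a file that should be LF-only points to line endings as the source of the phantom modifications.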
Charles Bailey