+3  Q: 

Git File Integrity

Recently the main machine I use for development started overheating, and I now get 4 or 5 lockups per day where everything freezes. All my projects are under version control using Git. I remember watching Linus's talk at Google, where he said Git will ensure that the files are not corrupt. In my situation, is it safe to assume that Git will warn me if one of the source files gets corrupted?

The OS is Mac OS X 10.4 and the file system is HFS+.

+4  A: 

You can force Git to check the whole repository with git fsck. If a Git repository gets corrupted, you should get a new clone from a non-corrupted repository.

Under normal operation Git should check parts of the repository as they are read, so it might take longer to notice some corruption, but it will be noticed the first time that you try to access the corrupt data.
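
To make the "checked as they are read" idea concrete, here is a minimal Python sketch of the underlying check: an object's ID is the SHA-1 of a short header plus the content, so stored bytes that no longer hash to the ID they are filed under must be damaged. The helper names (git_object_id, make_loose_object, verify_loose_object) are made up for illustration and are not part of Git; the code only simulates a loose object in memory and does not touch a real repository.

```python
import hashlib
import zlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 ID Git assigns to `content` stored as `obj_type`."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def make_loose_object(obj_type: str, content: bytes) -> bytes:
    """Build the zlib-compressed bytes of a loose object (illustrative helper)."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return zlib.compress(header + content)


def verify_loose_object(compressed: bytes, expected_id: str) -> bool:
    """Return True only if the stored bytes still hash to the expected ID."""
    try:
        raw = zlib.decompress(compressed)
    except zlib.error:
        return False  # the compressed stream itself is damaged
    return hashlib.sha1(raw).hexdigest() == expected_id


content = b"hello world\n"
object_id = git_object_id("blob", content)
stored = make_loose_object("blob", content)
# Simulate on-disk damage: one character of the content has changed.
damaged = make_loose_object("blob", b"hellp world\n")

print(verify_loose_object(stored, object_id))   # intact object
print(verify_loose_object(damaged, object_id))  # damaged object
```

The intact object passes and the damaged one fails, which is the same kind of mismatch git fsck reports when it walks the whole object store.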

Esko Luontola
What happens if I try to push to a remote before noticing the corruption? Does it corrupt the remote as well, or does it complain about it?
Hamza Yerlikaya
Pushing to the remote implies packing all the files together, and the receiver (where you're pushing to) has to recalculate the SHA-1s of all the files. So if a file was corrupted somehow, the object IDs in the trees would start to mismatch and the corruption would show up. You can always roll back to where you were before and run git fsck to find the problems on your side.
araqnid
Finally, object files are immutable, so once they are written, their content never changes. The only operation that occurs is repacking, so you can't corrupt the remote by pushing, as it won't write another copy of a file it already has.
Autocracy
+1  A: 

When Linus said that Git ensures the files are not corrupted, he was referring to the fact that when you refer to a particular commit (identified by its hash), you are guaranteed that it will always refer to the exact same repository state. If you pull the Linux kernel from Linus' tree, and he refers to some commit ae6bcd1..., there is nothing that you can do (even in your local repository) to ever make commit ae6bcd1... look any different from the commit Linus is looking at when he refers to it.

Furthermore, because a commit object contains references to (all of) its parent commit(s), when you refer to a commit you are guaranteeing its complete history in the DAG as well.
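
A toy Python sketch of why this holds: each commit's ID is a hash over its snapshot ID and its parents' IDs, so changing anything in an ancestor changes every descendant's ID. The commit format below is deliberately simplified (real commits also record author, committer and dates), and toy_commit_id, build_history and the "aaa"-style tree IDs are placeholders invented for this example.

```python
import hashlib


def toy_commit_id(tree_id: str, parent_ids: list[str], message: str) -> str:
    """Hash a simplified commit: snapshot ID, parent IDs, and message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parent_ids]
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")
    header = f"commit {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def build_history(tree_ids: list[str]) -> list[str]:
    """Chain one toy commit per snapshot, each pointing at the previous one."""
    ids: list[str] = []
    for n, tree_id in enumerate(tree_ids):
        parents = ids[-1:]  # empty list for the root commit
        ids.append(toy_commit_id(tree_id, parents, f"commit {n}"))
    return ids


original = build_history(["aaa", "bbb", "ccc"])
tampered = build_history(["aaa", "bbX", "ccc"])  # only the middle snapshot differs

print(original[0] == tampered[0])  # root commit: untouched, same ID
print(original[1] == tampered[1])  # altered commit: different ID
print(original[2] == tampered[2])  # descendant: different ID, via its parent line
```

The last commit has the same snapshot in both histories, yet its ID still differs because it embeds the ID of the altered parent.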

File corruption is a somewhat independent issue. As long as the actual blob objects (i.e. .git/objects/ob/ject_hashname) are not corrupted, if one of your working tree files gets corrupted you will be able to restore it from a previous commit state or from the index/cached state.

You will never be able to corrupt a remote in this case unless you are doing forced pushes (which overwrite history on remotes), since push ensures the commit objects form a continuous history graph.

Matt Enright
So basically, as soon as I try to push a corrupt repo it will warn me, and I can always clone my repo again and be safe.
Hamza Yerlikaya