I had to transfer code (the whole project history) to another dev shop today and was wondering whether it's a good idea to zip up the bare Git repository our team uses for collaboration and just send it verbatim.

Is it safe to do so?
Is there any sensitive data stored in a .git folder?

+3  A: 

One similar solution would be a git bundle.

See Backup of github repo or Backup a Local Git Repository for more.

The .git directory (either bundled or compressed) won't contain any sensitive data beyond the history and files you put in it.
See, for example, "git - remove file from the repository" for how to remove sensitive data.

As Pat Notz mentions in his answer, a compressed .git will contain your .git/config.
I realize mine contains some remote repo addresses in which I actually had to embed my login and password for them to work. So you shouldn't include any local metadata (like .git/config), because it is meant to be... local.
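
To illustrate the point about credentials in remote addresses, here is a minimal Python sketch that scans the text of a .git/config for url/pushurl values carrying a user name or password. The function name `urls_with_credentials` and the regular expression are assumptions made for this example, not part of Git, and the check is deliberately rough: it only looks at scheme://user:password@host style URLs.

```python
import re
from urllib.parse import urlsplit

# Matches "url = <value>" and "pushurl = <value>" lines as they appear in .git/config.
URL_LINE = re.compile(r"^\s*(?:push)?url\s*=\s*(\S+)", re.IGNORECASE)


def urls_with_credentials(config_text):
    """Return the URLs in a git config text that embed a user name or password."""
    hits = []
    for line in config_text.splitlines():
        match = URL_LINE.match(line)
        if not match:
            continue
        url = match.group(1)
        parts = urlsplit(url)
        # Only scheme://user[:password]@host URLs are checked; scp-like
        # "git@host:path" remotes have no scheme and are skipped.
        if parts.scheme and (parts.username or parts.password):
            hits.append(url)
    return hits
```

You could feed it the file contents, for example `urls_with_credentials(open("project.git/config").read())`, where `project.git` is a placeholder path. An empty result does not prove the file is clean; it is still worth reading the config by eye.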

VonC
+4  A: 

If you do this instead of using clone or bundle then you'll also be giving them your .git/hooks directory, .git/config file and a few other customizable files. It's not common for those files to contain any sensitive data (you'd know because you would have put it there manually) but they may contain personalized settings. For example, you may have set the user.name and user.email config settings in .git/config. It's possible that you may have written some hook script (in .git/hooks/*) that could contain passwords -- but, like I said, you'd probably know about that already.

But Git itself does not store any of your passwords or other secret/sensitive data on its own.

Pat Notz
Right! The `.git/config` file can be sensitive. +1
VonC
+4  A: 

Take a look in your .git directory. There may be lots of files but they fall into a fairly small number of regular groups (object store data, refs, reflogs, etc.). You can separate the contents into two major categories: data that Git might normally transport to other repositories and data that Git would not normally transport to other repositories.

Not normally transported:

  • HEAD, FETCH_HEAD, ORIG_HEAD, MERGE_HEAD
  • config
  • description
  • hooks/
  • index
  • info/ — miscellaneous
  • logs/ — reflogs

Normally transported (e.g. via clones, fetches, pushes, and bundles):

  • objects/
  • packed-refs
  • refs/

This last group makes up the object store and its published entry points. You will obviously have to review the versioned content itself to determine if there is anything sensitive in it.
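
The two lists above can be turned into a quick triage helper. This is only a sketch in Python: the name `split_entries` is made up for the example, and the set of "transported" names is taken directly from the lists in this answer rather than from any Git API.

```python
# Top-level entries that clones, fetches, pushes, and bundles carry over,
# following the grouping in the lists above.
TRANSPORTED = {"objects", "packed-refs", "refs"}


def split_entries(names):
    """Split top-level .git entry names into (transported, local_only) sorted lists."""
    transported, local_only = [], []
    for name in sorted(names):
        if name in TRANSPORTED:
            transported.append(name)
        else:
            local_only.append(name)
    return transported, local_only
```

Calling it as `split_entries(os.listdir("project.git"))` (after `import os`, with `project.git` as a placeholder path) gives you the short list of local-only entries to review by hand before sending a verbatim copy.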

The HEADs, the index (not present in a bare repository), and the reflogs (logs/) are all additional entry points into the object store. If you have done any history rewriting (e.g. you recently expunged some sensitive configuration file from the recorded history) you will want to pay particular attention to the reflogs (probably not enabled on most bare repositories) and the refs/original/ portion of the refs namespace.
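
As a rough aid for the refs/original/ check, the following Python sketch lists the refs recorded in a packed-refs file that live under a given prefix. The function name `refs_under` is an assumption for this example. It only reads packed-refs; loose refs stored as individual files under refs/original/ and the reflogs under logs/ still need to be checked separately.

```python
def refs_under(packed_refs_text, prefix="refs/original/"):
    """Return the ref names in a packed-refs file body that start with prefix."""
    names = []
    for line in packed_refs_text.splitlines():
        line = line.strip()
        # Skip blank lines, the "# pack-refs with: ..." header,
        # and "^<sha>" lines that record peeled tags.
        if not line or line[0] in "#^":
            continue
        _sha, _, name = line.partition(" ")
        if name.startswith(prefix):
            names.append(name)
    return names
```

Any name it returns points at history that a rewrite left reachable, so the objects behind it would still travel with a verbatim copy of the repository.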

FETCH_HEAD and config might have the addresses of related Git repositories.

config might have other sensitive information.

info/ has various bits of information: some of it could be sensitive (info/alternates); some is less likely to be sensitive, assuming the content itself is “clean” (info/refs, info/packs); some might be important to the operation of the repository (info/grafts). Any add-on tools you were using (hook scripts, web interfaces, etc.) might be storing data in here; some of it may be sensitive (access control lists, etc.).

If there is anything active in hooks/, you will want to evaluate whether it should be provided with the copy of the repository (also check whether it stores any extra data anywhere in the repository).

The description file is probably innocuous, but you might as well check it.


If you are only handing off the content, then you should probably just clone to a fresh bare repository or use a bundle (git bundle as VonC describes).

If you are responsible for handing off the content as well as the process you use to manage it, then you will have to investigate each bit of the repository individually.


In general (or in a more “paranoid” fashion), there are many places inside a Git repository’s hierarchy where someone could store any random file. If you want to be sure you are only giving away data that Git needs, you should use a clone or bundle. If you need to supply some of your “per repository” data (e.g. some hook used to manage the repository) then you should clone to a fresh bare repository (use a file: URL to avoid copying and hardlinking of existing object store files) and reinstall only the hooks/data that are needed to fulfill your obligations.
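
A minimal Python sketch of the clone-to-a-fresh-bare-repository route, with a bundle made from the clean copy. The function names and all paths are placeholders chosen for this example; the Git commands themselves (`git clone --bare`, `git bundle create ... --all`, `git bundle verify`) are standard ones. It assumes `git` is on the PATH and that the destination does not exist yet.

```python
import os
import subprocess
from pathlib import Path


def handoff_commands(source_repo, fresh_repo, bundle_file):
    """Build the git commands for a clean bare clone plus a bundle made from it."""
    # A file:// URL makes git use its normal transport, so objects are
    # transferred as in a network clone instead of being copied or hardlinked.
    source_url = Path(os.path.abspath(source_repo)).as_uri()
    fresh = os.path.abspath(fresh_repo)
    bundle = os.path.abspath(bundle_file)
    return [
        ["git", "clone", "--bare", source_url, fresh],
        # Run inside the fresh clone so only its refs end up in the bundle.
        ["git", "-C", fresh, "bundle", "create", bundle, "--all"],
        ["git", "-C", fresh, "bundle", "verify", bundle],
    ]


def run_handoff(source_repo, fresh_repo, bundle_file):
    """Run the commands in order, stopping at the first failure."""
    for command in handoff_commands(source_repo, fresh_repo, bundle_file):
        subprocess.run(command, check=True)
```

For example, `run_handoff("project.git", "handoff.git", "project.bundle")` would leave both a fresh bare repository and a single-file bundle to send. Any hooks or other per-repository data you are obliged to hand over would then be copied into the fresh repository by hand.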

Chris Johnsen
Nice -- lots of good details.
Pat Notz
Like the gentleman says, lots of details. +1
VonC