I have a project I cloned over the network to the Mac hard drive (OS X Snow Leopard).

The project takes up about 1GB on the hard drive:

du -s
2073848 .

Then I run hg clone proj proj2 and check the sizes:

MacBook-Pro ~/development $ du -s proj
2073848 proj

MacBook-Pro ~/development $ du -s proj2
894840  proj2

MacBook-Pro ~/development $ du -s
2397928 .

So the clone does not seem so cheap: proj2 is probably around 400MB. Is that so? Also, the whole folder only grew by about 200MB, which is less than the sum of proj and proj2. Is that because some files are hardlinks and some are not, so the overlapping part is not counted twice?

+1  A: 

When possible, Mercurial uses hardlinks for the repository data, but it does not use hardlinks for the working directory. Therefore, the only space it can save is that of the .hg folder.
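To illustrate why du reports a combined total smaller than the sum of the two directories, here is a minimal Python sketch. It is not how Mercurial or du is implemented; it only mimics the idea that a hardlinked file is counted once per inode. The file names (store.dat, work.txt) and sizes are made up for the example, and it sums file sizes rather than disk blocks.

```python
import os
import tempfile


def disk_usage(paths):
    """Sum file sizes under the given paths, counting each inode only once."""
    seen = set()
    total = 0
    for top in paths:
        for dirpath, _dirnames, filenames in os.walk(top):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    total += st.st_size
    return total


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        proj = os.path.join(tmp, "proj")
        proj2 = os.path.join(tmp, "proj2")
        os.mkdir(proj)
        os.mkdir(proj2)

        # Hypothetical stand-in for repository data under .hg
        with open(os.path.join(proj, "store.dat"), "wb") as f:
            f.write(b"x" * 1000)
        # Hypothetical stand-in for a working directory file
        with open(os.path.join(proj, "work.txt"), "wb") as f:
            f.write(b"y" * 300)

        # "Clone": repository data is hardlinked, working file is copied
        os.link(os.path.join(proj, "store.dat"), os.path.join(proj2, "store.dat"))
        with open(os.path.join(proj2, "work.txt"), "wb") as f:
            f.write(b"y" * 300)

        return disk_usage([proj]), disk_usage([proj2]), disk_usage([proj, proj2])


if __name__ == "__main__":
    print(demo())
```

Measured separately, each directory reports its full size, but measured together the hardlinked file is only counted once, which matches the pattern in the question's du output.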

If you're using an editor that breaks hardlinks when saving, you can run cp -al REPO REPOCLONE to use hardlinks for the entire directory, including the working directory, but be aware that this has some caveats. Quoting from the manual:

For efficiency, hardlinks are used for cloning whenever the source and destination are on the same filesystem (note this applies only to the repository data, not to the working directory). Some filesystems, such as AFS, implement hardlinking incorrectly, but do not report errors. In these cases, use the --pull option to avoid hardlinking.

In some cases, you can clone repositories and the working directory using full hardlinks with

$ cp -al REPO REPOCLONE

This is the fastest way to clone, but it is not always safe. The operation is not atomic (making sure REPO is not modified during the operation is up to you) and you have to make sure your editor breaks hardlinks (Emacs and most Linux Kernel tools do so). Also, this is not compatible with certain extensions that place their metadata under the .hg directory, such as mq.
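What "your editor breaks hardlinks" means in practice can be shown with a small Python sketch. The two helper functions below are hypothetical illustrations of two ways a tool might save a file, not the behaviour of any particular editor: writing in place keeps the same inode (so the change shows up in every hardlinked copy), while writing a temporary file and renaming it over the original gives the path a new inode (so the other copy is untouched).

```python
import os
import tempfile


def write_in_place(path, data):
    """Overwrite the existing inode; every hardlink to it sees the change."""
    with open(path, "wb") as f:
        f.write(data)


def write_breaking_link(path, data):
    """Write a new file and rename it over path; other hardlinks keep the old data."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
    os.replace(tmp_path, path)


def read(path):
    with open(path, "rb") as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as d:
        orig = os.path.join(d, "orig.txt")
        clone = os.path.join(d, "clone.txt")
        with open(orig, "wb") as f:
            f.write(b"v1")

        # Save that breaks the link: the original is unaffected
        os.link(orig, clone)
        write_breaking_link(clone, b"v2")
        after_safe_save = read(orig)

        # Save in place: the original changes along with the clone
        os.remove(clone)
        os.link(orig, clone)
        write_in_place(clone, b"v3")
        after_in_place_save = read(orig)

        return after_safe_save, after_in_place_save


if __name__ == "__main__":
    print(demo())
```

With a tool that saves in place, editing a file in the hardlinked clone would silently modify the original too, which is the risk the manual is warning about.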

Idan K
A: 

When you can get 1TB of disk space for £60, 400MB is cheap (~ 2p).

Vicky
(in USD that's $88 for 1TB, so ~3 cents for 400 MB)
Vicky
The concern here is not the space -- it is the time it takes to create a repository. Making 5000 new copies of files can also make the hard drive head work quite hard.
動靜能量
How long does it take? It can't be all that terribly long. And I really wouldn't worry about hard drive load like that. It may seem like a lot to you, but just like the price, it's trivial.
dimo414
A: 

Cheap is not the same as free. Cloning creates a new repository, which inherently has space costs. If you didn't want it to be located somewhere else on the disk, why would you bother cloning? However, it is cheap in comparison: as you note, cloning your 1GB repo only adds ~200MB to the space taken up in the parent directory, because Mercurial hardlinks the repository data instead of duplicating it.
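The numbers from the question can be worked through explicitly. This is only a rough sketch under a few assumptions that are not stated in the question: that du is reporting 512-byte blocks (the BSD du default unless BLOCKSIZE is set), that the first du -s was run inside proj, and that ~/development contains nothing besides proj and proj2.

```python
BLOCK_BYTES = 512  # assumption: default du block size on OS X


def blocks_to_mb(blocks, block_bytes=BLOCK_BYTES):
    """Convert a du block count to megabytes (10**6 bytes)."""
    return blocks * block_bytes / 10**6


# Figures copied from the du output in the question
proj_blocks = 2073848
proj2_blocks = 894840
both_blocks = 2397928

# Extra space actually consumed by the clone
growth_blocks = both_blocks - proj_blocks

# Space counted in both directories but stored only once (hardlinked data)
shared_blocks = proj_blocks + proj2_blocks - both_blocks

if __name__ == "__main__":
    print("clone, measured alone: %.0f MB" % blocks_to_mb(proj2_blocks))
    print("real growth:           %.0f MB" % blocks_to_mb(growth_blocks))
    print("shared via hardlinks:  %.0f MB" % blocks_to_mb(shared_blocks))
```

Under those assumptions the clone looks like roughly 458 MB on its own, but only about 166 MB of that is new data on disk; the remaining roughly 292 MB is shared with the original.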

More generally, I think you need to stop worrying about the intricacies of how Mercurial (or any DVCS/VCS) works. It is a given that using version control takes more disk space and more time. As the amount of data and the number of changes increase, the space and time demands increase too. What you're failing to realize is that these costs are far outweighed by the benefits of version control. The peace of mind that your work is safe, that you can't accidentally screw anything up, and the ability to look at your past work, along with the ease of distribution in the case of DVCSs, are all far more valuable.

If your concerns really outweigh these benefits, you should just stick to a plain file system, and use FTP to share/distribute/commit the source code.

dimo414