Note: one of the biggest differences between Git and Mercurial is the explicit presence of the index or staging area.
From Mercurial for Git User:
Git is the only DistributedSCM that exposes the concept of index or staging area. The others may implement and hide it, but in no other case the user is aware nor has to deal with it.
Mercurial's rough equivalent is the DirState, which controls working copy status information to determine the files to be included in the next commit. But in any case, this file is handled automatically.
Additionally, it is possible to be more selective at commit time either by specifying the files you want to commit on the command line or by using the RecordExtension.
If you felt uncomfortable dealing with the index, you are switching for the better ;-)
The trick is, you really need to understand the index to fully exploit Git. As this article from May 2006 reminded us then (and it is still true now):
“If you deny the Index, you really deny git itself.”
Now, that article contains many commands which are now simpler to use (so do not rely on its content too much ;) ), but the general idea remains:
You are working on a new feature and start making minor modifications to a file.
# working, add a few lines
$ git add myFile
# working, another minor modification
$ git add myFile
At this point, your next commit will include 2 minor modifications in the current branch
# working, making major modification for the new features
# ... damn! I cannot commit all this in the current branch: nothing would work
$ git commit
Only records the changes added to the staging area (index) at this point, not the major changes currently visible in your working directory.
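$ git checkout -b newFeature_Branch
$ git add myFile
The next commit will record all the other major changes in the new branch 'newFeature_Branch'. (A plain 'git branch newFeature_Branch' only creates the branch without switching to it, so the commit would still land on the current branch.)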
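The walkthrough above can be replayed in a throwaway repository. This is only a minimal sketch: the temporary directory, the identity settings and the commit messages are assumptions made for illustration, and it uses `git checkout -b` so that the new branch is actually checked out before the last commit.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repository (illustrative)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
base_branch=$(git symbolic-ref --short HEAD)

echo "base" > myFile
git add myFile
git commit -q -m "initial"

echo "minor change" >> myFile
git add myFile                         # the index now holds the minor change
echo "major change" >> myFile          # this one only exists in the working tree

# What the next commit would record vs. what is still unstaged
staged=$(git diff --cached myFile | grep '^+[^+]' || true)
unstaged=$(git diff myFile | grep '^+[^+]' || true)
echo "staged:   $staged"
echo "unstaged: $unstaged"

git commit -q -m "minor change only"   # records the index, not the working tree

git checkout -q -b newFeature_Branch   # the unstaged change travels along
git add myFile
git commit -q -m "major change"

on_base=$(git show "$base_branch:myFile")
on_feature=$(git show newFeature_Branch:myFile)
```

After this runs, the original branch holds only the minor change, while 'newFeature_Branch' holds both.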
Now, adding interactively or even splitting a commit are features available with Mercurial, through the 'hg record' command or other extensions: you will need to install RecordExtension, or the CrecordExtension.
But this is not part of the normal workflow for Mercurial.
Git views a commit as a series of "file content changes", and lets you add those changes one at a time.
You should study that feature and its consequences: most of Git's power (such as the ability to easily revert a merge, bisect a problem, or revert a commit, in contrast to Mercurial) comes from that "file content" paradigm.
tonfa (in his profile: "Hg dev, pythonist": figures...) chimed in, in the comments:
There's nothing fundamentally "git-ish" in the index, hg could use an index if it was deemed valuable, in fact mq or shelve already do part of that.
Oh boy. Here we go again.
First, I am not here to make one tool look better than another. I find Hg great, very intuitive, with good support (especially on Windows, my main platform, although I work on Linux and Solaris 8 or 10 too).
The index is actually front and center in the way Linus Torvalds works with a VCS:
Git used explicit index updates from day 1, even before it did the first merge. It's simply how I've always worked. I tend to have dirty trees, with some random patch in my tree that I do not want to commit, because it's just a Makefile update for the next version
Now the combination of the index (which is not a notion seen only in Git), and the "content is king" paradigm makes it pretty unique and "git-ish":
git is a content tracker, and a file name has no meaning unless associated to its content. Therefore, the only sane behavior for git add filename is to add the content of the file as well as its name to the index.
Note: the "content", here, is defined as follows:
Git's index is basically very much defined as
- sufficient to contain the total "content" of the tree (and this includes all metadata: the filename, the mode, and the file contents are all parts of the "content", and they are all meaningless on their own!)
- additional "stat" information to allow the obvious and trivial (but hugely important!) filesystem comparison optimizations.
So you really should see the index as being the content.
The content is not the "file name" or the "file content" as separate parts. You really cannot separate the two.
Filenames on their own make no sense (they have to have file content too), and file content on its own is similarly senseless (you have to know how to reach it).
What I'm trying to say is that git fundamentally doesn't allow you to see a filename without its content. The whole notion is insane and not valid. It has no relevance for "reality".
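One way to look at what the index actually stores is `git ls-files --stage`, which lists mode, blob id, stage number and path for each entry. The sketch below is only an illustration; the file names 'myFile' and 'copy' are made up for the example.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repository (illustrative)
cd "$repo"
git init -q

echo "hello" > myFile
git add myFile

# Each index entry: <mode> <blob id> <stage>	<path>
git ls-files --stage

blob=$(git ls-files --stage myFile | awk '{print $2}')
content=$(git cat-file -p "$blob")     # the content the entry points to
echo "$content"

# The same content under another name points to the same blob
cp myFile copy
git add copy
blob_copy=$(git ls-files --stage copy | awk '{print $2}')
```

The two entries differ only by path: name and content are recorded together in the index, and identical content is stored once.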
From the FAQ, the main advantages are:
- commit with a fine granularity
- help you to keep an uncommitted modification in your tree for a reasonably long time
- perform several small steps for one commit, checking what you did with git diff, and validating each small step with git add or git add -u.
- allows excellent management of merge conflicts: git diff --base, git diff --ours, git diff --theirs.
- allows git commit --amend to amend only the log message if the index hasn't been modified in the meantime
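The merge-conflict point can be seen directly in the index: during a conflict it keeps up to three versions of the file (stage 1 = common base, stage 2 = ours, stage 3 = theirs), which is what `git diff --base`, `--ours` and `--theirs` compare against. A minimal sketch, with branch and file names invented for the example:

```shell
set -e
repo=$(mktemp -d)                      # throwaway repository (illustrative)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
base_branch=$(git symbolic-ref --short HEAD)

echo "original" > myFile
git add myFile
git commit -q -m "base"

git checkout -q -b other
echo "theirs" > myFile
git commit -q -am "other side"

git checkout -q "$base_branch"
echo "ours" > myFile
git commit -q -am "our side"

git merge other > /dev/null 2>&1 || true   # expected to stop on a conflict

git ls-files -u myFile                     # one index entry per stage
stage_base=$(git show :1:myFile)
stage_ours=$(git show :2:myFile)
stage_theirs=$(git show :3:myFile)

git diff --ours myFile                     # working tree vs. our version
git diff --theirs myFile                   # working tree vs. their version
```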
I personally think this behavior shouldn't be the default, you want people to commit something that is tested or at least compiled
While you are right in general (about the "tested or compiled" part), the way Git handles branching and merging (cherry-picking or rebasing) lets you commit as often as you want in a temporary private branch (pushed only to a remote "backup" repository), and then redo those "ugly commits" on a public branch, with all the right tests in place.
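As an illustration of that workflow, here is one possible sketch. It uses `git merge --squash` to replay the private work as a single commit, which is just one option next to cherry-picking or rebasing; the branch name 'wip' and the commit messages are assumptions.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repository (illustrative)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"
public=$(git symbolic-ref --short HEAD)

echo "v1" > myFile
git add myFile
git commit -q -m "initial"

# Private branch: commit as often as you like, even broken states
git checkout -q -b wip
echo "half done" >> myFile
git commit -q -am "wip: broken"
echo "done" >> myFile
git commit -q -am "wip: works now"

# Public branch: redo the work as one clean commit (run your tests before committing)
git checkout -q "$public"
git merge --squash wip > /dev/null
git commit -q -m "Add feature"

public_count=$(git rev-list --count "$public")
wip_count=$(git rev-list --count wip)
echo "public: $public_count commits, wip: $wip_count commits"
```

The public branch ends up with the same content as 'wip' but without the intermediate commits.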