How does the decentralized behaviour of Distributed Revision Control Systems work?

I'm partial to Git, but I believe the theory applies to most other systems...

A decentralized VCS is designed to handle branching and merging as part of its DNA: every commit keeps a pointer to its parent commit (or to several parents, in the case of a merge), so any two lines of development can be traced back to a common ancestor.

Sequential revision "numbers" aren't used to refer to commits, because every copy of the repository would end up with its own, conflicting sequence. In Git, the key that uniquely identifies a commit is a SHA-1 hash. The only thing that gives the history an order is the graph of pointers from each commit to its parent.
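To make the idea concrete, here is a minimal sketch of hash-identified commits with parent pointers. It is not how Git actually serializes commits; the helpers commit_id, add_commit and ancestors, and the dictionary layout, are assumptions made up for illustration.

```python
import hashlib

def commit_id(message, parents):
    """Derive an ID from the commit's content and its parents' IDs."""
    payload = "\n".join(list(parents) + [message])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()

def add_commit(store, message, parents=()):
    """Record a commit in `store` (ID -> commit dict) and return its ID."""
    cid = commit_id(message, parents)
    store[cid] = {"message": message, "parents": tuple(parents)}
    return cid

def ancestors(store, cid):
    """Return every ID reachable from `cid` by following parent pointers."""
    seen, stack = set(), [cid]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(store[current]["parents"])
    return seen

store = {}
root = add_commit(store, "initial import")
alice = add_commit(store, "alice: add parser", [root])  # one branch
bob = add_commit(store, "bob: fix typo", [root])        # a diverging branch

# The two branches share exactly one commit: their common ancestor.
shared = ancestors(store, alice) & ancestors(store, bob)
print(shared == {root})  # True
```

Because the ID depends on the parents' IDs, two developers can create commits independently without ever colliding on a "next" revision number.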

In practice, a developer commits their work to their own local copy, and when it's time to share it with others, they do so in one of three ways:

Ask the other developer to pull the changes directly from them
Push directly into the other developer's copy
Push the changes to a central location that others can pull from

These are really the same thing in the end because it just comes down to merging the diffs. In the third scenario the central location merely acts as a proxy--the same thing can be achieved without it.
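A rough sketch of why the three routes are equivalent: in each case the receiving repository simply copies whichever commits it is missing. The fetch function, the commit IDs "c1" to "c3", and the dictionary layout are hypothetical; the sketch also assumes that a repository holding a commit already holds all of that commit's ancestors.

```python
def fetch(dest, src, tip):
    """Copy into `dest` every commit reachable from `tip` that it lacks.

    Returns the number of commits copied. A push is the same transfer,
    just initiated by the sending side.
    """
    copied = 0
    stack = [tip]
    while stack:
        cid = stack.pop()
        if cid in dest:
            continue  # assumed: dest then already has this commit's ancestors
        dest[cid] = src[cid]
        copied += 1
        stack.extend(src[cid]["parents"])
    return copied

base = {"c1": {"message": "initial import", "parents": ()}}

alice = dict(base)
alice["c2"] = {"message": "add parser", "parents": ("c1",)}
alice["c3"] = {"message": "add tests", "parents": ("c2",)}

# Route 1: Bob pulls directly from Alice.
bob_direct = dict(base)
fetch(bob_direct, alice, "c3")

# Route 3: Alice pushes to a central copy, then Bob pulls from it.
central = dict(base)
fetch(central, alice, "c3")
bob_via_central = dict(base)
fetch(bob_via_central, central, "c3")

print(bob_direct == bob_via_central)  # True
```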

The system can be as centralized or decentralized as you choose to make it. Most projects have some amount of centralization for practical reasons, but at any point a fork can become the new central repository, or developers can choose to trade code between themselves ad-hoc.

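When commits are fetched and combined with your own copy, the common ancestor you share with the upstream repository is the reference point: only the changes made since then on each side need to be reconciled. When commits are replayed one at a time (in Git, a rebase, for example git pull --rebase), a conflict pauses the process at the commit where it occurred, and you are asked to resolve it before the remaining commits are applied on top of it. A plain git merge instead combines both sides in a single step and reports all conflicts at once. (Conflicting regions are marked in the affected files with <<<<<<<, =======, and >>>>>>> markers.)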

Most merges happen automatically, and when there is a conflict it is often trivial to resolve. The nice thing about replaying commits one by one is that you don't end up with a ball of conflicts spanning several commits: the process pauses in the middle of the history and lets you deal with each conflict in smaller, logical chunks.
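The pause-at-the-conflicting-commit behaviour described above can be sketched as follows. This is a deliberately simplified model, not Git's algorithm: each commit is assumed to record whole-file contents as (expected_old, new) pairs, whereas real tools perform a line-level three-way merge. The replay function and the sample data are hypothetical.

```python
def replay(tree, commits):
    """Apply `commits` in order to a copy of `tree`.

    Each commit maps a file name to an (expected_old, new) pair of contents.
    Returns (resulting_tree, index_of_conflicting_commit or None).
    """
    result = dict(tree)
    for index, commit in enumerate(commits):
        # Check the whole commit first so it is applied all-or-nothing.
        for name, (old, _new) in commit.items():
            if result.get(name) != old:
                return result, index  # pause here; later commits are untouched
        for name, (_old, new) in commit.items():
            result[name] = new
    return result, None

# Upstream changed main.py after the local commits were written.
upstream = {"README": "hello", "main.py": "print(2)"}
local_commits = [
    {"README": ("hello", "hello world")},
    {"main.py": ("print(1)", "print(3)")},  # conflicts with upstream
    {"README": ("hello world", "hello, world")},
]

tree, stopped_at = replay(upstream, local_commits)
print(stopped_at)      # 1
print(tree["README"])  # hello world
```

The first commit has already been applied cleanly when the process stops, so only the second commit needs attention before the third can be replayed.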
