I just read Spolsky's last piece about Distributed vs. Non-Distributed version control systems http://www.joelonsoftware.com/items/2010/03/17.html. What's the difference between the two? Our company uses TFS. What camp does this fall in?
views:
432answers:
6Simply speaking, a centralized VCS (including TFS) system has a central storage and each users gets and commits to this one location.
In distributed VCS, each user has the full repository and can make changes that are then synchronized to other repositories, a server is usually not really necessary.
A centralized VCS (CVCS) involves a central server that is interacted with. A distributed VCS (DVCS) doesn't need a centralized server.
DVCS checkouts are complete and self-contained, including repository history. This is not the case with CVCS.
With a CVCS, most activities require interacting with the server. Not so with DVCS, since they are "complete" checkouts, repo history and all.
You need write access to commit to a CVCS; users of DVCS "pull" changes from each other. This leads to more social coding facilitated by the likes of GitHub and BitBucket.
Those are a few relevant items, no doubt there are others.
Check out http://hginit.com. Joel wrote a nice tutorial for Mercurial, which is a DVCS. I hadn't done any reading about DVCS before (I've always used SVN) and I found it easy to understand.
The difference is in the publication process:
- a CVCS (Centralized) means: to see the work of your colleague, you must wait for them to publish (commit) to the central repository. Then you can update your workspace.
- You are an active producer: if you don't publish anything, nobody sees anything.
- You are a passive consumer: you discover new updates when you refresh your workspace, and have to deal with those changes whether you want it or not.
.
- a DVCS means: there is no "one central repository", but every workspace is a repository, and to see the work of your colleague, you can refer to his/her repo and simply pulled its history into your local repo.
- You are a passive producer: anyone can "plug in" into your repo and pull local commits that you did into their own local repo.
- You are an active consumer: any update you are pulling from other repo is not immediately integrated into your active branch unless you explicitly make it so (through merge or rebase).
Version Control System is about mastering the complexity of the changes in data (because of parallel tasks and/or parallel works on one task), and the way you collaborate with others (other tasks and/or other people) is quite different between a CVCS and a DVCS.
TFS (Team Foundation Server) is a project management system which includes a CVCS: Team Foundation Version Control (TFVC), centered around the notion of "work item".
Its centralized aspect enforces a consistency (of other elements than just sources)
See also this VSS to TFS document, which illustrates how it is adapted to a team having access to one referential.
One referential means it is easier to maintain (no synchronization or data refresh to perform), hence the greater number of elements (tasks lists, project plans, issues, and requirements) managed in it.
I would recommend reading Martin Fowler's review of Version Control Tools
In short the key difference between CVCS and DVCS is that the former (of which TFS is an example) have one central repository of code and in the latter case, there are multiple repositories and no one is 'by default' the central one - they are all equal.
The difference is huge.
In distributed systems, each developer works in his own sandbox; he has the freedom to experiment as much as he want, and only push to the "main" repository when his code is ready.
In central systems, everyone works in the same sandbox. This means that if your code is not stable, you can't check it in, because you will break everyone else's code.
If you're working on a feature, it will naturally take a while before it stabilizes, and because you can't afford to commit any unstable code, you would sit on changes until they're stable. This makes development really really slow, specially when you have lots of people working on the project. You just can't add new features easily because you have this stabilization issue where you want the code in the trunk to be stable but you can't!
with distributed systems, because each developer works on his own sandbox, he doesn't need to worry about messing up anyone else's code. And because these systems tend to be really good at merging, you can still have your codebase be up to date with the main repository while still maintaining your changes in your local repository.