Source Control - Distributed Systems vs. Non Distributed - What's the difference?

+9 A:

Simply speaking, a centralized VCS (including TFS) has a single central store, and each user gets from and commits to this one location.

In a distributed VCS, each user has the full repository and can make changes that are then synchronized to other repositories; a server is usually not really necessary.
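
To make the contrast concrete, here is a minimal toy sketch in Python. The classes `CentralServer` and `Repo` and their methods are made up for illustration and do not correspond to the API of TFS, Git, Mercurial, or any other real tool; the sketch only models "one shared history" versus "a full history in every repository".

```python
# Toy model only: these classes are hypothetical, not the API of any real VCS.

class CentralServer:
    """Centralized model: the only place where history lives."""

    def __init__(self):
        self.history = []

    def commit(self, author, message):
        # Every commit goes straight to the single shared store.
        self.history.append((author, message))


class Repo:
    """Distributed model: every repository carries its own full history."""

    def __init__(self, history=None):
        self.history = list(history or [])

    def clone(self):
        # A clone copies the whole history, not just the latest files.
        return Repo(self.history)

    def commit(self, author, message):
        # Commits are local; no server is involved.
        self.history.append((author, message))

    def pull(self, other):
        # Synchronize by copying the commits we do not have yet.
        for entry in other.history:
            if entry not in self.history:
                self.history.append(entry)


# Centralized: both users write to the same single store.
server = CentralServer()
server.commit("alice", "add parser")
server.commit("bob", "fix typo")

# Distributed: each user commits locally, then they synchronize directly.
alice = Repo()
alice.commit("alice", "add parser")
bob = alice.clone()
bob.commit("bob", "fix typo")
alice.pull(bob)

print(len(server.history))
print(alice.history == bob.history)
```

In the distributed half, nothing played the role of a server: `alice` and `bob` ended up with the same history by exchanging commits directly.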

Lucero 2010-03-18 20:31:16

though you can still have a central repo as well if you want with DVCS

jk 2010-03-19 11:05:51

@jk, true, that's why my wording was "necessary" - it still makes sense to have a central repo for several reasons, such as automated builds, backups etc.

Lucero 2010-03-19 13:16:07

+3 A:

A centralized VCS (CVCS) relies on a central server that every user interacts with. A distributed VCS (DVCS) doesn't need a centralized server.

DVCS checkouts are complete and self-contained, including repository history. This is not the case with CVCS.

With a CVCS, most activities require interacting with the server. Not so with DVCS, since they are "complete" checkouts, repo history and all.

You need write access to commit to a CVCS; users of DVCS "pull" changes from each other. This leads to more social coding facilitated by the likes of GitHub and BitBucket.

Those are a few relevant items, no doubt there are others.

Grant Palin 2010-03-18 20:41:47

+3 A:

Check out http://hginit.com. Joel wrote a nice tutorial for Mercurial, which is a DVCS. I hadn't done any reading about DVCS before (I've always used SVN) and I found it easy to understand.

Cory Grimster 2010-03-18 20:46:08

http://www.joelonsoftware.com/items/2010/03/17.html is also informative if you are just starting out with DVCS...

Justin Ethier 2010-03-18 21:19:16

+13 A:

The difference is in the publication process:

A CVCS (centralized) means: to see the work of your colleague, you must wait for them to publish (commit) to the central repository. Then you can update your workspace.
- You are an active producer: if you don't publish anything, nobody sees anything.
- You are a passive consumer: you discover new updates when you refresh your workspace, and have to deal with those changes whether you want to or not.

A DVCS means: there is no "one central repository"; every workspace is a repository, and to see the work of your colleague, you can refer to his/her repo and simply pull its history into your local repo.
- You are a passive producer: anyone can "plug into" your repo and pull the local commits you made into their own local repo.
- You are an active consumer: any update you pull from another repo is not immediately integrated into your active branch unless you explicitly make it so (through merge or rebase).
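
The "active consumer" point can be sketched with a small toy model in Python. The `Workspace` class and its `pull`/`merge` methods are hypothetical names chosen for this illustration, not the commands or API of Git or Mercurial; the only thing the sketch shows is that pulled history sits beside your active branch until you choose to integrate it.

```python
# Toy model only: hypothetical classes, not the API of Git or Mercurial.

class Workspace:
    def __init__(self, name):
        self.name = name
        self.active = []      # commits on the branch you are working on
        self.incoming = {}    # history pulled from others, keyed by their name

    def commit(self, message):
        # Passive producer: a local commit publishes nothing by itself,
        # but anyone who can reach this workspace may pull it.
        self.active.append(message)

    def pull(self, other):
        # Active consumer, step 1: copy the other history next to ours.
        # The active branch is left untouched.
        self.incoming[other.name] = list(other.active)

    def merge(self, other_name):
        # Active consumer, step 2: integrate only when we decide to.
        for message in self.incoming.get(other_name, []):
            if message not in self.active:
                self.active.append(message)


me = Workspace("me")
colleague = Workspace("colleague")
colleague.commit("experimental refactoring")

me.commit("my feature")
me.pull(colleague)
before_merge = list(me.active)   # still only my own commit

me.merge("colleague")
after_merge = list(me.active)    # now includes the colleague's commit

print(before_merge)
print(after_merge)
```

In the centralized case there is no equivalent of the gap between `pull` and `merge`: refreshing the workspace is the integration.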

A version control system is about mastering the complexity of changes in data (because of parallel tasks and/or parallel work on one task), and the way you collaborate with others (other tasks and/or other people) is quite different between a CVCS and a DVCS.

TFS (Team Foundation Server) is a project management system which includes a CVCS: Team Foundation Version Control (TFVC), centered around the notion of "work item".
Its centralized aspect enforces consistency (of elements other than just sources).
See also this VSS to TFS document, which illustrates how it is adapted to a team having access to one referential.
One referential means it is easier to maintain (no synchronization or data refresh to perform), hence the greater number of elements (task lists, project plans, issues, and requirements) managed in it.

VonC 2010-03-18 21:00:04

+1 for a more complete answer, including examples which I think the OP is looking for.

Eddie Parker 2010-03-18 21:35:30

Great answer, +1.

Richard Berg 2010-03-19 03:25:46

Good answer, but some CVCS support shelving (TFS, Vault, maybe others), which can be seen as a passive producer/active consumer mode as well, because shelving and unshelving changes made by another user is quite a similar workflow to getting someone else's commits in a DVCS.

Lucero 2010-03-19 13:21:24

@Lucero: true, even though I find that less intuitive than accessing a well-defined history of commits.

VonC 2010-03-19 14:33:01

A:

I would recommend reading Martin Fowler's review of Version Control Tools

In short, the key difference between CVCS and DVCS is that the former (of which TFS is an example) has one central repository of code, while in the latter there are multiple repositories and none is 'by default' the central one: they are all equal.

mfloryan 2010-03-19 10:58:07

A:

The difference is huge.

In distributed systems, each developer works in his own sandbox; he has the freedom to experiment as much as he wants, and only pushes to the "main" repository when his code is ready.

In central systems, everyone works in the same sandbox. This means that if your code is not stable, you can't check it in, because you will break everyone else's code.

If you're working on a feature, it will naturally take a while before it stabilizes, and because you can't afford to commit any unstable code, you would sit on changes until they're stable. This makes development really slow, especially when you have lots of people working on the project. You just can't add new features easily because you have this stabilization issue where you want the code in the trunk to be stable but you can't!

With distributed systems, because each developer works in his own sandbox, he doesn't need to worry about messing up anyone else's code. And because these systems tend to be really good at merging, you can still have your codebase be up to date with the main repository while still maintaining your changes in your local repository.
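
As a rough illustration of that workflow, here is a toy sketch in Python. Plain lists stand in for repositories and strings stand in for commits; all names are invented for the example and this is not how any real DVCS stores or merges history.

```python
# Toy model only: plain lists stand in for repositories, strings for commits.

main = ["v1 stable"]            # the shared "main" repository

sandbox = list(main)            # a developer's private clone
sandbox.append("wip: half-finished feature")    # unstable, committed locally
sandbox.append("feature finished and tested")

main_during_work = list(main)   # main has not seen the unstable commit

# Meanwhile a teammate pushes to main; merge that into the sandbox
# to stay up to date without giving up the local changes.
main.append("teammate: bug fix")
for commit in main:
    if commit not in sandbox:
        sandbox.append(commit)

# Push to main only once the work is ready.
for commit in sandbox:
    if commit not in main:
        main.append(commit)

print(main_during_work)
print(sorted(main) == sorted(sandbox))
```

While the feature was in progress, `main` contained nothing unstable; the local commits only reached it at the final push.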

hasen j 2010-04-12 06:53:50

@hasen j: "In central systems, everyone works in the same sandbox. This means that if your code is not stable, you can't check it in, because you will break everyone else's code." This is only true if you don't use branches. The point is all the new DVCS handle branching correctly, while SVN/CVS didn't. You can also use centralized systems and have your own private sandboxes, of course!

pablo 2010-04-30 09:35:00

@pablo: even if you use branches, they are *central* branches, meaning you will share them with the rest of your team. If the central system can merge well, you can create a branch for each individual developer, but that would just be a bastardized DVCS.

hasen j 2010-04-30 19:38:58

@hasen j: in fact what I recommend with Plastic SCM is to go even further: not a branch per developer but a branch per task! Yes, every issue you fix from your preferred issue tracking system will be a branch (topic branches, you know). Yes, you share them, but that's not a problem at all; the only problem is having big trouble merging them back, just that. And having (and sharing) all these branches is not bad at all: it keeps the real evolution of your code, it's very helpful for code review, it's very good for finding bugs... You know ;-)

pablo 2010-04-30 22:41:09

@pablo, what do you call test branches? test5000? :P Seriously, if a system can support many branches with merging, on a central repo, it should also support cloning/fetching/pulling, without inventing buzzwords for it. It's like an IDE that doesn't have "undo" in its text editor.

hasen j 2010-05-01 08:06:58

@hasen j: well, using a naming convention for branches is normally a good idea when you have to manage a lot of them, isn't it? I do agree that being distributed is OK, it's great in fact, nothing against it. But that's a different value than branching/merging. This gives you the ability to work disconnected, which is HUGE, but has nothing to do with having your own private sandboxes. That was my point. BTW all the new SCM systems are able to handle branching correctly (finally!) and that's the big point. OK, they're ALSO distributed, but that's another story.

pablo 2010-05-01 09:02:46
