In a DVCS, each developer has an entire repository on their workstation, to which they can commit all their changes. Then they can merge their repo with someone else's, or clone it, or whatever (as I understand it, I'm not a DVCS user).

To me that flags a side-effect: being more vulnerable to forgetting to back up. In a traditional centralised system, both you as a developer and the people in charge know that if you commit something, it's held on a central server which can have decent backup solutions in place.

But with a DVCS, it seems you only have to push your work to a server when you feel like sharing it. It's all very well that you have the repo locally so you can work on your feature branch for a month without bothering anyone, but it means (I think) that committing your code to the repo is not enough; you also have to remember to do regular pushes to a backed-up server.

It also means, doesn't it, that a team lead can't see all those nice SVN commit emails to keep a rough idea what's going on in the code-base?

Is any of this a real issue?

A: 

Having a local copy of the repository might encourage poor backup habits, if one were slack. However, your master repository SHOULD be backed up.

The "local copy of the entire repository" has a much more important use than being a backup. It reduces the latency of examining the history of the codebase - say, diffing against the latest version - from being a network round trip to a trip to your local hard drive.

That doesn't sound like that big a deal if your main repository is on your gigabit LAN. If you're a telecommuter, and the repository is a good 600+ ms away over a VPN, it makes a world of difference.
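To make that concrete, here's a rough comparison (the file path is just a placeholder): asking Subversion to diff against the repository's latest revision goes over the network, while the equivalent questions in git are answered from the clone on your own disk.

    # Subversion: comparing against the repository HEAD is a network round trip
    svn diff -r HEAD

    # git: the clone already holds the full history, so these never leave your disk
    git diff HEAD
    git log -p some/file.c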

I've never looked into it, but I'm sure both Mercurial and Git support post-commit hooks, allowing you to set up commit mails going to the team lead. Then each developer could set up her repository accordingly, or have an interim repository that permits half-baked features with the commit mails, or whatever.
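As a rough, untested sketch, a git post-commit hook for this could be as small as the following (the recipient address is a placeholder, and it assumes a working 'mail' command on the developer's box):

    #!/bin/sh
    # .git/hooks/post-commit (must be executable): mail a summary of the commit just made
    git log -1 --stat HEAD | mail -s "commit by $(git config user.name)" team-lead@example.com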

Edit: Regarding John's comment about a long-running experiment being lost because it wasn't ready to commit to the master repo: work in a separate branch and regularly push your changes to the master. You still get all the benefits of working against a local repository (mainly, for me, very low latency), without annoying your colleagues with your half-baked feature... and you can still store your changes off your machine, in a place where your admin can properly back up the repository.
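In git that workflow is only a couple of commands (branch and remote names here are placeholders; Mercurial is analogous):

    git checkout -b my-experiment        # do the long-running work on its own branch
    git commit -a -m "WIP: experiment"   # commit locally as often as you like, at local-disk speed
    git push origin my-experiment        # push regularly so the branch also lives on the backed-up server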

Frank Shearar
If you are on a slow remote connection then it's nice for diffs, but I'd hate to download an entire repository that way!
John
Waiting 30 seconds just so you can see what changes you just made (with CVS; I know SVN lets you do this locally) is just too irritating to bear. But even in Subversion, if you want to know anything beyond your just-made changes (say, when trying to find out which commit introduced a bug), you immediately take a latency hit. I find that delays as short as 15 seconds are enough for me to lose interest in waiting, and I end up forgetting my place because I've gone off to read my mail or SO.
Frank Shearar
"work in a separate branch and regularly push your changes to the master"... yes but my point is this is an extra step, remembering to push as well as commit. I know I hate every single bit of messing about that isn't coding, for instance.
John
Yes, this separation of "commit" and "push" is pretty much the central difference between distributed and centralised VCSes. And it's what lets you work offline. In the context of a distributed team that can happen quite often: developers are out of the office, someone puts a ship's anchor through an undersea cable, ... In the case of an extended offline period you'll just have to back up your local repository yourself. You'd have to do that with a centralised VCS too.
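To make the commit/push distinction concrete, a minimal git sketch (remote and branch names are placeholders):

    git commit -a -m "fix parser"   # recorded in the local repository; needs no network at all
    git commit -a -m "add tests"    # more offline commits accumulate locally
    git push origin master          # once connectivity is back, publish (and effectively back up) them all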
Frank Shearar
+1  A: 

I don't think one follows automatically from the other. A distributed or centralized VCS can be combined with backups (or not). It's all a question of discipline in the team.

The same goes for commit emails: if the team has the discipline to regularly push changes to the right repositories, you can have a working commit mailing list too.

Bad habits can also grow in a team with a centralized VCS. You always have to fight bad habits.

Mnementh
A: 

In most places, I imagine there is probably still a 'central' repository from which builds are made and put into test. If you want your code in the build, it has to be pushed centrally.

This is also a management issue: tell your team to push regularly (at least daily) so that their code is backed up. If it's not being done, then get out the big stick.

I'd also note that if you're relying on looking at the commits to see what your staff are doing, you probably have some larger issues you might look at addressing...

Paddy
"tell your team - push regularly"... yes but my point is this introduces another thing to do. A big stick is all well and good, but the more there is to remember, the more chances there are to screw up. Like the dev who goes on vacation happy he has checked in code but forgets to push to a central repo.
John
"you probably have some larger issues"... well yes, but especially on remote teams it's a way to keep more 'in touch' as a technical lead, than just talking to developers about what they have done. Not a major thing, jut one little niggle.
John
@John - I think you need to be able to trust your dev team, once appropriately trained. The guy who goes on holiday should be sufficiently embarrassed on his return that it doesn't happen again. I think the benefits outweigh the potential costs.
Paddy
+1  A: 

I can understand your concern about devs forgetting backups once their local diff is gone (because they've committed locally) and no longer nags them with copious output. I think the solution lies in better tools - moar tools! You could set up a cron job on each dev's box that pushes every last reachable object in their repository to the central repo, labelling them in the central (backed-up) repo with namespaced branches. I think "git push" can do this, given the correct refspec. The only thing you wouldn't be doing is affecting the state of your public branches.
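As an untested sketch of that cron idea (the remote name "backup", the project path and the developer name are all placeholders), a crontab entry could push every local branch into a per-developer namespace on the central, backed-up repository:

    # hourly: mirror all local branches into refs/backups/alice/* on the central repo
    0 * * * *  cd /home/alice/project && git push --quiet backup 'refs/heads/*:refs/backups/alice/*'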

But do you really need as aggressive a backup process as before, when the repo existed only in one place? With a DVCS, you need a far higher category of catastrophes to lose all your code. You now need an asteroid or a bomb hitting your office (and all your off-site team members), instead of just a hard disk or RAID controller going bad. Note, I'm not advocating sloppiness; I'm advocating equal risk at lower cost.

Bernd Jendrissek
This seems precisely wrong to me... under the old centralized model, the full source tree existed in many places, since the usage model pretty much requires everyone to do regular updates/commits. But in the distributed model, as John says, there might be weeks' worth of work available only on the developer's work PC.
Clyde
Yes, the question is not about having _A_ backed-up central copy. It's specifically about work that is committed locally but not pushed - with SVN/CVS you know that if you commit, it is stored somewhere other than your local PC.
John
Under old centralized version control systems, you had *only* the full source tree (and specifically, NOT its history) in many places. My first paragraph specifically addresses the point you seem to imply I'm ignoring.
Bernd Jendrissek