I use git for personal projects and think it's great. It's fast, flexible, powerful, and works great for remote development.

But now it's mandated at work and, frankly, we're having problems.

Out of the box, git doesn't seem to work well for centralized development in a large (20+ developer) organization with developers of varying abilities and levels of git sophistication - especially compared with other source-control systems like Perforce or Subversion, which are aimed at that kind of environment. (Yes, I know, Linus never intended it for that.)

But - for political reasons - we're stuck with git, even if it sucks for what we're trying to do with it.

Here are some of the things we're seeing:

  • The GUI tools aren't mature
  • Using the command line tools, it's far too easy to screw up a merge and obliterate someone else's changes
  • It doesn't offer per-user repository permissions beyond global read-only or read-write privileges
  • If you have a permission to ANY part of a repository, you can do that same thing to EVERY part of the repository, so you can't do something like make a small-group tracking branch on the central server that other people can't mess with.
  • Workflows other than "anything goes" or "benevolent dictator" are hard to encourage, let alone enforce
  • It's not clear whether it's better to use a single big repository (which lets everybody mess with everything) or lots of per-component repositories (which make for headaches trying to synchronize versions).
  • With multiple repositories, it's also not clear how to replicate all the sources someone else has by pulling from the central repository, or to do something like get everything as of 4:30 yesterday afternoon.

However, I've heard that people are using git successfully in large development organizations.

If you're in that situation - or if you generally have tools, tips and tricks for making it easier and more productive to use git in a large organization where some folks are not command line fans - I'd love to hear what you have to suggest.

BTW, I've asked a version of this question already on LinkedIn, and got no real answers but lots of "gosh, I'd love to know that too!"

UPDATE: Let me clarify...

Where I work, we can't use ANYTHING other than git. It's not an option. We're stuck with it. We can't use mercurial, svn, bitkeeper, Visual Source Safe, ClearCase, PVCS, SCCS, RCS, bazaar, Darcs, monotone, Perforce, Fossil, AccuRev, CVS, or even Apple's good ol' Projector that I used in 1987. So while you're welcome to discuss other options, you ain't gonna get the bounty if you don't discuss git.

Also, I'm looking for practical tips on how to use git in the enterprise. I put a whole laundry list of problems we're having at the top of this question. Again, people are welcome to discuss theory, but if you want to earn the bounty, give me solutions.

+13  A: 

Yes, I know, Linus never intended it for that.

Actually, Linus argues that centralized systems just can't work.

And, what's wrong with the dictator-and-lieutenants workflow?

diagram

Remember, git is a distributed system; don't try to use it like a central one.

(updated)

Most of your problems will go away if you don't try to use git as if it was "svn on steroids" (because it's not).

Instead of using a bare repository as a central server that everyone can push to (and potentially screw up), set up a few integration managers who handle merges, so that only they can push to the bare repository.

Usually these people should be the team leads: each leader integrates his own team's work and pushes it to the blessed repository.

Even better, someone else (i.e. a dictator) pulls from the team leaders and integrates their changes into the blessed repository.

There's nothing wrong with that workflow, but we're an overworked startup and need our tools to substitute for human time and attention; nobody has bandwidth to even do code reviews, let alone be benevolent dictator.

If the integrators don't have time to review code, that's fine, but you still need to have people that integrate the merges from everybody.

Doing git pulls doesn't take all that much time.

git pull A
git pull B
git pull C
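As a rough sketch of that integrator step, here is a small shell helper that pulls one branch from each remote in turn and stops at the first merge that fails. The function name, the remote names and the branch name are placeholders, and `--no-rebase` is passed only so that the pull always merges regardless of local configuration.

```shell
# Hypothetical helper for an integrator: merge one branch from each named
# remote, in order, and stop at the first pull that does not succeed.
# Usage (inside the integrator's clone): integrate_from master A B C
integrate_from() {
    branch=$1
    shift
    for remote in "$@"; do
        echo "Merging $branch from $remote"
        # --no-rebase forces a merge; --no-edit accepts the default message.
        git pull --no-rebase --no-edit "$remote" "$branch" || {
            echo "Stopped at $remote: resolve the problem, then re-run" >&2
            return 1
        }
    done
}
```

Remotes listed after the one that failed are left untouched, so the integrator can fix the conflict and run the helper again with the remaining names.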

git does substitute for human time and attention; that's why it was written in the first place.

  • The GUI tools aren't mature

The gui tools can handle the basic stuff pretty well.

Advanced operations require a coder/nerdy mindset (e.g. I'm comfortable working from the command line). It takes a bit of time to grasp the concepts, but it's not that hard.

  • Using the command line tools, it's far too easy to screw up a merge and obliterate someone else's changes

This won't be a problem unless you have many incompetent developers with full write access to the "central repository".

But, if you set up your workflow so that only a few people (integrators) write to the "blessed" repository, that won't be a problem.

Git doesn't make it easy to screw up merges.

When there are merge conflicts, git will clearly mark the conflicting lines so you know which changes are yours and which are not.

It's also easy to obliterate other people's code with svn or any other (non-distributed) tool. In fact, it's way easier with these other tools because you tend to "sit on changes" for a long time and at some point the merges can get horribly difficult.

And because these tools don't know how to merge, you end up always having to merge things manually. For example, as soon as someone makes a commit to a file you're editing locally, it will be marked as a conflict that needs to be manually resolved; now that is a maintenance nightmare.

With git, most of the time there won't be any merge conflicts because git can actually merge. In the case where a conflict does occur, git will clearly mark the lines for you so you know exactly which changes are yours and which changes are from other people.

If someone obliterates other people's changes while resolving a merge conflict, it won't be by mistake: it will either be because it was necessary for the conflict resolution, or because they don't know what they're doing.

  • It doesn't offer per-user repository permissions beyond global read-only or read-write privileges

  • If you have a permission to ANY part of a repository, you can do that same thing to EVERY part of the repository, so you can't do something like make a small-group tracking branch on the central server that other people can't mess with.

  • Workflows other than "anything goes" or "benevolent dictator" are hard to encourage, let alone enforce

These problems will go away when you stop trying to use git as if it was a centralized system.

  • It's not clear whether it's better to use a single big repository (which lets everybody mess with everything) or lots of per-component repositories (which make for headaches trying to synchronize versions).

Judgment call.

What kind of projects do you have?

For example: does version x.y of project A depend on exactly version w.z of project B such that every time you check x.y of project A you also have to checkout w.z of project B, otherwise it won't build? If so I'd put both project A and project B in the same repository, since they're obviously two parts of a single project.

The best practice here is to use your brain.

  • With multiple repositories, it's also not clear how to replicate all the sources someone else has by pulling from the central repository, or to do something like get everything as of 4:30 yesterday afternoon.

I'm not sure what you mean.
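If the request is read as "which commit was each repository at, at a given moment", one possible sketch is below. The function name, the branch name, the timestamp and the repository directories are all made up for illustration. Note that `--before` filters on commit timestamps, which say when a commit was created, not when it reached the central repository.

```shell
# Print "<repo> <commit>" for the last commit on a branch at or before a
# given time, for each repository directory listed.
# Usage: commits_as_of "2010-03-01 16:30" master core ui tools
commits_as_of() {
    when=$1
    branch=$2
    shift 2
    for repo in "$@"; do
        # rev-list walks history from the branch tip; -n 1 keeps the newest match.
        commit=$(git -C "$repo" rev-list -n 1 --before="$when" "$branch") || return 1
        echo "$repo $commit"
    done
}
```

Two developers could compare the output of this function to see whether they hold the same set of code, and each printed commit can be checked out in its repository.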

hasen j
There's nothing wrong with that workflow, but we're an overworked startup and need our tools to substitute for human time and attention; nobody has bandwidth to even do code reviews, let alone be benevolent dictator. Anybody who has write permissions can - and does - accidentally push crap into the central repository. You can certainly push crap with other source control systems, but I find that, compared with git, other systems I've used make it easier to do merges and avoid the crap, and to back up to before crap someone else has pushed.
Bob Murphy
And as far as the politics, if you're a peaceable person, I doubt you'd like it. As I said, we're an overworked startup with a lot of very smart, opinionated, young, energetic, inexperienced, immature developers. We surf a giant wave of barely controlled chaos, and among the things that are flying under management's radar is the CM system. As a 51-year-old, I survive by being a badass who can out-fight (martial arts), out-drink (single malt Scotch) and out-code any of the twenty-somethings. And I feel a responsibility to protect those among my colleagues who are suffering but not aggressive...
Bob Murphy
... Anyway, we've got git because some of the early hires put it in place. Their attitude is, "I love git, and if it doesn't work for you, too bad, suck it up." And they're aggressive enough about it, other people have given up. So I'm trying to figure out if we can make it work for the people who are having problems, or if I'm going to make an issue of it. But I hate having to do this sort of "who's willing to go to the mat hardest" nonsense, and would rather just make rational decisions based on organizational needs, then go home, put on my slippers, and hang with my wife.
Bob Murphy
In that case, the integrators don't have to review the code itself, just make sure the repository is kept in a consistent state. And I'm afraid I *am* a twenty-something, opinionated, energetic, immature developer :)
hasen j
Keep on keepin' on, brother! I hope your path to humility is less painful than mine was. :-)
Bob Murphy
Well, I only started using linux, git, vim (etc) after I was having so much pain trying to manage my little project on windows. It was almost impossible, I have no idea how I survived before git. No other way to develop software makes sense to me anymore.
hasen j
Bob... you sound like a very humble person. I can tell you what, I wouldn't want to work with someone who is outwardly telling people that they are a badass, can kick anyone's ass, are smarter than everyone, and drink more than anyone. I think you sound like a fool. I could be wrong, but that's a pretty crappy attitude to have towards younger developers like myself.
Joseph Silvashy
Joseph, I would be the first to agree with you that I act like a strutting buffoon, and regret the necessity. Unfortunately, I joined this startup when it was pretty disorganized, and saw early on that "nice" people got bulldozed - hence the badass. But we've added some new managers, and things are settling down. My real nature is kind of a quiet academic who, among other things, studies martial arts and enjoys a single malt once in a while. I'm finding it quite pleasant to turn down the volume on those parts of my personality; they were exaggerated to ludicrous levels.
Bob Murphy
Oh - I don't actually go around the office swigging from a bottle of hooch and offering fistfights to all comers. That was a joking metaphorical allusion to the legend of Mike Fink - check him out on Wikipedia. Although I have been known to show up at the office somewhat worse for the wear after going to the dojo and having my own ass kicked by Mrs. Kelly, our local children's librarian who has a black belt.
Bob Murphy
I assume that your local library does not have significant problems with late returns? Anyway, I wish you the best of luck, having been on the "bulldozed" end of that kind of environment a few times.
Mike DeSimone
+19  A: 

I'm the SCM engineer for a reasonably large development organization, and we converted to git from svn over the last year or so. We use it in a centralized fashion.

We use gitosis to host the repositories. We broke our monolithic svn repositories up into many smaller git repositories as git's branching unit is basically the repository. (There are ways around that, but they're awkward.) If you want per-branch kinds of access controls, gitolite might be a better approach. There's also an inside-the-firewall version of GitHub if you care to spend the money. For our purposes, gitosis is fine because we have pretty open permissions on our repositories. (We have groups of people who have write access to groups of repositories, and everyone has read access to all repositories.) We use gitweb for a web interface.

As for some of your specific concerns:

  • merges: You can use a visual merge tool of your choice; there are instructions in various places on how to set it up. The fact that you can do the merge and check its validity totally on your local repo is, in my opinion, a major plus for git; you can verify the merge before you push anything.
  • GUIs: We have a few people using TortoiseGit but I don't really recommend it; it seems to interact in odd ways with the command line. I have to agree that this is an area that needs improvement. (That said, I am not a fan of GUIs for version control in general.)
  • small-group tracking branches: If you use something that provides finer-grained ACLs like gitolite, it's easy enough to do this, but you can also create a shared branch by connecting various developers' local repositories — a git repo can have multiple remotes.
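To illustrate the "verify the merge before you push" point, here is a minimal sketch of a trial merge that never records a commit. The function name is hypothetical, and it assumes a clean working tree on the branch you want to merge into.

```shell
# Attempt a merge without committing, report the outcome, then undo it so
# the working tree is left as it was.
# Usage: trial_merge some-feature-branch
trial_merge() {
    if git merge --no-commit --no-ff "$1" >/dev/null 2>&1; then
        # Nothing to abort if the branch was already merged, hence 2>/dev/null.
        git merge --abort 2>/dev/null
        echo "clean: $1"
    else
        git merge --abort 2>/dev/null
        echo "does not merge cleanly: $1"
        return 1
    fi
}
```

Because everything happens in the local repository, nothing reaches the shared server until you do the real merge and push it.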

We switched to git because we have lots of remote developers, and because we had many issues with Subversion. We're still experimenting with workflows, but at the moment we basically use it the same way as we used to use Subversion. Another thing we liked about it was that it opened up other possible workflows, like the use of staging repositories for code review and sharing of code among small groups. It's also encouraged a lot of people to start tracking their personal scripts and so forth because it's so easy to create a repository.

ebneter
Thank you! That's useful information. Do you have dependencies between/among code in different repositories? If so, how do you manage getting consistent versions across repos? Is there an easier way for two developers to figure out if they've got the same set of code than noting commit-ish's for each repo? BTW, I'm glad to hear about people tracking personal scripts and so on - I do that myself, along with "cheatsheets" of notes, tips and tricks.
Bob Murphy
Most of our code is java, and we use maven as our build system, so we use maven to handle inter-project dependencies and versioning. We also make extensive use of tags -- every release build has a tag.
ebneter
A: 

Gitorious is more suited to collaborative development than gitosis or gitolite, and is still open source. It's a Ruby on Rails application which handles management of repositories and merging. It should solve many of your problems.

Milliams
+15  A: 

Against the common opinion, I think that using a DVCS is an ideal choice in an enterprise setting because it enables very flexible workflows. I will talk about DVCS vs. CVCS first, then about best practices, and then about git in particular.

DVCS vs. CVCS in an enterprise context:

I won't talk about the general pros and cons here, but rather focus on your context. The common conception is that using a DVCS requires a more disciplined team than using a centralized system. This is because a centralized system provides you with an easy way to enforce your workflow, whereas a decentralized system requires more communication and discipline to stick to the established conventions. While this may seem like it induces overhead, I see benefit in the increased communication necessary to make it a good process. Your team will need to communicate about code, about changes and about project status in general.

Another dimension in the context of discipline is encouraging branching and experiments. Here's a quote from Martin Fowler's recent bliki entry on Version Control Tools; he has found a very concise description of this phenomenon.

DVCS encourages quick branching for experimentation. You can do branches in Subversion, but the fact that they are visible to all discourages people from opening up a branch for experimental work. Similarly a DVCS encourages check-pointing of work: committing incomplete changes, that may not even compile or pass tests, to your local repository. Again you could do this on a developer branch in Subversion, but the fact that such branches are in the shared space makes people less likely to do so.

DVCS enable flexible workflows because they provide changeset tracking via globally unique identifiers in a directed acyclic graph (DAG) instead of simple textual diffs. This allows them to transparently track the origin and history of a changeset, which can be quite important.

Workflows:

Larry Osterman (a Microsoft dev working on the Windows team) has a great blog post about the workflow they employ at the Windows team. Most notably they have:

  • A clean, high quality code only trunk (master repo)
  • All development happens on feature branches
  • Feature teams have team repos
  • They regularly merge the latest trunk changes into their feature branch (Forward Integrate)
  • Complete features must pass several quality gates, e.g. review, test coverage, QA (repos on their own)
  • If a feature is completed and has acceptable quality it is merged into the trunk (Reverse Integrate)

As you can see, by having each of these repositories live on its own, you can decouple different teams advancing at different paces. The possibility of implementing a flexible quality gate system also distinguishes a DVCS from a CVCS. You can solve your permission issues at this level too: only a handful of people should be allowed access to the master repo. For each level of the hierarchy, have a separate repo with the corresponding access policies. Indeed, this approach can be very flexible at the team level. You should leave it up to each team to decide whether they want to share their team repo among themselves or whether they want a more hierarchical approach where only the team lead may commit to the team repo.
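In git terms, the Forward Integrate and Reverse Integrate steps from the list above could be sketched roughly like this. The function names and branch names are mine, not anything from the Windows team's tooling, and the quality gates are assumed to have passed before the reverse step is run.

```shell
# Forward Integrate: bring the latest trunk changes into a feature branch.
# Usage: forward_integrate feature-x master
forward_integrate() {
    feature=$1
    trunk=$2
    git checkout "$feature" && git merge --no-edit "$trunk"
}

# Reverse Integrate: merge a finished feature into the trunk.
# --no-ff keeps a merge commit so the feature stays visible in history.
# Usage: reverse_integrate feature-x master
reverse_integrate() {
    feature=$1
    trunk=$2
    git checkout "$trunk" && git merge --no-ff --no-edit "$feature"
}
```

Running the forward step often keeps the eventual reverse step small, since most conflicts get resolved on the feature branch first.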

Hierarchical Repositories

(The picture is stolen from and served by Joel Spolsky's hginit.com.)

One thing remains to be said at this point: even though a DVCS provides great merging capabilities, this is never a replacement for using Continuous Integration. Even there you have a great deal of flexibility: CI for the trunk repo, CI for team repos, QA repos, etc.

Git in an enterprise context:

Git is maybe not the ideal solution for an enterprise context as you have already pointed out. Repeating some of your concerns, I think most notably they are:

  • Still somewhat immature support on Windows (please correct me if that changed recently)
  • Lack of mature GUI tools, no first class citizen vdiff/merge tool integration
  • Inconsistent interface with a very low level of abstractions on top of its inner workings
  • A very steep learning curve for svn users
  • Git is very powerful and makes it easy to modify history, very dangerous if you don't know what you are doing (and you will sometimes even if you thought you knew)
  • No commercial support options available

I don't want to start a git vs. hg flamewar here; you have already taken the right step by switching to a DVCS. Mercurial addresses some of the points above and I think it is therefore better suited to an enterprise context:

  • All platforms that run python are supported
  • Great GUI tools on all major platforms (win/linux/OS X), first class merge/vdiff tool integration
  • Very consistent interface, easy transition for svn users
  • Can do most of the things git can do too, but provides a cleaner abstraction. Dangerous operations are always explicit. Advanced features are provided via extensions that must explicitly be enabled.
  • Commercial support is available from selenic.

In short, when using DVCS in an enterprise I think it's important to choose a tool that introduces the least friction. For the transition to be successful it's especially important to consider the varying skill between developers (in regards to VCS).


Reducing friction:

Ok, since you appear to be really stuck with the situation, there are two options left IMHO. There is no tool to make git less complicated, git is complicated. Either you confront this or work around git:

  1. Get a git introductory course for the whole team. This should include the basics only and some exercises (important!).
  2. Convert the master repo to svn and let the "young-stars" git-svn. This gives most of the developers an easy to use interface and may compensate for the lacking discipline in your team, while the young-stars can continue to use git for their own repos.

To be honest, I think you really have a people problem rather than a tool problem. What can be done to improve upon this situation?

  • You should make it clear that you think your current process will end up with a maintainable codebase.
  • Invest some time into Continuous Integration. As I outlined above, regardless of which kind of VCS you use, there's never a replacement for CI. You stated that there are people who push crap into the master repo: have them fix their crap while a red alert goes off and blames them for breaking the build (or not meeting a quality metric or whatever).
Johannes Rudolph
Like the "benevolent dictator," this workflow appears to require human intervention to make it work, and suffers from the same flaw for our situation: we don't have enough bandwidth to do our regular jobs, let alone futz with source control. Also, I was explicit: WE ARE STUCK WITH GIT. Unless I want to start a fistfight. :-)
Bob Murphy
Someone wrote in an article about that Microsoft workflow that it can take months before features from one branch get reverse integrated into everyone's working copies. This merging is very painful and error-prone.
Glorphindale
@Glorphindale: I read about that in an article too, and no, their merging is not painful. They use a DVCS too, and since they work within clearly separated boundaries, merging is not as painful as you may think.
Johannes Rudolph
+1  A: 

I'll add in a "have you considered" post too.

One of the great things about Bazaar is its flexibility. This is where it beats all the other distributed systems. You can operate Bazaar in centralized mode, distributed mode, or get this: both (meaning developers can choose which model they're comfortable with or which works best for their workgroup). You can also disconnect a centralized repository while you're on the road and reconnect it when you get back.

On top of that, excellent documentation and something which will make your enterprise happy: commercial support available.

Wade Williams
As I mentioned, we're stuck with git.
Bob Murphy
+1  A: 
  • Install a decent web interface, like Github FI
  • Stick to a relatively centralized model (initially) to keep people comfortable.
  • Run a Continuous Integration build for every shared branch.
  • Share a good set of global git config options.
  • Integrate git into your shell, with bash completion, and a prompt with the current branch.
  • Try IntelliJ's Git Integration as a merge tool.
  • Make sure you .gitignore as appropriate.
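For the shell integration item, a minimal sketch of a prompt helper is below. The function name is hypothetical; distributions often ship a fuller prompt script with git's bash completion, so treat this only as an illustration of the idea.

```shell
# Print " (branch)" when the current directory is on a branch of a git
# repository, and print nothing otherwise (not a repo, or detached HEAD).
prompt_branch() {
    branch=$(git symbolic-ref --short HEAD 2>/dev/null) || return 0
    printf ' (%s)' "$branch"
}

# Possible use in ~/.bashrc (bash-specific prompt escapes):
#   PS1='\w$(prompt_branch)\$ '
```

Seeing the branch name in every prompt makes it much harder to commit to the wrong branch by accident.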
retronym
A: 

Regarding points 3 & 4 (per-user, per-section, per-branch permissions), have a look at gitolite (covered in the Pro Git book: http://progit.org/book/ch4-8.html).
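Underneath, per-branch restrictions come down to git's server-side update hook, which git runs once per pushed ref with the ref name, old commit and new commit as arguments, and which rejects the ref when it exits non-zero. The sketch below shows only the kind of decision such a hook could make. The user names, the branch layout and the PUSHING_USER placeholder are invented for illustration, and how the pushing user is identified depends on your transport. Gitolite packages this up with a proper configuration file, so this is not a substitute for it.

```shell
# Succeeds when the given user may update the given ref, fails otherwise.
# All names below are hypothetical.
may_update() {
    user=$1
    ref=$2
    case "$ref" in
        refs/heads/master)
            # Only the integrators may touch master.
            case "$user" in
                alice|bob) return 0 ;;
            esac
            return 1 ;;
        refs/heads/team-x/*)
            # Small-group branches: team members only.
            case "$user" in
                carol|dave) return 0 ;;
            esac
            return 1 ;;
        *)
            # Everything else is open to any authenticated user.
            return 0 ;;
    esac
}

# Inside hooks/update the call could look roughly like:
#   may_update "$PUSHING_USER" "$1" || { echo "denied: $1" >&2; exit 1; }
```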

Politics or not, Git is as good a choice of a DVCS as any. Like any powerful tool, it is worth spending a little bit of time up front in understanding how the tool is designed to work, and, to this end, I highly recommend the Pro Git book. A couple of hours spent with it will save lots of frustration in the long run.

Jeet
+1  A: 

We recently switched from svn to git. Because git-daemon doesn't work with msysgit we opted for a central repository approach on a Linux server with gitosis.

To eliminate the possibility of screwing up master, we simply deleted it. Instead we prepare all releases by merging the branches that are selected for testing and tagging the merge. If it passes tests, the commit is tagged with a version and put in production.

To handle this we have a rotating role of release manager. The release manager is responsible for reviewing each branch before it is ready for test. Then, when the product owner decides it is time to bundle the approved branches together for a new test release, the release manager performs the merge.

We also have a rotating role of 2nd-level help desk, and at least for us the workload is such that it is possible to have both roles at the same time.

As a benefit of not having a master, it is not possible to add any code to the project without going through the release manager, so we discovered directly how much code was being silently added to the project before.

The review process starts with the branch owner submitting the diff to reviewboard and putting up a green post-it on the whiteboard with the branch name (we have a Kanban-based workflow) under "for review", or, if it's part of a completed user story, moving the entire story card to "for review" and putting the post-it on that. The release manager is the one who moves cards and post-its to "ready for test", and then the product owner can select which ones to include in the next test release.

When doing the merge the release manager also makes sure that the merge commit has a sensible commit message which can be used in the changelog for the product owner.

When a release has been put in production, the tag is used as the new base for branches and all existing branches are merged with it. This way all branches have a common parent, which makes it easier to handle merges.
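A rough sketch of what the release manager's merge step could look like on the command line follows. The function name, tag names and branch names are invented, and the commit messages are placeholders for the sensible messages described above. It starts from the last production tag on a detached HEAD, since there is no master to check out.

```shell
# Build a test release: start from the last production tag, merge each
# approved branch with its own merge commit, then tag the result.
# Usage: prepare_test_release test-1 v1.0 feature-a feature-b
prepare_test_release() {
    tag=$1
    base=$2
    shift 2
    git checkout --detach "$base" || return 1
    for branch in "$@"; do
        # Stop at the first branch that does not merge cleanly.
        git merge --no-ff -m "Merge $branch for $tag" "$branch" || return 1
    done
    git tag -a "$tag" -m "Test release $tag"
}
```

If the tagged commit passes testing, it can be given a version tag as well and put in production.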

John Nilsson