There are many SCM systems out there. Some open, some closed, some free, some quite expensive. Which one (please choose only one) would you use for a 3000+ developer organization with several sites (some behind a very slow link)? Explain why you chose the one you chose. (Give some reasons, not just "because".)
Any DVCS (BitKeeper, git, Bazaar, Mercurial, etc.) because being distributed will cut down the load on the central 'canonical' SCM server. The caveat is that they're fairly new technology and not many people will be familiar with their use.
If you want to stick to the older, centralized model, I'd recommend Perforce if you can afford it, or Subversion if you don't want to pay for Perforce. I'd recommend Subversion over CVS because it's got enough features to make it worthwhile but is similar enough that devs who already know CVS will still be comfortable.
Git was written for the Linux kernel, which might be the closest example to such a situation you can find public information on.
I want to say git, but I don't think a company of that size is going to be all Linux (Windows support for git still sucks). So go with the SCM that Linux used before git, i.e. BitKeeper.
Having worked at a few companies with 1000+ workers, I've found that by and large, they all use Perforce.
I've asked, "Why don't you use something else? SVN? Git? Mercurial? Darcs?" and the answer has been the same at all of the companies: when they made the decision to go with Perforce, it was either that, or SourceSafe, or CVS. Honestly, given those three choices, I'd go with Perforce, too.
It's hard for 'more difficult' version control systems to gain traction with so many people, and many of the advantages of a DVCS matter less when you have the bulk of your software teams working within 18 feet of one another.
Perforce has a lot of API hooks for developers to use, and for a centralized system, it's got a lot of chutzpah.
I'm not saying that it's the best solution, but I've at least seen some very large companies where Perforce works, and well enough that it's almost ubiquitous.
I'd use any SCM that does not have pessimistic locking (http://davidtanzer.net/?q=node/118) mechanisms, especially because you want people to be able to "edit" the same file at the same time in any sizable project.
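To make the contrast with pessimistic locking concrete, here is a minimal Python sketch of the optimistic model: nobody locks a file, and a commit is rejected only if the file changed after the committer's base revision. The `Repo` class, its methods, and `StaleRevisionError` are hypothetical names invented for illustration, not any real tool's API.

```python
class StaleRevisionError(Exception):
    """Raised when a commit is based on an out-of-date revision."""


class Repo:
    """Toy single-file repository using optimistic concurrency (hypothetical)."""

    def __init__(self, content=""):
        self.revision = 0
        self.content = content

    def checkout(self):
        # No lock is taken: any number of developers can check out at once.
        return self.revision, self.content

    def commit(self, base_revision, new_content):
        # Reject the commit only if someone else committed in the meantime.
        if base_revision != self.revision:
            raise StaleRevisionError(
                f"based on r{base_revision}, but repository is at r{self.revision}"
            )
        self.content = new_content
        self.revision += 1
        return self.revision


repo = Repo("hello")
alice_rev, alice_text = repo.checkout()
bob_rev, bob_text = repo.checkout()      # Bob is not blocked by Alice

repo.commit(alice_rev, alice_text + " from alice")   # succeeds, repo is now r1

try:
    repo.commit(bob_rev, bob_text + " from bob")     # stale: based on r0
    bob_conflict = False
except StaleRevisionError:
    bob_conflict = True   # Bob must update, merge, and retry
```

In this toy version the second committer is simply told to update and retry; nobody ever waits for a lock to be released, which is the property that matters when many people touch the same files.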
Personally I'd choose SVN with some solution for distribution, although since SVN only submits what you change (which should be very little for each commit anyway), the network overhead is very small. The server load can also be handled with more hardware up to a point. I have not yet found the ceiling for hardware scaling when using SVN.
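As a rough back-of-the-envelope check on the claim that sending only the changes keeps network overhead small, the sketch below compares idealised transfer times over a slow link. The link speed and both sizes are made-up assumptions, and protocol overhead and compression are ignored.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealised transfer time: payload size divided by link bandwidth."""
    return size_bytes * 8 / link_bits_per_second


LINK_BPS = 256_000               # assumed 256 kbit/s link to a remote site
FULL_TREE_BYTES = 500_000_000    # assumed 500 MB working tree
COMMIT_DELTA_BYTES = 20_000      # assumed 20 KB of changes in one commit

full = transfer_seconds(FULL_TREE_BYTES, LINK_BPS)
delta = transfer_seconds(COMMIT_DELTA_BYTES, LINK_BPS)

print(f"full tree: {full / 3600:.1f} hours")
print(f"one commit delta: {delta:.3f} seconds")
```

With these assumed numbers the full tree takes a little over four hours, while a single commit's changes take well under a second, so the slow site mostly pays for the initial checkout.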
Other choices may include "git" which the Linux Kernel people use, but I don't really have any experience with that.
If you have such a large organization then do not mandate a single specific SCM.
I am sure they are not all working on the same code and it would be worthwhile to let the teams themselves choose what they are most comfortable with.
(You may need to provide some training so they understand how to choose between Git, SVN, or some internal legacy system.)
Perforce
What I like about Perforce, say compared to CVS, is that the branch management is much more sophisticated (but still reasonably easy) and you don't need to bug a central bureaucracy to create branches/labels and the like. In other words, it allows an individual team (or developer) to manage their source components how they like, before submission to a mainline centrally administered by someone else.
Oh, I'd also say it has one of the best GUIs out there whilst still having a first-class command-line interface. I normally hate GUIs but theirs works.
If they're all working on the same product, probably Perforce.
If there are lots of smaller projects (2 to 50), I'd run several Subversion (SVN) boxes.
Perforce is a decent system, and scales well.
I'm working at an organization of about 5000 employees, and Perforce is a fast and efficient system. It scales well, has good branch support, and has atomic commits. Each change has a change number that can be used for "software archaeology", and there is a host of other great features.
Additionally, it has good support for Windows, Mac and Unix, including a good command line and good scripting support.
I've used CVS before, and it doesn't scale well to groups greater than about 25-50 engineers (mostly because of its lack of atomic operations and its performance).
I'd recommend Subversion because it is free and has great integration with most IDEs. For those in the organization who travel or prefer a more distributed model, they can use Git for all their local work but still commit to the SVN repository when they want to share code. This is a great balance between a centralized and distributed system. See Git-SVN Crash Course to get started.
The final and perhaps most important reason to use SVN is TortoiseSVN, a Windows client for SVN that makes accessing repositories a right-click away for anyone. At my company, this has proven a great way to give repository access to non-developers.
A year later, I would now recommend Mercurial with TortoiseHg. It is the best combination of Windows support and distributed version control functionality.
If you mean 3000+ developers working on the same codebase, I've got no clue. If you mean working on several hundred projects at different locations and you need to have a standard, I'd go for something popular with massive online user support, i.e. not something obscure that gives you 10 hits on Google.
Personally I would settle for SVN. I'm in an IT department with several hundred devs, and the preferred source control app there is SVN.
:)
//W
I would use BitKeeper. I've used BitKeeper, ClearCase, AccuRev, Perforce, Subversion, CVS, SCCS and RCS, and out of all of those BitKeeper was far and away the best. I've toyed with git and was impressed by its speed, but I thought its UI was a little cumbersome (though that opinion was formed after only using it for a couple of half-days).
BitKeeper has rather clunky-looking GUIs but they are exceptionally functional. The BitKeeper command line tools are arguably best-of-breed and its merge capabilities were absolutely fantastic.
What I most liked about BitKeeper (and this is probably true of all distributed systems) is that branches were dirt cheap. Creating branches was a way of life rather than something to dread.
Subversion is easy to scale and split up. Perforce costs thousands of dollars for only a handful of employees, which is way too expensive, and besides, it offers nothing that Subversion does not offer.
Subversion is really easy, better than cvs.
I would have recommended git if only its Windows support was better.
In our company we use Alienbrain but we are migrating to Perforce. Perforce has everything you want: it handles code and data, it integrates with tools for continuous integration, and it handles a local (per-developer) repository so you can check in to your local repository before committing to the server.
I vote for Perforce
First, big NO on CVS. Using CVS in 2008 is like driving a 92 Isuzu Trooper. The only reason they are on the road, and that people spend money to maintain them, is for purely sentimental reasons. CVS is old hat, technology-wise, and you will regret it.
I'd generally steer away from open source tools in that size of a company, too. Subversion is an excellent little tool and is pretty solid, but on the off chance that you go down or run into a bug you were unaware of, the onus is on you to fix it while 3,000 people sit idle. Perforce is cheap when put in that perspective and I highly recommend it.
It surprises me how many people who purport to be SCM professionals go with 'free'. On the surface it looks great to management, but when you're under the gun it helps to have a high-quality support team on your side. When you get woken up at 3am on a Sunday because your team in Singapore can't do any work, you won't be thinking 'free' was a good idea.
Source control tools are mission critical: you're talking about company assets and intellectual property. Do not skimp on source control tools, ever!
If you have 1000+ developers working on a single piece of software, you have the resources to invest in a lot of tooling of your own. Whatever you choose, you'll probably do plenty of work to adapt it to your situation.
Microsoft's Team Foundation Server is used within Microsoft on some very large teams, and the TFS team is working on making it scale up well. Also, the integration of source control and bug tracking is attractive. It's not cheap, and administration is enough of a hassle that it doesn't scale down well to small teams, but for your situation, you can afford those costs. You probably also want to be able to call on a large support organization like Microsoft has when you get into trouble (but if you go with free software, then you have the option of doing that support in-house).
If you have 1000+ engineers in your company, but they are working on pieces of software that ship separately, I think you'd want to put each one on its own server. This makes performance scale better, as well as administration. I would insist on having just one technology for source control, however.
I would actually check out Team Foundation Server. It is a very good system that can scale, and it is probably easy to get through internal IT departments. I know it is Windows-centric, but you can use add-ons for Linux/Mac as well, and you can use proxies for sites with slow connections.
And I would think about having 2 systems in a large organization, it may help getting the best in some separate cases.
Perforce and TFS are the only options that I know of. I know that both of them have been used on large scale projects within Microsoft. Vault may scale that big, but I don't know if it goes beyond 500-1000 users.
Perforce is proven to be scalable to 5000+ users on a single server at Google, see: Life on the Edge: Monitoring and Running a Very Large Perforce Installation
It would seem that many of the largest software companies use Perforce either exclusively or as their main SCM. For example: Adobe, Cisco, SAP, Symantec, EA, UbiSoft and Autodesk are all Perforce users. It's not perfect but it's still superior to SVN or TFS (neither of which is bad in its own right).
Perforce gets my vote as well. I've not used it on such large projects, but it's absolutely rock solid in my environment. It also has an impressive resume of large projects.
[rumor]I've heard tell that Microsoft used it for Vista.[/rumor] Apparently it was a customised version for them, but it doesn't get much bigger than that.
I'm a Subversion fan but there are a lot of good open and closed source choices. I'd avoid CVS as it really doesn't stack up as a modern SCM (no atomic commits and such).
Someone will probably suggest SourceSafe. Avoid it like the plague. SourceSafe silently destroys history and causes no end of grief. A little googling will tell you more about that.
Subversion is mature and has a lot of good tools and IDE integration. It works well on most networks since it uses HTTP to access the repository.
I worked on an SCM conversion a couple of years ago and the best thing you can do is try them out. SCM vendors will give you demos and tech support for your evaluation.
Choosing an SCM is not an easy thing to do. It really depends on your codebase and workflow. Some systems handle huge codebases better than others. Some handle lots of branches and merges better than others. Some are better for remote access than others. Some have more fine-grained security models.
Get everybody who will interact with the system together and make a list of what you need/want. Get the demos and import your code into it and try it out. Choosing an SCM for a group that large is a major project and should be treated as such.
I would use Subversion. Subversion has been proven on many large, distributed, open-source projects with large developer communities. Also, the transactional nature of Subversion commits makes it ideal for situations where the connection may not be reliable.
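The point about transactional commits and unreliable connections can be illustrated with a toy Python sketch: changes are staged on a copy and published only once every file has arrived, so a link that drops mid-commit leaves the repository untouched. This is not how Subversion is implemented; `atomic_commit` and the simulated failure are hypothetical.

```python
def atomic_commit(repo_files, changes, fail_after=None):
    """Apply all changes or none.

    repo_files: dict of path -> content (the committed state)
    changes:    dict of path -> new content
    fail_after: simulate the connection dropping after this many files
    """
    staged = dict(repo_files)          # work on a copy, not the live state
    for count, (path, content) in enumerate(changes.items()):
        if fail_after is not None and count >= fail_after:
            raise ConnectionError("link dropped mid-commit")
        staged[path] = content
    repo_files.clear()                 # publish only after every file arrived
    repo_files.update(staged)


repo = {"a.c": "v1", "b.c": "v1"}
changes = {"a.c": "v2", "b.c": "v2"}

try:
    atomic_commit(repo, changes, fail_after=1)   # link drops after one file
except ConnectionError:
    pass
state_after_failure = dict(repo)       # still the old revision of both files

atomic_commit(repo, changes)           # retry over a working link
```

Without the staging step, the failed attempt would have left `a.c` at the new version and `b.c` at the old one, which is the half-committed state that non-atomic systems can end up in.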
- For a single project (a project which does not depend on other internal projects and can be deployed all by itself): Subversion
- For multi-project environments (when several projects are developed by several teams and must be deployed together on the same platform): ClearCase UCM
UCM is a methodology which provides some policies for the labeling, branching and merging workflow, and facilitates the factorization of a common configuration (a list of labels) within one given team. It does not solve all problems by far, but for parallel development of several projects, it is a very solid foundation.
I would use AccuRev. I've used svn, cvs, clearcase (base, ucm), ccc/harvest, but none of them can beat AccuRev's strengths. "3000+ developer organization with several sites"? You can use AccuRev's distributed solution (AccuReplica) for that, which means you have one single master server and as many replicas as you want on remote sites (so those with the "slow link" won't suffer much).
Above all, AccuRev brings a unique approach: a truly new concept/design/implementation of a stream-based SCM tool. Not in the (bad) way ClearCase UCM did that (because ClearCase "streams" were eventually branches), but in a slick, modern way.
The best is to try it yourself, I know that they offer a trial of 30 days with enough licenses to toy with the tool - try it and you won't want to consider other tools. My promise.
Have to wholeheartedly agree with not using CVS. For really large numbers of developers I would suggest using Perforce. For lower numbers, Subversion or TFS.
I doubt whether you have 3000 developers in your organisation all working on the same code base. I work for a medium-large software company, and we probably don't have that many in the entire company, but there are also many independent projects.
Internally some groups deliver releases to other groups to use in their products; this is not managed through a SCM system.
Our own group has its own SCM but there are only about 25 active developers. We use CVS, and to be quite honest it's not really up to it (we'd migrate but have a lot of scripts / commit hooks and other bits & pieces which need a lot of work to change). The problem with using CVS on a reasonable size code base is that many operations are very slow and involve locking other developers out.
Okay, outright disclaimer: I'm a developer for a company called MKS which makes a version control system for "enterprise" companies as part of a software configuration management platform called Integrity. Blah blah blah, obvious plug.
So I can't honestly answer the question.
However, I'd like to point out that people suggesting distributed version control are missing something screamingly important for large companies. For them, it's less important how much flexibility developers have when dealing with their version control system than it is that they have absolute control over every line of code that gets shipped. Regulatory conformance and audits are a way more central concern than how painful merges are.
A company with 1000+ developers wants to know that everybody is doing what they're supposed to do and that nobody is doing what they're not supposed to do, everything is tracked and managers get lovely reports and graphs they can paste into PowerPoint slides for their managers.
If a large company doesn't particularly care about those things, they're far more likely to leave it up to individual dev teams to figure out their own thing, in which case, 1000+ developers are using a hodge-podge of different tools based on whatever seemed most convenient at the time.
Let's see the options.
1 - Perforce. Used by lots of companies (as people have said here): Adobe, Amazon, MS, Google. These are companies that grew, advanced, and depend on selling software every day to put food on the table, and that's their choice. I guess that's the way I would go if I needed a supported "global solution" for a multitude of sites, etc. Good for Win/Linux (not sure about Macs though).
2 - SVN. Used by big teams as well. KDE uses it (a huge, huge project), currently at revision 880,000 (yes!). Very practical for both Windows and Linux usage (even though I would call TortoiseSVN below average in some aspects). Commercial support can be contracted as well. Good for Windows / Linux / Macs.
3 - AccuRev. If I was trying to be "edgy". I wouldn't deploy it on the whole company without some testing and getting used to it first.
4 - MS Team Foundation. It may be a good solution, but I never tried it and it is probably Windows only.
5 - Git / Bzr / Hg. Bzr and Hg have their "tortoises", so they are good for Windows (even though I'm not sure about maturity). Git would be Linux only for the time being, even though it is VERY GOOD (and much better and easier to use than a couple of years ago).
I would NEVER, EVER, ABSOLUTELY NO WAY JOSE use ClearCase. PERIOD. It is a waste of everybody's money and time and sanity.
Steer clear of: CVS / ClearCase / anything older