We are considering moving from ClearCase to Subversion. The project has been around for a while (7 years), and there are three "major" versions (branches) that we actively support, plus occasional fixes in older releases. The project is fairly large: around 2 million lines of Java code.

I am curious whether anyone has done a similar migration.

  • Will SVN be able to handle such a large project?
  • Does it make sense to migrate all historical versions/branches? Are there tools that could do it selectively?
  • How long will the migration process take for such a project, and what is an effective way of working while the migration is in progress?
+1  A: 
  1. Yes, Subversion can handle very large projects. For example, all Apache projects are in one single Subversion repository, with the subprojects being simple subfolders.
  2. Whether it makes sense to convert all the history is something you have to decide yourself. But there are plenty of tools available. A good blog post can be found here.
  3. I don't know how long such a conversion takes. But you can try first with a small subset and measure the time.
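
A minimal sketch of that "measure a subset first" idea, in Python. The `convert` callable and the item counts are hypothetical placeholders for whatever migration tool you end up using, and the extrapolation assumes the cost per item stays constant, which may well not hold for a real ClearCase export. Treat the result as a rough lower bound, not a prediction.

```python
import time


def time_trial(convert, sample):
    """Run a conversion callable on each sample item; return elapsed seconds."""
    start = time.perf_counter()
    for item in sample:
        convert(item)
    return time.perf_counter() - start


def extrapolate_seconds(sample_seconds, sample_size, total_size):
    """Linear extrapolation: assumes the cost per item stays constant."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return sample_seconds * total_size / sample_size
```

For example, if a trial over 1,000 file versions took 120 seconds and the full set has 50,000 versions, `extrapolate_seconds(120.0, 1000, 50000)` gives 6000 seconds, or about 100 minutes under the linear assumption.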
Stefan
Contrary to what you think, Apache is not a very large project; it is not even a large project. It only has about 30 contributors or so. It is a medium-sized project. SVN is not really capable of handling large projects. The size of the code base is not as much of a deciding factor as the number of people who need to cooperate and the amount of interaction, branching and merging.
Jiri Klouda
Please have a look at the Apache repository (http://svn.apache.org/repos/asf/) before you make such false assumptions about the project size. The repository doesn't just host the Apache webserver project but *all* Apache projects, together with hundreds of committers.
Stefan
Size of the code base is usually not a problem for any source control system, with the possible exception of ClearCase, which in previous versions had a limit of 16 million objects per VOB, where every version and every label applied to a version counted as an object. You needed dozens of VOBs for large projects, or you had to start deleting old versions. Not that it is relevant to this particular question, but what you say about Apache still does not make it a large project, just a large codebase. SVN is perfectly capable of handling a large codebase, just not really large numbers of contributors and a complex development process.
Jiri Klouda
+3  A: 

If you decide to move, you can look at this Stack Overflow question:
recommendation-on-tools-to-migrate-from-clearcase-to-svn

Avram
+4  A: 

Having done several migrations of this kind, I would argue that:

  • you do not need to import the full history of the ClearCase versions into SVN. Most of the time (in my experience), only the labeled versions (the ones whose labels are applied consistently across all the files of a given set) are needed, unless you have a real need to examine fine-grained revision history.

  • you need to think about reorganization during a migration: what do you import, what do you leave behind, and do you want the SVN content to reflect exactly the structure of the files as stored in the ClearCase VOB? Sometimes such migrations are an occasion to rethink how some of those files are organized (usually through simple renaming rules for certain directories).

  • the migration is quicker in the ClearCase-to-SVN direction, since SVN is repository-centric and commits a set of files at once, while ClearCase is file-centric and commits file by file (much slower).

  • if the set of files to import is clearly identified, the migration process can be repeated multiple times, which means you can go on working within ClearCase while the first (large) import is taking place, then put a baseline (UCM label) on your code and re-import only the delta, effectively ending the migration process.
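
To illustrate the last point, here is a minimal Python sketch of how the delta for the second pass could be worked out. It assumes (my assumption, not a feature of any particular tool) that you have two plain directory exports on disk: one matching what the first import saw, and one taken at the final baseline. The function names are made up for the example.

```python
import hashlib
from pathlib import Path


def snapshot_digests(root):
    """Map each file's path (relative to root) to a SHA-256 digest of its content."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compute_delta(first_import, baseline):
    """Compare two {path: digest} maps; return what the second import must touch."""
    added = sorted(p for p in baseline if p not in first_import)
    removed = sorted(p for p in first_import if p not in baseline)
    changed = sorted(
        p for p in baseline
        if p in first_import and baseline[p] != first_import[p]
    )
    return {"added": added, "removed": removed, "changed": changed}
```

Only the paths listed under `added`, `removed` and `changed` need to go through the second, much smaller import. Note that a renamed file shows up here as one removal plus one addition.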

VonC
+3  A: 

First some resources:

  1. Clearvision CC2SVN Tool
  2. SVN Importer by Polarion
  3. Article and resources on CollabNet

The size of the actual repository, the number of files and their sizes are not limiting factors for SVN. The number of developers, concurrency of changes, complexity of the integration and release process, and the need for merging and directory versioning (refactoring) could pose problems for a large project. If your project is just large but fairly stable, with a low number of developers, a small number of branches and no need to backport tons of fixes to several prior releases, SVN should do just fine for you.

I have written a custom migration tool for bringing data out of ClearCase, and it is not an easy task. Any two systems have different data models and different operations over the data. I would not suggest writing a custom migration tool, because it is very hard to get data out of ClearCase in any meaningful way. For details on the limitations of the commercial solutions, I would suggest contacting the solution providers linked in the resources above.

I personally would try to bring over as much data as possible, but you have to be aware of the limitations of SVN compared to ClearCase. Any directory versioning (refactoring) history will likely get lost during this migration. SVN does not support sparse branches like ClearCase does, which could bloat the size of your SVN repository if you used task branches. In that case you probably want to limit yourself to system branches only. Files in ClearCase have an individual branching structure, while SVN has branches more or less per product, which will result in a lot of branch translation in the process. By restricting yourself to system branches, and maybe just the labeled versions on those branches for the fully integrated labels in the series, you could save yourself a lot of trouble. If your team is using UCM, you can pretty much forget all the UCM metadata. It will not translate into SVN.
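
As a rough illustration of the "labeled versions on system branches only" approach, the Python sketch below plans one SVN commit per label. It is only a sketch under stated assumptions: the label names and file sets are hypothetical inputs you would have to extract from ClearCase yourself, and the working copy is assumed to have been overwritten with each label's content before its commands run. It only builds the command lists (`svn add --parents`, `svn delete`, `svn commit -m`); it does not execute anything.

```python
def plan_label_commit(previous_files, label_files, label):
    """Return the svn commands that turn the previous snapshot into the labeled one.

    previous_files and label_files are sets of repository-relative paths.
    Modified file content needs no command of its own: `svn commit` picks it up.
    """
    commands = []
    for path in sorted(label_files - previous_files):
        commands.append(["svn", "add", "--parents", path])
    for path in sorted(previous_files - label_files):
        commands.append(["svn", "delete", path])
    commands.append(["svn", "commit", "-m", f"Import ClearCase label {label}"])
    return commands


def plan_migration(labeled_snapshots):
    """labeled_snapshots: list of (label, set_of_paths) in chronological order."""
    plan = []
    previous = set()
    for label, files in labeled_snapshots:
        plan.extend(plan_label_commit(previous, files, label))
        previous = files
    return plan
```

A file that was renamed between two labels appears in this plan as a delete plus an add, which is exactly the loss of directory versioning history mentioned above.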

The timeframe depends largely on the tools used. For a major project like yours it could even be weeks. For some reason the ClearCase database does a lot of locking even on read operations, and there is one central table for everything, which creates a lot of problems under the kind of large-scale access a migration causes. The first time I ran my tool on a product somewhat larger than yours, we estimated it would run for 3 years; after much optimization, parallelization and incremental migration we cut that down to about a week. Expect a lot of variance in the time it takes, depending on how well the tool is built. That said, since you are migrating into SVN and will ignore a lot of the history and metadata from ClearCase, your migration should be much faster.

Clearvision mentions on its site that its CC2SVN tool can create a bridge between the two products. I have not used this tool, but if it works as I assume, it would let you sync the two repositories after some processing, which would allow a weekend switchover with zero development downtime. If this is not possible, ask for an alternative such as incremental migration, where you first migrate everything up to some date and then migrate the smaller chunk of data changed since that date.

A very important part of the process is the post-migration phase. Please do not discount the headaches the switch will bring to your developers. You must not underestimate the need for training and clear documentation. You will also need a trained support team in your software engineering department that is capable of operating both SCM systems and of explaining to developers how to do the things they were used to in the new system. This is a point that could make or break the migration. Developers resist any change, and whatever advantages SVN brings to the project, it is in essence a much inferior system. ClearCase gives your developers a flexibility they will never have with SVN, and unless you bring them on board early in the process, you can lose them or, worse, see the whole migration reversed, declared a disaster, and lose your own job.

Jiri Klouda