I would like to mod an open source project held in SVN.

I would like to use Mercurial to hold my mod in version control. (The reason for Mercurial is that I would like to keep track of changesets so that I can split up the mod into components - this is necessary for working with the OpenCart project for example as it doesn't support extensions well).

When the open source project is updated I would like to merge the changes with my mod.

It would be ideal if the original project were held in a DVCS, as I could just fork the project and work from there. Alas, familiarity keeps SVN in use and that is not going to change.

So my question is, what is the ideal workflow for this scenario and how do I implement it?

A: 

There is no "ideal" workflow when you change:

I believe the usual workflow in this case would involve patches:

This is illustrated in part in the Mercurial page "Working with Subversion Repositories".

When you want to produce a patch for the maintainers, it's simple:

 $ hg di -b -r last_svn_revision:your_tip > mybugfix.patch
VonC
@VonC The link you provided looks like exactly what I need. Cheers.
reckoner
A: 

You might want to look into Mercurial Queues, or just use its non-vcs specific inspiration quilt.

Using Mercurial with hgsubversion, you could clone the svn repo as a Mercurial repo and then use an MQ (or just branches) against that to store your changes. When the upstream svn repo is updated, you pull into your unmodified hgsubversion-created clone and then merge those changes into your modified clone.
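
A rough sketch of that update step, assuming the hgsubversion extension is enabled and that the unmodified clone was created from the svn URL, so a plain pull in it fetches new svn revisions. The function name and the two directory arguments are hypothetical:

```shell
# Sketch only: sync a modified clone with a clean hgsubversion clone.
# $1 = clean clone of the svn repo, $2 = clone holding your changes.
sync_with_upstream() {
    clean=$1
    work=$2
    # Fetch new upstream revisions into the unmodified clone.
    hg -R "$clean" pull &&
    # Bring them into the modified clone and merge them with local work.
    hg -R "$work" pull "$clean" &&
    hg -R "$work" merge &&
    hg -R "$work" commit -m "Merge upstream changes"
}
```

The chain stops at the first failing step, so a merge that reports unresolved conflicts is not committed automatically.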

Ry4an
I was thinking about MQ but hadn't directly mentioned it in my answer. +1
VonC
A: 

Since your "upstream" repository is Subversion, you can use a combination of the Mercurial Queues, Rebase and Convert extensions to keep track of local modifications stacked on top of the upstream source.

The general idea is that you can pick a Subversion branch from the "upstream" repository and use the Convert extension to generate a local Mercurial clone with the history of the specific branch, e.g.:

 ________
(________)
|        |  Subversion repository
(________)

    |
    |    hg convert svn+ssh://host/repo/branch svn-branch
    |
    v
.------------------.
|                  |
|  svn-branch/.hg  |
|                  |
`------------------'

The convert extension can pull changesets from Subversion in an incremental manner in the future too, so you can set up a cron job that refreshes your local "clean" copy of the upstream code.
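
A minimal sketch of such a refresh step, assuming the convert extension is enabled in your hgrc. The URL and directory reuse the example names from this answer and are placeholders; the function name refresh_mirror is made up for illustration:

```shell
# Sketch only: refresh the "clean" Mercurial mirror of the svn branch.
SVN_URL="svn+ssh://host/repo/branch"     # placeholder upstream URL
MIRROR_DIR="$HOME/work/svn-branch"       # placeholder mirror location

refresh_mirror() {
    # Re-running hg convert against an existing destination only
    # imports the revisions that are not there yet.
    hg convert "$SVN_URL" "$MIRROR_DIR"
}
```

A script that calls refresh_mirror could then be run from cron at whatever interval suits you.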

Then you can create as many local clones as you want from svn-branch, e.g.:

 ________
(________)
|        |  Subversion repository
(________)

    |
    |    hg convert svn+ssh://host/repo/branch svn-branch
    |
    |
    |                                     .-------------.
    |                            .----->  |  feature-1  |
    v                            |        `-------------'
.------------------.             |
|                  |     clone   |        .-------------.
|  svn-branch/.hg  |  -----------+----->  |  feature-2  |
|                  |             |        `-------------'
`------------------'             |
                                 |        .-------------.
                                 +----->  |  bugfix-1   |
                                          `-------------'
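
Creating the clones from the diagram could look roughly like this. The helper name make_feature_clones is hypothetical; it only wraps plain hg clone calls:

```shell
# Sketch only: create one working clone per feature from the mirror.
# $1 = the clean mirror, remaining arguments = names of the new clones.
make_feature_clones() {
    mirror=$1
    shift
    for name in "$@"; do
        hg clone "$mirror" "$name" || return 1
    done
}
```

For the layout above you would run it from the directory containing the mirror: make_feature_clones svn-branch feature-1 feature-2 bugfix-1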

Once you set up your local clones you have two options for your own patches:

  • Commit your own changes as mercurial changesets in feature-1, feature-2 or bugfix-1 and keep 'merging' with the svn-branch upstream clone
  • Keep your own changes in feature-1 etc. as MQ patches, and rebase them every time an upstream set of changesets appears in the local svn-branch mirror.

Both approaches have their pros and cons.

Local changes with 'Normal' Changesets

If your intention is to just keep a local feature and you don't really care about sending a "clean patch" to the upstream developers, then periodic hg pull and hg merge runs are ideal. Repeated merges will be easy. Conflicts will be minimal. You can keep track of who merged what, when, and why. More importantly, you can conveniently share and publish your locally modified clone with others.
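
The periodic pull-and-merge cycle can be sketched as follows. This assumes it is run inside a feature clone whose default path is the svn-branch mirror, and that the clone already has local commits (otherwise hg merge has nothing to merge and a plain hg update is enough). The function name merge_upstream and the commit message are made up:

```shell
# Sketch only: merge newly converted upstream changesets into a
# feature clone. Run from inside the clone, e.g. ~/work/feature-1.
merge_upstream() {
    # Pull from the default path (the svn-branch mirror).
    hg pull &&
    # Merge the upstream head into the local head.
    hg merge &&
    # Record the merge as a normal changeset.
    hg commit -m "Merge with upstream svn-branch"
}
```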

With a local svn-branch/.hg repository that has a mostly linear Subversion history, repeated merges will look like this (changesets in parentheses are "local-only" commits):

[0] --- [1] --- [2] --- [4] --- [5] --- [7] --- [8] --- [9] --- [10]
                 \                \                               \
                  `-- (3) ------- (6) -------------------------- (11)

The local changes in changeset (3) are not visible to the upstream Subversion people, but you can see in the local history that they were merged twice with upstream code: in commits (6) and (11). Since each merge is recorded as a normal changeset in Mercurial, it's easy to see who did the merge, when, what was merged, etc. It's also very easy to check what the local changes are at any merge point, e.g. by running:

hg diff -r 5:6
hg diff -r 10:11

You can even record the local changesets in a named branch, e.g. by committing the changes of (3) with:

hg branch feature-1
hg commit -m "Message"

Then looking at diffs from the 'upstream' vendor code may use the branch name:

hg diff -r default:feature-1

It's really up to you how you keep track of the local merges and how much local information you want to keep around.

Local changes with Mercurial Queues

If you are developing a local patch "in isolation", and you plan to submit the patch as a "clean diff" to the upstream Subversion developers, then Mercurial Queues combined with the Rebase extension make it easy to keep your patches "on top" of the local svn mirror. The entire process of rebasing your local patches is often as easy as:

# Incrementally pull changesets from the upstream Subversion
# repository into a local hg clone:

  cd ~/work/svn-branch
  hg convert svn+ssh://host/repo/branch .

# Rebase the local patches of 'feature-1' on top of the
# latest subversion commits:

  cd ~/work/feature-1
  hg qpush -a && hg pull --rebase

Creating a local "patch queue" in one of the clones of your svn-branch mirror is the first step:

cd ~/work/feature-1
hg qinit -c

Then you can make your local changes, and save them as an MQ patch:

emacs src/foo.c
hg qnew --git --force --edit PATCHNAME    # PATCHNAME is a placeholder; qnew needs a patch name

You can create as many local patches as you want on top of the original svn-branch commits. For example my local FreeBSD 'head' mirror now includes the following patches:

keramida@kobe:/hg/bsd/src$ hg qseries -s
newvers-hg-support: Include the hg changeset number to uname output too.
kernconf-kobe: Add a kernel config file for my laptop, based on GENERIC
truss-style: Style nits for lines that are too long after recent truss changes.
yacc-core-dump: Fix a yacc(1) core dump reported by darrenr; patch by ru
loader-prompt: Lowercase the "OK" prompt of the boot-loader
top-rawcpu: Make top(1) use raw (non-weighted) cpu mode by default, like ps(1)
nogames-mtree: Fix `make installworld' when WITHOUT_GAMES=yes.
typo-fixes: Fix misc typos in source code comments & docs
mg-00-import: Import a snapshot of the mg(1) editor from OpenBSD
mg-01-freebsd-changes: Adapt the OpenBSD code of mg(1) to FreeBSD's environment
mg-02-build: Attach the mg(1) editor to the FreeBSD build process
regression-chmod: Add a few regression tests for chmod(1)
regression-stdtime: Add a regression suite for libc/stdtime functions
keramida@kobe:/hg/bsd/src$

It's certainly possible to keep a local stack of hundreds of changes. MQ is rather convenient for developing the patches locally, fine-tuning them, splitting them or joining them in larger patchsets, and when combined with the Rebase extension it's a powerful way of keeping your local patch set 'moving' along the upstream history.
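
As an illustration of joining patches, here is a rough sketch built on the standard MQ commands qgoto and qfold. The helper name fold_patches is hypothetical, and it assumes the patches to be folded come after the target in the series, since qfold only accepts unapplied patches:

```shell
# Sketch only: fold one or more later patches into a target patch.
# $1 = target patch, remaining arguments = patches to fold into it.
fold_patches() {
    target=$1
    shift
    # Push or pop until the target patch is on top of the stack,
    # which leaves every later patch unapplied.
    hg qgoto "$target" &&
    # Merge the named unapplied patches into the current patch.
    hg qfold "$@"
}
```

With the series listed above, fold_patches mg-00-import mg-01-freebsd-changes mg-02-build would collapse the three mg patches into one.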

The equivalent history changes for the patch of changeset (3) from the previous example would be something like this:

# History snapshot #1 - the local changes in (3) as an
# MQ patch P1 on top of changeset [2] from svn-branch:

[0] --- [1] --- [2] --- (P1)

Then, when you pull a few more Subversion commits into the svn-branch clone, you can rebase the local P1 patch on top of the latest svn code:

# History snapshot #2 - patch P1 rebased from [2] to the
# latest svn-base commit:

[0] --- [1] --- [2] --- [3] --- [4] --- (P1')
                  .
                   . (P1) . . hg rebase . ^

After a few more days, you convert more changes from Subversion into svn-branch and you rebase once more:

# History snapshot #3 - patch P1' rebased from [4] to the
# latest svn-base commit, as a possibly very modified
# version, shown as patch P1'':

[0] --- [1] --- [2] --- [3] --- [4] --- [5] --- [6] --- [7] --- [8] ---  (P1'')
                                 .
                                   . (P1') . . . hg rebase . . . . . . . . ^

You can keep rebasing your local patch as many times as you need. You can even rebase multiple patches in every iteration. You can insert patches in the 'middle' of a multi-patch queue. You can remove patches. You can join diffs, split them into more patches, and rearrange the stacking order of the patches. In general, you can rewrite and fine-tune the patch queue as many times and as often as you want.

Giorgos Keramidas
@Giorgos Thanks for your detailed answer. There are some interesting ideas there that I need to try out.
reckoner
Very interesting illustrations. +1
VonC
Why use rebase? Can't you just pop the patches, pull from your clone, and then push the patches back on? Patches don't store their parent revision, so they "rebase" automatically, right?
Ry4an
It's better to *rebase* the patches, because the changes you pull between the last patch 'base' revision and the latest upstream code may be touching code that is very near to the one modified by your patches. Popping and pushing again may fail with conflicts. The *rebase* extension handles this better by calling the normal "merge machinery" of Mercurial itself, to perform a 3-way merge of the patches on top of their new base revision (giving you e.g. a chance to resolve the conflicts in your favorite 3-way merge tool).
Giorgos Keramidas