I like to distinguish three different types of conflict from a version control system (VCS):

  • textual
  • syntactic
  • semantic

A textual conflict is one that the merge or update process itself detects and flags. The VCS does not permit a commit of the result until the conflict is resolved.

A syntactic conflict is not flagged by the VCS, but the result will not compile. Therefore this should also be picked up by even a slightly careful programmer. (A simple example might be a variable rename by Left and some added lines using that variable by Right. The merge will probably have an unresolved symbol. Alternatively, this might introduce a semantic conflict by variable hiding.)
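The variable-hiding case mentioned in the parenthesis can be sketched roughly as follows. All names here (`count`, `total`, `sum_plus_one_right`, `sum_plus_one_merged`) are made up for illustration, and the sketch assumes the two edits sit far enough apart in the file to merge without a textual conflict. Left renames a local accumulator from `count` to `total`; Right adds a line that still says `count`. Because a file-scope `count` happens to exist, the merge compiles instead of failing with an unresolved symbol.

```cpp
#include <cassert>

int count = 100;  // hypothetical file-scope counter, unrelated to the functions below

// Right's version before the merge: the local accumulator is still named count.
int sum_plus_one_right(int n) {
    int count = 0;                            // local hides the file-scope count
    for (int k = 1; k <= n; ++k) count += k;
    count += 1;                               // R: added line, adjusts the local
    return count;
}

// Merge result: Left renamed the local to total, Right's added line still says count.
int sum_plus_one_merged(int n) {
    int total = 0;                            // L: renamed from count
    for (int k = 1; k <= n; ++k) total += k;  // L
    count += 1;                               // R: now binds to the file-scope count
    return total;                             // L
}

// Hypothetical check of the difference between the two versions.
void check_hiding() {
    assert(sum_plus_one_right(4) == 11);   // 1+2+3+4, plus one
    assert(sum_plus_one_merged(4) == 10);  // the "plus one" went elsewhere
    assert(count == 101);                  // the file-scope variable was modified instead
}
```

Without the file-scope `count`, the same merge would fail to compile, which is the syntactic case described above.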

Finally, a semantic conflict is not flagged by the VCS, the result compiles, but the code may have problems running. In mild cases, incorrect results are produced. In severe cases, a crash could be introduced. Even these should be detected before commit by a very careful programmer, through either code review or unit testing.

My example of a semantic conflict uses SVN (Subversion) and C++, but those choices are not really relevant to the essence of the question.

The base code is:

int i = 0;
int odds = 0;
while (i < 10)
{
    if ((i & 1) != 0)
    {
        odds *= 10;
        odds += i;
    }
    // next
    ++ i;
}
assert (odds == 13579);

The Left (L) and Right (R) changes are as follows.

Left's 'optimisation' (changing the values the loop variable takes):

int i = 1; // L
int odds = 0;
while (i < 10)
{
    if ((i & 1) != 0)
    {
        odds *= 10;
        odds += i;
    }
    // next
    i += 2; // L
}
assert (odds == 13579);

Right's 'optimisation' (changing how the loop variable is used):

int i = 0;
int odds = 0;
while (i < 5) // R
{
    odds *= 10;
    odds += 2 * i + 1; // R
    // next
    ++ i;
}
assert (odds == 13579);

The following is the result of a merge or update. SVN does not flag a conflict (which is correct behaviour for the VCS), so it is not a textual conflict. It also compiles, so it is not a syntactic conflict.

int i = 1; // L
int odds = 0;
while (i < 5) // R
{
    odds *= 10;
    odds += 2 * i + 1; // R
    // next
    i += 2; // L
}
assert (odds == 13579);

The assert fails because odds is 37: the loop body now runs only for i = 1 and i = 3, appending the digits 3 and 7.
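To check the arithmetic, the four variants can be wrapped in functions and compared. This is only a sketch: the function names (`odds_base`, `odds_left`, `odds_right`, `odds_merged`, `check_variants`) are invented here, and the loop bodies are copied from the snippets above.

```cpp
#include <cassert>

// Base: visit 0..9 and append each odd value as a decimal digit.
int odds_base() {
    int i = 0, odds = 0;
    while (i < 10) {
        if ((i & 1) != 0) { odds *= 10; odds += i; }
        ++i;
    }
    return odds;
}

// Left: start at 1 and step by 2, so only odd values are visited.
int odds_left() {
    int i = 1, odds = 0;
    while (i < 10) {
        if ((i & 1) != 0) { odds *= 10; odds += i; }
        i += 2;
    }
    return odds;
}

// Right: count 0..4 and compute the odd value as 2 * i + 1.
int odds_right() {
    int i = 0, odds = 0;
    while (i < 5) {
        odds *= 10;
        odds += 2 * i + 1;
        ++i;
    }
    return odds;
}

// Merge: Left's loop stepping combined with Right's bound and body.
int odds_merged() {
    int i = 1, odds = 0;
    while (i < 5) {
        odds *= 10;
        odds += 2 * i + 1;
        i += 2;
    }
    return odds;
}

// Each side is correct on its own; only the combination is wrong.
void check_variants() {
    assert(odds_base() == 13579);
    assert(odds_left() == 13579);
    assert(odds_right() == 13579);
    assert(odds_merged() == 37);
}
```

Calling `check_variants()` from `main` passes, which confirms that each change is valid in isolation and that only the merged loop produces 37.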

So my question is as follows. Is there a simpler example than this? Is there a simple example where the compiled executable has a new crash?

As a secondary question, are there cases of this that you have encountered in real code? Again, simple examples are especially welcome.

+2  A: 

It is not easy to come up with simple, relevant examples, and this comment best sums up why:

If the changes are close by, then trivial resolutions are more likely to be correct (because those that are incorrect are more likely to touch the same parts of the code and thus result in non-trivial conflicts), and in those few cases where they aren’t, the problem will manifest itself relatively quickly and probably in an obvious way.

[Which is basically what your example illustrates]

But detecting semantic conflicts introduced by merges between changes in widely separated areas of the code is likely to require holding more of the program in your head than most programmers can – or in projects the size of the kernel, than any programmer can.
So even if you did review those 3-way diffs manually, it would be a comparatively useless exercise: the effort would be far disproportionate with the gain in confidence.

In fact, I would argue that merging is a red herring:
this sort of semantic clash between disparate but interdependent parts of the code is inevitable the moment they can evolve separately.
How this concurrent development process is organized – DVCS; CVCS; tarballs and patches; everyone edits the same files on a network share – is of no consequence at all to that fact.
Merging doesn’t cause semantic clashes, programming causes semantic clashes.

In other words, the real cases of semantic conflict I have encountered in real code after a merge were not simple, but rather quite complex.


That being said, the simplest example, as illustrated by Martin Fowler in his article "Feature Branch", is a method rename:

The problem I worry more about is a semantic conflict.
A simple example of this is that if Professor Plum changes the name of a method that Reverend Green's code calls. Refactoring tools allow you to rename a method safely, but only on your code base.
So if G1-6 contain new code that calls foo, Professor Plum can't tell in his code base as he doesn't have it. You only find out on the big merge.

A function rename is a relatively obvious case of a semantic conflict.
In practice they can be much more subtle.

Tests are the key to discovering them, but the more code there is to merge the more likely you'll have conflicts and the harder it is to fix them.
It's the risk of conflicts, particularly semantic conflicts, that make big merges scary.
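As a rough C++ illustration of how a rename can be subtler than an unresolved symbol, consider a virtual method. This is a sketch under assumptions, not something taken from Fowler's article: the classes `Shape` and `Circle`, the new name `describe`, and the helper `label` are all hypothetical. Professor Plum renames the virtual `foo` to `describe` in the base class, while Reverend Green adds a derived class that overrides `foo`. After the merge everything compiles, but Green's method no longer overrides anything.

```cpp
#include <cassert>
#include <string>

// Plum's side: the virtual method foo() has been renamed to describe().
struct Shape {
    virtual ~Shape() = default;
    virtual std::string describe() const { return "shape"; }  // was foo()
};

// Green's side: a new class written against the old name.
struct Circle : Shape {
    // Intended as an override of Shape::foo(). After the merge it is a
    // brand-new, unrelated virtual function, and the compiler accepts it.
    virtual std::string foo() const { return "circle"; }
};

// Existing caller, updated by Plum's rename.
std::string label(const Shape& s) { return s.describe(); }

// Hypothetical check of the merged behaviour.
void check_merged_label() {
    Circle c;
    assert(label(c) == "shape");  // Green expected "circle"
}
```

Had Green marked the method `override`, the merged code would fail to compile, which turns the silent semantic conflict into a syntactic one that is caught immediately.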

VonC
Part of the reason for having posed this question is that I am planning to give a presentation about source control generally and SVN in particular. One of the points I want to illustrate is that no software tools are a substitute for good planning or good communication. Another point I want to illustrate is that SVN does a good job, but it can't read your mind. That is why I want an artificially simple example. Thanks for your answer. I'm still hoping for further answers to or comments on my question...
Rhubbarb
@rhubbarb: understood, and I find your question very interesting. Regarding SVN, the true issue is of course merges: see http://stackoverflow.com/questions/2475831/merging-hg-git-vs-svn/2477089#2477089 and http://stackoverflow.com/questions/2471606/how-and-or-why-is-merging-in-git-better-than-in-svn/2472251#2472251
VonC