I've recently been working through a large codebase, refactoring and generally improving the design to increase coverage. In quite a few files I've also removed excess using statements, moved methods so that similar functionality sits together, added regions and so on, without actually changing the functionality of the code in the file.

Meanwhile, elsewhere on the team other developers are fixing bugs and changing lines of code here and there. Obviously when it comes to merging this can be an issue since line numbers no longer match and methods may have moved.

Now, I understand the general rule that in a source controlled environment it can be a dangerous thing to move methods around, and we decided that the benefit outweighed the cost. What I don't understand however is why it should be this way.

Say that my initial file was a simple calculator:

public class Calculator
{
    public int Subtract(int a, int b)
    {
        return a + b;
    }

    public int Add(int a, int b)
    {
        return a + b;
    }
}

And I decided that I wanted the methods to be alphabetical:

public class Calculator
{
    public int Add(int a, int b)
    {
        return a + b;
    }

    public int Subtract(int a, int b)
    {
        return a + b;
    }
}

Meanwhile, another developer fixed the bug in the Subtract method:

public class Calculator
{
    public int Subtract(int a, int b)
    {
        return a - b;
    }

    public int Add(int a, int b)
    {
        return a + b;
    }
}

A standard merge tool would probably require you to manually merge these two files, but one that understood the functionality of the code would easily be able to reconcile these two changes. The same applies to removing or adding other methods, comments, regions or using statements.

So, to (finally!) get to the question: Are there any merge tools out there that have an intelligent understanding of the functionality of code and could merge the two files above without any human intervention? If not, why not? Are there any complications which make this an unsolvable problem? (Of course I understand it isn't as simple as I'm implying, but is it impossible for some reason that I can't see?)

I use C# in my source code and would love something that worked with that, but I'm interested in whether this exists anywhere in the world of programming...


I'm already really concerned about the length of this question, but edited to add how I would expect the intelligent source system to work:

When the initial calculator file was checked in the system would parse the file and create a hierarchy of the class:

File: Calculator.cs
|
|--Class[0]: Calculator
    |
    |--Method[0]: Subtract
         |
         |--Line[0]: return a + b;
    |
    |--Method[1]: Add
         |
         |--Line[0]: return a + b;

(With extra lines in there for braces etc...)

When I check in my code (making the methods alphabetical) it updates the hierarchy above so that Subtract becomes Method[1] and Add becomes Method[0].

The second developer checks in his code (which the source control system knows was based on the original), and the system notices the change to the first line in Subtract. Now, rather than finding that line by line number in the overall file, it knows that it can find it at Calculator.cs/Calculator/Subtract/0, and the fact that the method has changed location doesn't matter; it can still make the merge work.
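
To make the idea concrete, here is a minimal sketch of a three-way merge that looks methods up by name instead of by line number. Everything in it is an assumption made for illustration: the `StructuralMerge` class and its `Merge` method are hypothetical names, each version is modelled as a plain dictionary from method name to body text, and a real tool would need a proper parser plus handling for renames, overloads and edits inside a body.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any existing tool.
public static class StructuralMerge
{
    // Each version maps a method name to its body text. Method order is not
    // part of the key, so reordering methods is invisible to the merge.
    public static Dictionary<string, string> Merge(
        IReadOnlyDictionary<string, string> baseVersion,
        IReadOnlyDictionary<string, string> mine,
        IReadOnlyDictionary<string, string> theirs)
    {
        var merged = new Dictionary<string, string>();
        var names = baseVersion.Keys.Union(mine.Keys).Union(theirs.Keys);

        foreach (var name in names)
        {
            // A missing entry comes back as null, meaning "not present".
            baseVersion.TryGetValue(name, out var b);
            mine.TryGetValue(name, out var m);
            theirs.TryGetValue(name, out var t);

            string result;
            if (m == t) result = m;        // both sides agree
            else if (m == b) result = t;   // only their side changed it
            else if (t == b) result = m;   // only my side changed it
            else throw new InvalidOperationException(
                $"Conflict in method '{name}': both sides changed it differently.");

            // A null result means the method was deleted, so leave it out.
            if (result != null) merged[name] = result;
        }

        return merged;
    }
}
```

With the calculator above, the original and my reordered version both map Subtract to `return a + b;`, while the other developer's version maps it to `return a - b;`. Only their side differs from the base, so the merge takes the fix without a conflict. The order in which the merged methods are written back out is left to the caller, for example the order from my alphabetised version.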

+3  A: 

I think that Source Code in Database is one potential answer to your question. The general idea is that you don't version files, you version blocks of code. The versioning system knows about the code DOM, and lets you query on the code DOM in order to check out functions, classes, what-have-you, for editing, compiling, etc.

Since the order of the methods doesn't necessarily matter, they're not stored in the Database with any order in mind. When you check out the class, you can specify the order that you like best (alphabetical, public/protected/private, etc). The only changes that matter are the ones like where you switch the + to a -. You won't have a conflict due to reordering the methods.
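
As a rough sketch of that checkout step (not how any real SCID tool is implemented), the store below keeps methods in an unordered dictionary and only imposes an order when the class is rendered. The `Checkout` class, its `Render` method and the dictionary layout are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any existing tool.
public static class Checkout
{
    // Renders a class from an unordered method store. The order comes from
    // the caller's key selector, never from the stored data.
    public static string Render<TKey>(
        string className,
        IReadOnlyDictionary<string, string> methodsByName,
        Func<string, TKey> orderBy)
    {
        var methodSources = methodsByName
            .OrderBy(pair => orderBy(pair.Key))
            .Select(pair => pair.Value);

        return $"public class {className}\n{{\n"
            + string.Join("\n\n", methodSources)
            + "\n}\n";
    }
}
```

Calling `Checkout.Render("Calculator", methods, name => name)` would give an alphabetical layout, while another developer could pass a different key selector and see the same stored methods in a different order. Neither choice touches what is versioned.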

Unfortunately, SCID is still VERY young and there aren't many tools out there for it. However, it is quite an interesting evolution in the way one views and edits code.

Edit: Here's another reference for SCID

dustyburwell
Thanks for the links, looks like the right idea and funnily enough when I was writing "With extra lines in there for braces etc..." in my edit above I actually thought "Screw the braces, let the source control system add them and end the tab/space wars" but didn't want to cloud the issue. Nice to see that we are heading that way even if we're not quite there yet.
Martin Harris
+3  A: 

Our approach with Plastic SCM is still far from being "complete", but it's already released and can help in this kind of situation. Take a look at Xmerge. Of course, feedback will be more than welcome and will grant some free licenses ;-)

pablo
Looks good, though obviously not 100% of the functionality I was hoping for (yet). Is the merge tool available separately? Unfortunately I don't have the power to organise a complete change of SCM system where I work, but a desktop version of that XMerge functionality could be very handy.
Martin Harris
Normally not, but drop us a line at our support email (support at codicesoftware.com) and we'll talk about it.
pablo
And also check this: http://codicesoftware.blogspot.com/2010/07/xmerge-to-merge-refactored-code.html, it is our new Xmerge with even better refactor support. Hope you like it!
pablo