Is this asking for too much? It feels
like versioning systems are stuck in
stone age with our file/class based
versioning systems.
I feel exactly the same way.
While I have not found any tool that does this, it would actually be a very good project to start. First of all, text/line-based diff and merge is really only useful for text files. But source code files are more than that. They are actually tree-based documents, where the tree is the one generated by the parser (a component of the compiler), which happens to be (sadly) encoded in a text file.
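To make the "code is a tree" point concrete, here is a minimal sketch using Python's standard `ast` module as a stand-in for a compiler's parser. The helper name `outline` and the sample statement are made up for illustration; any language with an accessible parser would work the same way.

```python
import ast


def outline(node, depth=0):
    """Return the node type names of a syntax tree, indented by depth."""
    lines = ["  " * depth + type(node).__name__]
    for child in ast.iter_child_nodes(node):
        lines.extend(outline(child, depth + 1))
    return lines


# One line of text, but the parser sees a nested structure:
# an assignment whose value is a multiplication containing an addition.
tree = ast.parse("total = price * (1 + tax)")
print("\n".join(outline(tree)))
```

The printed outline is the document a tree-based diff tool would work on; the text file is only one possible encoding of it.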
Suppose the code file were edited in a code editor (I'm not talking about graphical code generators, but a text editor constrained to legal language constructs) where a syntax error is impossible because the editor simply doesn't let you type something that makes no sense to the compiler (note that it could still make no sense at runtime).
The code file would then not be editable (or at least that would not be recommended) in a plain text editor, only in IDEs. (I know some people will complain about this, but a plain text editor is also a specialized editor for text, one that knows about encodings such as ASCII, Unicode, UTF-8, etc.; nobody uses a hex editor to write bytes to files.) With that in place, it would be possible to maintain a tree-based document, just as other kinds of documents do, for example Word, Excel, AutoCAD, Photoshop, etc. file types.
Sadly, I believe this is not going to happen soon for mainstream languages, for the same reason that migrating from Office 2003 to 2007 is painful: the two formats are incompatible. We would have a backwards compatibility problem. But even in this case, a parser just like the one in compilers could be used to maintain a tree version of the document.
Once this is possible, the merge/diff tools would need to operate not on arrays of text lines, but on trees of language constructs. This way you get a better way to compare code files: instead of seeing that line #150 in the left file has XXXXXXXXX and the right file has YYYYYYYY on the same line, you would see something like: in this part of your code (maybe something like a breadcrumb could be shown) the left file has a call to method A with arguments a, b, c and the right file calls it with just a, b.
Or maybe the left file called method M() but the right file called N() instead.
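A rough sketch of what such a report could look like, again using Python's `ast` module as the parser (`ast.unparse` needs Python 3.9 or later). The functions `calls` and `describe_call_changes` are hypothetical names, and this toy only tracks one call per function name; a real tool would have to match whole subtrees.

```python
import ast


def calls(source):
    """Map each called function name to the source text of its arguments."""
    found = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            found[node.func.id] = [ast.unparse(arg) for arg in node.args]
    return found


def describe_call_changes(left_source, right_source):
    """Describe differences in terms of calls instead of text lines."""
    left, right = calls(left_source), calls(right_source)
    report = []
    for name in sorted(left.keys() | right.keys()):
        if name not in right:
            report.append(f"{name}() is called only in the left file")
        elif name not in left:
            report.append(f"{name}() is called only in the right file")
        elif left[name] != right[name]:
            report.append(
                f"{name}() is called with ({', '.join(left[name])}) on the left "
                f"and ({', '.join(right[name])}) on the right"
            )
    return report


left_file = "A(a, b, c)\nM()\n"
right_file = "A(a, b)\nN()\n"
for line in describe_call_changes(left_file, right_file):
    print(line)
```

The report talks about method A and its arguments, and about M() versus N(), without mentioning line numbers at all.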
Another benefit of this kind of merge tool is that it would solve the problem with autoformatting and coding styles. Say someone on your team uses spaces, another uses tabs (4 spaces per tab), a crazy guy uses 3-space tabs, someone uses 0x0D 0x0A as the newline and another just 0x0A, and someone writes variable declarations like this:
int a=0; int b=0;
Another guy like this:
int a = 0;
int b = 0;
Another crazy guy like this:
int a = 0;int b=0
;
Or differences between this:
int A() {
}
Or this:
int A()
{
}
It just wouldn't matter, because the merge would treat them equally and would say that there is no change. So many people would stop editing files just to make the format look correct to them (Ctrl+K, Ctrl+D-ing all the files in Visual Studio).
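As a small illustration, here is how a formatting-insensitive comparison might look with Python's `ast` module. The snippets are Python equivalents of the declarations above, not the original C-style code, and `same_structure` is a made-up helper name. By default `ast.dump` leaves out line and column information, so only the structure is compared.

```python
import ast


def same_structure(left_source, right_source):
    """True when both sources parse to the same syntax tree."""
    return ast.dump(ast.parse(left_source)) == ast.dump(ast.parse(right_source))


one_line = "a=0; b=0"
two_lines = "a = 0\nb = 0\n"
windows_newlines = "a = 0\r\nb = 0\r\n"
tab_indented = "def A():\n\treturn 0\n"
space_indented = "def A():\n    return 0\n"

print(same_structure(one_line, two_lines))            # spacing and line breaks
print(same_structure(two_lines, windows_newlines))    # 0x0D 0x0A versus 0x0A
print(same_structure(tab_indented, space_indented))   # tabs versus spaces
print(same_structure("a = 0", "a = 1"))               # a real change
```

Only the last pair differs in the tree, so only that one would show up as a change.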
I believe this would reduce the number of conflicts when doing merges. And not only that.
Since the merge tool knows about the language (and presumably the text file is syntactically correct), it would be possible for the merge tool to produce only merges that compile, rather than just mixing characters as text-based merges do. If the produced merge does not compile, then there is a conflict, and you get a better understanding of the differences in the meaning of the code, not just differences in individual characters.
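The "if it does not compile, it is a conflict" rule could be sketched like this. Note the assumptions: `check_merge` is a hypothetical function, and parsing with Python's `ast` module only checks syntax, which is weaker than a full compile in a statically typed language.

```python
import ast


def check_merge(merged_source):
    """Return 'ok' if the merged text parses, otherwise a conflict message."""
    try:
        ast.parse(merged_source)
    except SyntaxError as error:
        return f"conflict: line {error.lineno}: {error.msg}"
    return "ok"


# A text-based merge that happened to keep the code well formed.
print(check_merge("result = A(a, b)\n"))

# A text-based merge that mixed two edits and lost a closing parenthesis.
print(check_merge("result = A(a, b\n"))
```

A tree-aware merge tool would run a check like this on its own output and report the second case as a conflict instead of silently writing broken code.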
...
After doing some research, I just found this answer, which appears to describe something like what I was trying to explain: http://stackoverflow.com/questions/523307/semantic-diff-utilities/621557#621557