Every time I see a conflict on something like imports or method signature changes (e.g. renamed variables) in my SCM, I wonder whether there is a language-aware diff/merge method that can handle the small, annoying changes that happen on a shared project. Is there anything out there that handles such conflicts more smoothly and works in a Unix environment?

A: 

I agree that it would be awesome if such a tool existed, but I'm not aware of any. I believe the reason is that the merge algorithm of each SCM (whether git, hg, bzr, svn, etc.) works on the lowest common denominator, which is plain text. For these SCM tools to really understand language syntax and semantics, they would have to be able to parse the language. That seems like too big a task for any SCM: it would need parsers for Java, C#, Python, Ruby, Groovy, C, C++, etc., and each of these languages has different syntax between versions (e.g. Java generics did not exist until 1.5). So the SCM would also have to detect, or be configured to know, which language and which version of that language the source code is written in.

I think a language-dependent merge feature is more likely to be found in a third-party merge tool (e.g. one configured through the merge > tool setting in .gitconfig or the ui > merge setting in .hgrc). Such a tool could be configured to know that any .java files in your project are written in Java 1.6, then use the parsing features in the JDK to generate the AST and perform some "deep" analysis of whether a change is meaningful in the context of that language.

John Paulett
Yes, that's what I mean by "merge command". But the question is still whether something like that exists.
Marcus
A: 

You might want to look into having everyone on your team share the same IDE settings for things like order of imports, formatting, etc., to avoid conflicts like this from occurring in the first place.

matt b
This doesn't actually solve the problem. Consider for example some Java code "import a; import e;". Suppose I add "import b;" and you add "import c;", both in proper alphabetical order. When it comes time to merge, we will get a merge conflict. If we agree to put imports in alphabetical order, then there is no ambiguity about what the right merge is, but the tools generate merge conflicts because they aren't aware of coding conventions.
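To make the example concrete, here is a minimal sketch of what a convention-aware merge of that import block could do. It assumes imports can be treated as an unordered set that is always written back in alphabetical order; the function name `merge_imports` is made up for illustration and is not part of any existing tool.

```python
def merge_imports(base, ours, theirs):
    """Three-way merge of import lines, treated as sets and sorted on output.

    Assumption: the team convention is plain alphabetical order, so the
    position of a line carries no information and can be recomputed.
    """
    base_set, ours_set, theirs_set = set(base), set(ours), set(theirs)
    # Lines that either side added relative to the common ancestor.
    added = (ours_set | theirs_set) - base_set
    # Lines from the ancestor that neither side removed.
    kept = base_set & ours_set & theirs_set
    return sorted(added | kept)


base = ["import a;", "import e;"]
ours = ["import a;", "import b;", "import e;"]
theirs = ["import a;", "import c;", "import e;"]

print(merge_imports(base, ours, theirs))
# ['import a;', 'import b;', 'import c;', 'import e;']
```

A line-based merge sees two different insertions at the same position and reports a conflict; the set-based version has nothing to disagree about.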
Phil
A: 

I'm looking for the exact same thing. Merge tool vendors should probably address this sort of semantic, language-aware merge. If not, I'll have to become one :)

For now, as a poor man's trick, I sometimes preprocess the 3 files (base, ours, theirs) to their 'canonical form' by feeding them through Eclipse's Code Cleanup/Organize Imports/Order Members.

Although limited, this works nicely: last time it reduced the number of conflicts from ~200 to 2. I am planning to wrap this into a script and plug it into git's merge tool.
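A rough sketch of that "canonicalize, then merge" idea, under some loud assumptions: `canonicalize` below is only a stand-in for the Eclipse cleanup (it just sorts and de-duplicates runs of import lines), both function names are hypothetical, and the actual three-way merge is delegated to `git merge-file`, so git must be on the PATH. Wiring it into git's merge configuration is left out.

```python
import os
import subprocess
import tempfile


def canonicalize(source):
    """Stand-in for an IDE cleanup: sort and de-duplicate each contiguous
    run of import lines, leaving every other line untouched."""
    out, run = [], []
    for line in source.splitlines():
        if line.startswith("import "):
            run.append(line.strip())
        else:
            out.extend(sorted(set(run)))  # flush the finished import run
            run = []
            out.append(line)
    out.extend(sorted(set(run)))  # flush a run that ends the file
    return "\n".join(out) + "\n"


def canonical_merge(base, ours, theirs):
    """Canonicalize all three versions, then run a plain textual merge.

    Returns (exit_status, merged_text). A non-zero status means that
    git merge-file reported conflicts or an error.
    """
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        # git merge-file expects: <current> <base> <other>
        for name, text in (("ours", ours), ("base", base), ("theirs", theirs)):
            path = os.path.join(tmp, name)
            with open(path, "w") as handle:
                handle.write(canonicalize(text))
            paths.append(path)
        result = subprocess.run(
            ["git", "merge-file", "-p", *paths],
            capture_output=True,
            text=True,
        )
    return result.returncode, result.stdout
```

This only removes conflicts caused by ordering and formatting differences; two sides adding different imports at the same spot will still conflict in the textual merge.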

I have also written a script to auto-resolve Java import conflicts. It simply keeps both sides of the imports and adds comments explaining what happened and what to do: 'organise imports'.
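This is not that script, just a minimal sketch of how such an auto-resolver might work. It assumes git's default conflict markers (`<<<<<<<`, `=======`, `>>>>>>>`, not the diff3 style), and the function names and the wording of the inserted comment are made up for illustration.

```python
def is_import_block(lines):
    """True if every non-blank line is a Java import statement."""
    return all(
        not line.strip() or line.lstrip().startswith("import ")
        for line in lines
    )


def resolve_import_conflicts(text):
    """Resolve conflict hunks that contain only imports by keeping both
    sides; every other hunk is passed through unchanged."""
    out = []
    raw, ours, theirs = [], [], []
    state = None  # None outside a hunk, otherwise "ours" or "theirs"
    for line in text.splitlines():
        if state is None:
            if line.startswith("<<<<<<<"):
                state, raw, ours, theirs = "ours", [line], [], []
            else:
                out.append(line)
            continue
        raw.append(line)  # remember the hunk verbatim in case we bail out
        if state == "ours" and line.startswith("======="):
            state = "theirs"
        elif state == "theirs" and line.startswith(">>>>>>>"):
            if is_import_block(ours + theirs):
                out.append("// TODO: auto-merged import conflict, run 'organise imports'")
                merged = []
                for candidate in ours + theirs:
                    if candidate.strip() and candidate not in merged:
                        merged.append(candidate)
                out.extend(merged)
            else:
                out.extend(raw)
            state = None
        elif state == "ours":
            ours.append(line)
        else:
            theirs.append(line)
    if state is not None:
        out.extend(raw)  # unterminated hunk: leave it as it was
    return "\n".join(out) + "\n"


conflicted = (
    "package p;\n"
    "<<<<<<< HEAD\n"
    "import a;\n"
    "import b;\n"
    "=======\n"
    "import a;\n"
    "import c;\n"
    ">>>>>>> feature\n"
    "class X {}\n"
)
print(resolve_import_conflicts(conflicted))
```

Keeping both sides can leave an unused import behind when one side deleted it, which is why the inserted comment asks for a follow-up 'organise imports'.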

inger