I have a whole bunch of pairs of files that have subtle differences. We use Subversion for source control, and I like the merge/diff utility that comes with TortoiseSVN for Windows. I can use this utility to manually compare and merge two files. My question is this: how can I programmatically merge two files the same way this utility does (skipping and flagging files that have conflicts)?

+1  A: 

This may help: Automating TortoiseMerge

Michael Hackner
A: 

A few options I know about:

  • SharpSvn, as used by AnkhSVN, the source control provider for Visual Studio. It's slow to grab repositories, however.
  • DotSVN - '.Net port of Subversion'
  • If veering away from subversion isn't a problem, you could try Google's Diff Match Patch.

I'm not sure whether the two Subversion projects implement merging, but I imagine merging is a server command. Diffing won't be, and I'd recommend the Google library for that.

I may be way off the mark if you just want to automate tortoise.

Chris S
+2  A: 

I would suggest using one of the .NET libraries that support an established merge algorithm, such as suggested in this question: http://stackoverflow.com/questions/138331/any-decent-text-diff-merge-engine-for-net

I also stumbled across this, though I have no idea about its quality: http://razor.occams.info/code/diff/

Godeke
No success with anything mentioned on that question, and the link you included has a blatant message on it indicating that the code is buggy.
Josh Stodola
If you know your data sets, it may be simpler to hand-tune something yourself. (In another comment you say that the changes are "mostly inserts", for example.) I did run across a guy who built a diff utility and encapsulated the diff itself in a library: http://www.menees.com/index.html
Godeke
A: 

Merging two files without human intervention, even where there is no conflict, is probably not a good idea unless you can easily tell what each change actually is, for example by having the before-version of both files (as Subversion does).

For instance, given the following two files, what is the correct course of action? (The # column is the line number; a gap means the line is missing from that file.)

--- File #1 ------------------------        --- File #2 ------------------------
1   This is the first line                  1   This is the first line
2   This is the second line                 2   This is the second line
3   This is the third line
4   This is the fourth line                 3   This is the fourth line
                                            4   This is the fifth line
5   This is the sixth line                  5   This is the sixth line

So, what is the correct result here?

Lasse V. Karlsen
Sorry, I should have mentioned that I do have "before versions", just like Subversion does. That's why I figured I could tap into its algorithm, because it is a perfect fit.
Josh Stodola
In the example you provided, there would be a conflict on lines 3, 4, and 5, correct? So this file would not get merged, but flagged for human intervention.
Josh Stodola
Also, I want to mention that 95% of the time, the modifications are going to be inserted lines, so conflicts should be rare.
Josh Stodola
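
Since a common "before version" is available, the three-way merge idea discussed above can be sketched in a few lines. This is only a minimal sketch in Python using the standard difflib module, not the algorithm TortoiseMerge or Subversion actually uses. The function names (changes, merge3) are made up for illustration, and the conflict rule is a deliberately conservative assumption: any two different edits that overlap or touch the same base lines cause the file to be flagged rather than merged.

```python
import difflib

def changes(base, other):
    """Return the edits turning base into other as (start, end, new_lines),
    where base[start:end] is replaced by new_lines."""
    sm = difflib.SequenceMatcher(None, base, other, autojunk=False)
    return [(i1, i2, tuple(other[j1:j2]))
            for tag, i1, i2, j1, j2 in sm.get_opcodes()
            if tag != "equal"]

def merge3(base, mine, theirs):
    """Three-way merge of lists of lines.
    Returns the merged lines, or None if the file needs human attention."""
    a = changes(base, mine)
    b = changes(base, theirs)
    # Conservative rule (an assumption of this sketch): two different edits
    # whose base ranges overlap or touch are treated as a conflict.
    for x in a:
        for y in b:
            if x != y and x[0] <= y[1] and y[0] <= x[1]:
                return None
    # Identical edits made on both sides are applied once.
    edits = sorted(set(a) | set(b))
    merged, pos = [], 0
    for start, end, new_lines in edits:
        merged.extend(base[pos:start])   # unchanged lines before the edit
        merged.extend(new_lines)         # the edited lines
        pos = end
    merged.extend(base[pos:])
    return merged

# The example from the answer above, assuming the before-version had
# neither the third nor the fifth line.
base   = ["first", "second", "fourth", "sixth"]
file_1 = ["first", "second", "third", "fourth", "sixth"]
file_2 = ["first", "second", "fourth", "fifth", "sixth"]
print(merge3(base, file_1, file_2))
```

With that assumed before-version, both inserts land in different places and the merge is clean. If the before-version were identical to File #1 instead, the same function would return File #2 unchanged, which is why the answer depends on having the base. When both sides change the same line differently, merge3 returns None and the pair can be set aside for manual merging.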