Bizzaro-Diff!!!

Is there a way to do a bizzaro/inverse-diff that only displays the portions of a group of files that are the same (i.e., across way more than three files)?

Odd question, I know...but I'm converting someone's ancient static pages to something a little more manageable.

A: 

You could try the comm command (for common). It'll only compare 2 files at a time, but you should be able to do 3+ with some clever scripting.
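
For what it's worth, here is a rough Python sketch of the same idea extended to any number of files. It is not comm itself: it just keeps the lines that occur in every input, so line order and duplicates are ignored when matching, much like chaining `comm -12` over sorted copies. The function name `common_lines` is made up for illustration, and it assumes at least one input text.

```python
from functools import reduce


def common_lines(texts):
    """Return the lines that occur in every text.

    `texts` is a non-empty list of strings (one per file). Matching is
    set-based, so position in the file does not matter. The result keeps
    the order of first appearance in the first text.
    """
    line_sets = [set(text.splitlines()) for text in texts]
    shared = reduce(set.intersection, line_sets)

    seen = set()
    result = []
    for line in texts[0].splitlines():
        # Emit each shared line once, in the first file's order.
        if line in shared and line not in seen:
            seen.add(line)
            result.append(line)
    return result
```

To use it on real pages you would read each file into a string first, e.g. `common_lines([open(p).read() for p in paths])`, where `paths` is whatever list of file names you have.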

eduffy
comm only works for sorted files, I think.
Zac Thompson
+1  A: 

You could try sim. Been a few years since I've used it, but I recall it being very useful when looking for similarities within a file or in many different files.

joast
A: 

This is a classic problem.

If I had to quick-and-dirty it, I'd probably do something like a diff -U 1000000 (assuming a version of diff that supports it), piped through sed to just get the lines in common (and strip the leading spaces). You'd have to loop through all the files, though.
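
If shelling out to diff and sed is awkward, roughly the same loop can be sketched in Python. This swaps in the standard library's `difflib.SequenceMatcher` for diff, so the matches may not be identical to what diff would report. The helper names `common_sequence` and `common_across` are made up for illustration, and the input is assumed to be a non-empty list of line lists.

```python
from difflib import SequenceMatcher


def common_sequence(a, b):
    """Return the lines of `a` that SequenceMatcher pairs with lines of `b`.

    Order is preserved, which is the analogue of keeping only the
    unchanged context lines of a unified diff.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    kept = []
    for block in matcher.get_matching_blocks():
        # The final block is a zero-length sentinel, so it adds nothing.
        kept.extend(a[block.a:block.a + block.size])
    return kept


def common_across(files_lines):
    """Fold common_sequence over every file, starting from the first."""
    common = files_lines[0]
    for lines in files_lines[1:]:
        common = common_sequence(common, lines)
    return common
```

Unlike a plain set intersection, this respects line order, so a line only counts as common if it shows up in a compatible position in every file.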

Edit: I forgot there is also a Tcl implementation that would be slightly more versatile, but would require more coding. You may be able to find an implementation for your language of choice.

Zac Thompson
+1  A: 

You want a clone detector. It detects similar code chunks across large source systems. See our CloneDR tool: http://www.semdesigns.com/Products/Clone/index.html

Ira Baxter