ansaurus

Question

diff'ing without grouping unrelated blocks

Answer 1

+2 A:

Although diff behaves like that intentionally, you can change it by inserting a blank line between every pair of lines. This will get the result you want.

1:

hello world

lorem ipsum dolor sit amet

Same

2:

Hello World

Lorem Ipsum Dolor Sit Amet

Same

The line numbers in the output refer to the padded files, though, so they have to be mapped back to the original (n/2 + 1, using integer division).

1c1
< hello world
---
> Hello World
3c3
< lorem ipsum dolor sit amet
---
> Lorem Ipsum Dolor Sit Amet
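The blank-line workaround and the line-number correction can be sketched in Python. This is only an illustration under assumptions: it uses the standard library's difflib instead of the diff command (its matching algorithm is not the same as GNU diff's, though it gives the same blocks for this small example), and the helper names pad, unpad_lineno and changed_lines are made up here.

```python
import difflib

def pad(lines):
    """Interleave a blank line between lines so no two real lines are adjacent."""
    padded = []
    for line in lines:
        padded.extend([line, ""])
    return padded[:-1]  # drop the trailing blank

def unpad_lineno(n):
    """Map a 1-based line number in the padded text back to the original (n/2 + 1)."""
    return n // 2 + 1

def changed_lines(old, new):
    """Return (original_lineno, old_text, new_text) for each one-line replacement."""
    a, b = pad(old), pad(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only report clean one-for-one replacements; larger blocks are the
        # case where the workaround no longer helps.
        if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
            changes.append((unpad_lineno(i1 + 1), a[i1], b[j1]))
    return changes

old = ["hello world", "lorem ipsum dolor sit amet", "Same"]
new = ["Hello World", "Lorem Ipsum Dolor Sit Amet", "Same"]
for lineno, before, after in changed_lines(old, new):
    print(f"{lineno}c{lineno}\n< {before}\n---\n> {after}")
```

The blank lines act as anchors that match on both sides, which is what keeps the two changed lines from being merged into one block.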

If multiple lines replace one line (or vice versa), the output may still not be what you want:

1,3c1
< hello world
<
< lorem ipsum dolor sit amet
---
> Hello World

Thomas Jung 2010-01-26 11:57:29

Thanks - I have used this workaround before, but it's not a viable generic solution (see my response to mizipzor). I suppose the LCS problem explains why it is like it is, so I'll just have to live with it...

AnC 2010-01-26 15:51:52

Don't live with it, every software breakthrough starts with an annoyed programmer ;)

mizipzor 2010-01-26 16:21:20

Hehe - sadly, last time I delved into diff algorithms they made my head spin...

AnC 2010-01-26 17:34:12

Answer 2

+1 A:

The diff algorithm is a solution to the longest common subsequence problem. However, it seems you're not really looking for another algorithm: related or not, both lines have changed, and what you are talking about is how the difference is presented as text.

Thomas Jung showed the original format. Wikipedia shows a few variations. But take the time to experiment some.

diff original new

Will produce the original format.

diff -c original new

Will produce the context format.

diff -u original new

Will produce the unified format. As a bit of trivia, this is the most commonly used one; patches to open source projects are more often than not requested in this format.
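For experimenting without the command line, Python's standard difflib module can produce the unified and context formats as well. This is a rough sketch, not a replacement for diff: difflib uses its own matching algorithm, and the file names "original" and "new" are just labels passed in for the headers.

```python
import difflib

old = ["hello world", "lorem ipsum dolor sit amet", "Same"]
new = ["Hello World", "Lorem Ipsum Dolor Sit Amet", "Same"]

# lineterm="" because the input lines carry no trailing newlines.
unified = list(difflib.unified_diff(old, new, fromfile="original",
                                    tofile="new", lineterm=""))
context = list(difflib.context_diff(old, new, fromfile="original",
                                    tofile="new", lineterm=""))

print("\n".join(unified))
print()
print("\n".join(context))
```

In the unified output both removed lines come first, followed by both added lines, so the two unrelated changes are still presented as a single block.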

Of course, if the way the difference is presented to you is crucial, I think you will find any of the diff viewers vastly superior.

mizipzor 2010-01-26 12:15:10

Thanks - I know about the different formats (I generally work with `git diff`), but they all present the same issue. This applies to both code and non-code (e.g. wikis) scenarios; minor changes - like indentation or typo corrections - can appear dramatic because it's not clear that each individual line just differs slightly from what it was before.

AnC 2010-01-26 15:47:01

Did you check the graphical viewers? Some of them highlight not only the changed line but also the changed characters in that line. I like that sometimes when the lines are a little too long; it might help you as well. Also note that in most graphical viewers the lines are not "grouped together" in any way. They don't need to be, since the change notification is usually a change in the line's background color.

mizipzor 2010-01-26 16:20:19

I have checked various different options - but take GitHub's diff visualization, for example; while that highlights inline changes, it only works if such changes are not on subsequent lines (i.e. blocks take precedence).

AnC 2010-01-26 17:01:22

Have you considered writing your own? Strictly line by line should be very simple to implement... It could be something as simple as piping git diff output to a script you wrote!

mizipzor 2010-01-26 21:56:44
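A strictly line-by-line comparison along the lines suggested above might look like the following sketch. The function name line_by_line and the output format are invented for illustration; it compares line k of one file only with line k of the other and makes no attempt at alignment, so a single inserted or deleted line makes every later line show up as changed.

```python
from itertools import zip_longest

def line_by_line(old, new):
    """Compare line k of old only with line k of new; no alignment is attempted."""
    out = []
    for n, (a, b) in enumerate(zip_longest(old, new), start=1):
        if a == b:
            continue
        # None marks a position past the end of the shorter input.
        if a is not None:
            out.append(f"{n}< {a}")
        if b is not None:
            out.append(f"{n}> {b}")
    return out

old = ["hello world", "lorem ipsum dolor sit amet", "Same"]
new = ["Hello World", "Lorem Ipsum Dolor Sit Amet", "Same"]
print("\n".join(line_by_line(old, new)))
```

Each changed line is paired directly with its counterpart instead of being grouped into a block with its neighbours.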

I've considered this - but that would mean it's limited to my local setup, and I've come to realize the issues I've mentioned are mainly of concern in a collaborative context (e.g. GitHub)...

AnC 2010-01-28 14:43:59
