When comparing similar lines, I want to highlight the differences on the same line:

a) lorem ipsum dolor sit amet
b) lorem foo ipsum dolor amet

lorem <ins>foo</ins> ipsum dolor <del>sit</del> amet

While difflib.HtmlDiff appears to do this sort of inline highlighting, it produces very verbose markup.

Unfortunately, I have not been able to find another class/method which does not operate on a line-by-line basis.

Am I missing anything? Any pointers would be appreciated!

+2  A: 

difflib.SequenceMatcher compares any two sequences, so it works on two single-line strings as well. Its get_opcodes() method returns a list of (tag, a0, a1, b0, b1) tuples that describe how to turn the first string into the second.
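To make the opcodes less abstract, here is a minimal sketch that prints them for the two example lines from the question. The variable names (a, b, matcher, opcodes) are only illustrative; SequenceMatcher and get_opcodes() are the standard difflib API.

```python
import difflib

a = "lorem ipsum dolor sit amet"
b = "lorem foo ipsum dolor amet"

# None means "no junk heuristic"; the strings are compared character by character.
matcher = difflib.SequenceMatcher(None, a, b)
opcodes = matcher.get_opcodes()

# Each opcode says what to do with a[a0:a1] to obtain b[b0:b1].
for tag, a0, a1, b0, b1 in opcodes:
    print(tag, repr(a[a0:a1]), repr(b[b0:b1]))
```

Each tag is one of 'equal', 'insert', 'delete' or 'replace', and the index pairs are ordinary slice bounds into the two strings.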

Adam
I'm afraid I don't quite understand this - yet anyway, so I'll do more digging. Thanks.
AnC
What exactly are you trying to do with the differences? Do you want HTML output or were you just using the HtmlDiff because it did in-line diffing?
Adam
While HTML output is my primary use case, HtmlDiff's output doesn't allow for easy reuse - that is, if it were simply inserting INS and DEL, that could then easily be transformed to whatever is needed further down the line.
AnC
+7  A: 

For your simple example:

import difflib

def show_diff(seqm):
    """Unify operations between two compared strings.

    seqm is a difflib.SequenceMatcher instance whose a & b are strings.
    """
    output = []
    for opcode, a0, a1, b0, b1 in seqm.get_opcodes():
        if opcode == 'equal':
            # unchanged text is copied through as-is
            output.append(seqm.a[a0:a1])
        elif opcode == 'insert':
            # text present only in b
            output.append("<ins>" + seqm.b[b0:b1] + "</ins>")
        elif opcode == 'delete':
            # text present only in a
            output.append("<del>" + seqm.a[a0:a1] + "</del>")
        elif opcode == 'replace':
            raise NotImplementedError("what to do with 'replace' opcode?")
        else:
            raise RuntimeError("unexpected opcode")
    return ''.join(output)

>>> sm= difflib.SequenceMatcher(None, "lorem ipsum dolor sit amet", "lorem foo ipsum dolor amet")
>>> show_diff(sm)
'lorem<ins> foo</ins> ipsum dolor <del>sit </del>amet'

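This works with strings and compares them character by character, which is why the neighbouring spaces end up inside the tags. You still have to decide what to do with "replace" opcodes.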
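One possible way to settle both points, as a rough sketch rather than the definitive answer: compare lists of words instead of characters, and treat 'replace' as a deletion followed by an insertion. The function name word_diff is made up for this example. It assumes that collapsing runs of whitespace is acceptable (str.split() does that) and it does not escape HTML in the input.

```python
import difflib

def word_diff(old, new):
    """Return `old` marked up with <ins>/<del> so that it reads as `new`."""
    a_words = old.split()
    b_words = new.split()
    # SequenceMatcher accepts any sequences of hashable items, here lists of words.
    matcher = difflib.SequenceMatcher(None, a_words, b_words)
    parts = []
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        removed = " ".join(a_words[a0:a1])
        added = " ".join(b_words[b0:b1])
        if tag == "equal":
            parts.append(removed)
        # 'replace' is handled as a delete followed by an insert.
        if tag in ("delete", "replace"):
            parts.append("<del>" + removed + "</del>")
        if tag in ("insert", "replace"):
            parts.append("<ins>" + added + "</ins>")
    return " ".join(parts)

print(word_diff("lorem ipsum dolor sit amet", "lorem foo ipsum dolor amet"))
```

For the two lines from the question this prints the markup asked for there, with the spaces outside the tags. If the input can contain characters such as < or &, pass each word through html.escape before wrapping it.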

ΤΖΩΤΖΙΟΥ
Thanks very much for this! That's exactly the kind of sample I needed. I had no idea how to get started, but this illustrates it very well. Again, many thanks!
AnC