I have been using Python's difflib library to find where two documents differ. The Differ().compare() method does this, but it is very slow: at least 100x slower than the diff command for large HTML documents.
How can I efficiently determine where two documents differ in Python? (Ideally I am after the positions of the differences rather than the actual text; positions are what SequenceMatcher().get_opcodes() returns.)
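
To make "positions" concrete, here is a minimal sketch of the kind of output I mean, assuming a line-level comparison is good enough. The helper name `changed_line_ranges` is made up for illustration and is not part of difflib. Comparing lists of lines (rather than whole strings character by character) and skipping Differ's intraline pass is likely to be cheaper, though I have not benchmarked it against diff.

```python
from difflib import SequenceMatcher


def changed_line_ranges(old_text, new_text):
    """Return (tag, i1, i2, j1, j2) tuples for every non-equal region.

    Hypothetical helper: i1:i2 indexes lines of old_text and j1:j2
    indexes lines of new_text, as in SequenceMatcher.get_opcodes().
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # autojunk=False turns off the "popular element" heuristic, which can
    # otherwise treat frequently repeated lines as junk in long inputs.
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


if __name__ == "__main__":
    old = "a\nb\nc\n"
    new = "a\nx\nc\nd\n"
    for tag, i1, i2, j1, j2 in changed_line_ranges(old, new):
        print(tag, (i1, i2), (j1, j2))
```

Each tuple gives only index ranges, so no diff text has to be generated or parsed.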