sequencematcher

Determine where documents differ with Python

I have been using the Python difflib library to find where two documents differ. The Differ().compare() method does this, but it is very slow: at least 100x slower than the diff command for large HTML documents. How can I efficiently determine where two documents differ in Python? (Ideally I am after the positions rather than the actual t...
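One possible direction, offered as a sketch rather than a measured answer: difflib's SequenceMatcher can compare the two documents line by line and report only the positions of the differing ranges through get_opcodes(). Differ additionally compares similar lines character by character, so skipping that step may reduce the cost, though no speedup is claimed here. The function name changed_regions and the sample strings are made up for illustration.

```python
from difflib import SequenceMatcher

def changed_regions(text_a, text_b):
    """Return (tag, a_start, a_end, b_start, b_end) for each differing line range.

    Indices are line numbers (0-based, end-exclusive), not character offsets.
    """
    a_lines = text_a.splitlines()
    b_lines = text_b.splitlines()
    # autojunk=False turns off the heuristic that ignores very frequent
    # elements in long sequences (repeated HTML lines could trigger it).
    matcher = SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]

# Hypothetical sample documents.
old = "<html>\n<p>one</p>\n<p>two</p>\n</html>"
new = "<html>\n<p>one</p>\n<p>2</p>\n<p>three</p>\n</html>"
regions = changed_regions(old, new)
print(regions)  # [('replace', 2, 3, 2, 4)]
```

Here line 2 of the old document is replaced by lines 2 and 3 of the new one. Whether this is fast enough for large HTML documents would need to be timed against the diff command.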

SequenceMatcher for multiple inputs, not just two?

Hi everyone, I'm wondering about the best way to approach this particular problem, and whether any libraries exist for it (Python preferably, but I can be flexible if need be). I have a file with a string on each line. I would like to find the longest common patterns and their locations in each line. I know that I can use SequenceMatcher to compare line one...
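As a rough illustration of the pairwise approach the question mentions, the sketch below runs SequenceMatcher.find_longest_match() on every pair of lines and records the shared substring with its start position in each line. It is only a starting point under stated assumptions: it does not find a pattern common to all lines at once, it makes a quadratic number of comparisons, and the function name pairwise_longest_matches and the sample lines are invented for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

def pairwise_longest_matches(lines):
    """Map each pair of line indices (i, j) to (substring, pos_in_i, pos_in_j)."""
    results = {}
    for (i, a), (j, b) in combinations(enumerate(lines), 2):
        matcher = SequenceMatcher(None, a, b, autojunk=False)
        # Longest contiguous block shared by the whole of a and the whole of b.
        m = matcher.find_longest_match(0, len(a), 0, len(b))
        results[(i, j)] = (a[m.a:m.a + m.size], m.a, m.b)
    return results

# Hypothetical input: one string per line of the file.
lines = ["xxabcdyy", "zzabcd", "abcdqq"]
matches = pairwise_longest_matches(lines)
for pair, (text, pos_a, pos_b) in matches.items():
    print(pair, repr(text), pos_a, pos_b)
```

For the sample lines, every pair shares "abcd"; it starts at index 2 in the first two lines and at index 0 in the third.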