views:

95

answers:

2

After finding the difflib.SequenceMatcher class in Python's standard library to be unsuitable for my needs, a generic "diff"-ing module was written to solve a problem space. After having several months to think more about what it is doing, the recursive algorithm appears to be searching more than in needs to by re-searching the same areas in a sequence that a separate "search thread" may have also examined.

The purpose of the diff module is to compute the difference and similarities between a pair of sequences (list, tuple, string, bytes, bytearray, et cetera). The initial version was much slower than the code's current form, having seen a speed increase by a factor of ten. Does anyone have a suggestion for implementing a method of pruning search spaces in recursive algorithms to improve performance?

+6  A: 

The technique you are looking for is called memoization.

Mark Byers
A: 

If you have any expensive method that you are likely to call multiple times with the same parameters then you can just cache the result of the method, using the parameters as a key.

James Gaunt