Is there a tool, utility, or Perl/Python script that can find the longest repeated substrings in a large text file and print each pattern along with the number of times it occurs?

+1  A: 

http://en.wikipedia.org/wiki/Longest_repeated_substring_problem:

The longest repeated substring problem is the problem of finding the longest substring of a string that occurs at least twice. This problem can be solved in linear time and space by building a suffix tree for the string, and finding the deepest internal node in the tree.
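
As a rough illustration, here is a minimal Python sketch. Note that it does not build a suffix tree: it swaps in a naively sorted list of suffixes and compares neighbours, which is much simpler to write but far from linear time, so it only suits small inputs. The function names `longest_repeated_substrings` and `count_overlapping` are made up for this example, and occurrences are counted with overlaps allowed.

```python
def count_overlapping(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def longest_repeated_substrings(text):
    """Return {substring: count} for the longest substrings that occur at least twice.

    Assumption: the input is small. Sorting suffix slices like this costs
    quadratic memory and worse-than-quadratic time in the worst case.
    """
    n = len(text)
    # Start positions of all suffixes, ordered by the suffix text.
    suffixes = sorted(range(n), key=lambda i: text[i:])
    best_len = 0
    best = set()
    for a, b in zip(suffixes, suffixes[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        k = 0
        while a + k < n and b + k < n and text[a + k] == text[b + k]:
            k += 1
        if k > best_len:
            best_len = k
            best = {text[a:a + k]}
        elif k == best_len and k > 0:
            best.add(text[a:a + k])
    return {s: count_overlapping(text, s) for s in best}
```

For example, `longest_repeated_substrings("banana")` returns `{'ana': 2}`. For a file, the text could be read with `open(path).read()` first. For a genuinely large file this sketch will be too slow and memory-hungry, and a real suffix tree or suffix array implementation would be needed.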

The MYYN