Okay, here's one crazy idea.
Create a function that determines if there is a recurring substring of length k in O(n) (where n is the length of the text). This can be done by using a rolling hash (see wikipedia for Rabin-Karp String Matching Algorithm) to generate all n hashes in linear time and using a hashtable/BST (or a map or dict or whatever your language calls it) to store the corresponding substring positions. Before inserting the current hash to the data structure, we check if we have seen it before. If it has been seen before, we simply look up the indices where this hash was generated and see if the corresponding substring is a non-overlapping match.
Now that we can find a k length substring in O(n) time, we simply run a binary search to find the largest k for which we can find a non-overlapping substring match using our function.
Code in Python follows
A=23
MOD=10007 # use a random value in the real world
def hsh(text):
return sum([ord(c)*A**(len(text)-i-1) for i,c in enumerate(text)])%MOD
def k_substring(text, k):
substrings={}
cur=hsh(text[:k])
pwr=(A**(k-1))%MOD
substrings[cur]=[0]
for i in xrange(k,len(text)):
to_remove=(ord(text[i-k])*pwr)%MOD
to_add=ord(text[i])
cur-=to_remove
if cur<0:
cur+=MOD
cur=(cur*A)%MOD
cur=(cur+to_add)%MOD
if cur in substrings:
lst=substrings[cur]
for index in lst:
if index+k<=i-k+1 and text[index:index+k]==text[i-k+1:i+1]:
return index
lst.append(i-k+1)
else:
substrings[cur]=[i-k+1]
return -1
def longest_substring(text):
hi,lo=len(text),0
while hi>lo:
mid=(hi+lo+1)>>1
if k_substring(text,mid)==-1:
hi=mid-1
else:
lo=mid
index=k_substring(text,lo)
return text[index:index+lo]
(Sorry if this is unclear. It is 6:30am here and I really need to go back to bed :) )