Can someone explain to me how to solve the substring problem iteratively?
The problem: given two strings S=S1S2S3…Sn and T=T1T2T3…Tm, with m is less than or equal to n, determine if T is a substring of S.
Can someone explain to me how to solve the substring problem iteratively?
The problem: given two strings S=S1S2S3…Sn and T=T1T2T3…Tm, with m is less than or equal to n, determine if T is a substring of S.
Here's a list of string searching algorithms
Depending on your needs, a different algorithm may be a better fit, but Boyer-Moore is a popular choice.
It would go something like this:
m==0? return true
cs=0
ct=0
loop
cs>n-m? break
char at cs+ct in S==char at ct in T?
yes:
ct=ct+1
ct==m? return true
no:
ct=0
cs=cs+1
end loop
return false
if (T == string.Empty) return true;
for (int i = 0; i <= S.Length - T.Length; i++) {
for (int j = 0; j < T.Length; j++) {
if (S[i + j] == T[j]) {
if (j == (T.Length - 1)) return true;
}
else break;
}
}
return false;
A naive algorithm would be to test at each position 0 < i ≤ n-m of S if Si+1Si+2…Si+m=T1T2…Tm. For n=7 and m=5:
i=0: S1S2S3S4S5S6S7 | | | | | T1T2T3T4T5 i=1: S1S2S3S4S5S6S7 | | | | | T1T2T3T4T5 i=2: S1S2S3S4S5S6S7 | | | | | T1T2T3T4T5
The algorithm in pseudo-code:
// we just need to test if n ≤ m
IF n > m:
// for each offset on that T can start to be substring of S
FOR i FROM 0 TO n-m:
// compare every character of T with the corresponding character in S plus the offset
FOR j FROM 1 TO m:
// if characters are equal
IF S[i+j] == T[j]:
// if we’re at the end of T, T is a substring of S
IF j == m:
RETURN true;
ENDIF;
ELSE:
BREAK;
ENDIF;
ENDFOR;
ENDFOR;
ENDIF;
RETURN false;
I just did a series on string searching at my blog Programming Praxis.
This may be redundant with the above list of substring algorithms, but I was always amused by KMP (http://en.wikipedia.org/wiki/Knuth–Morris–Pratt_algorithm)