views:

599

answers:

7

Can someone explain to me how to solve the substring problem iteratively?

The problem: given two strings S=S1S2S3Sn and T=T1T2T3Tm, with m is less than or equal to n, determine if T is a substring of S.

+3  A: 

Here's a list of string searching algorithms

Depending on your needs, a different algorithm may be a better fit, but Boyer-Moore is a popular choice.

Hank Gay
+1  A: 

It would go something like this:

m==0? return true
cs=0
ct=0
loop
    cs>n-m? break
    char at cs+ct in S==char at ct in T?
    yes:
        ct=ct+1
        ct==m? return true
    no:
        ct=0
        cs=cs+1

end loop
return false
spender
That should be `ct == m` not `n`. It also falls to the same problem that several other answers have - it returns false if `T` is an empty string, which is not correct.
caf
ok. I fixed it.
spender
+1  A: 
if (T == string.Empty) return true;
for (int i = 0; i <= S.Length - T.Length; i++) {
    for (int j = 0; j < T.Length; j++) {
     if (S[i + j] == T[j]) {
      if (j == (T.Length - 1)) return true;
     }
     else break;
    }
}
return false;
Botz3000
continue looks to be redundant here
spender
Ok fixed it. but a good compiler doesn't make a difference there anyway.
Botz3000
You have a fencepost error in your first `for` loop, the condition should be `i <= S.Length - T.Length`.
caf
Oh and your logic also fails for the case where T is an empty string (an empty string is a substring of any other string).
caf
pfff... there.
Botz3000
+2  A: 
ph0enix
This seems to be the first completely correct answer ;)
caf
+2  A: 

A naive algorithm would be to test at each position 0 < in-m of S if Si+1Si+2Si+m=T1T2Tm. For n=7 and m=5:

i=0:  S1S2S3S4S5S6S7
      | | | | |
      T1T2T3T4T5

i=1:  S1S2S3S4S5S6S7
        | | | | |
        T1T2T3T4T5

i=2:  S1S2S3S4S5S6S7
          | | | | |
          T1T2T3T4T5

The algorithm in pseudo-code:

// we just need to test if n ≤ m 
IF n > m:
    // for each offset on that T can start to be substring of S
    FOR i FROM 0 TO n-m:
        // compare every character of T with the corresponding character in S plus the offset
        FOR j FROM 1 TO m:
            // if characters are equal
            IF S[i+j] == T[j]:
                // if we’re at the end of T, T is a substring of S
                IF j == m:
                    RETURN true;
                ENDIF;
            ELSE:
                BREAK;
            ENDIF;
        ENDFOR;
    ENDFOR;
ENDIF;
RETURN false;
Gumbo
This also fails for the case of `m == 0` (an empty string is a substring of any other string).
caf
That is some ugly-looking pseudocode.
Chris Lutz
A: 

I just did a series on string searching at my blog Programming Praxis.

programmingpraxis
+1  A: 

This may be redundant with the above list of substring algorithms, but I was always amused by KMP (http://en.wikipedia.org/wiki/Knuth–Morris–Pratt_algorithm)