searching strings for keywords: questions about the "failure function"

I have a question about the description of the failure function in "Compilers: Principles, Techniques, and Tools" (the Dragon Book).

Firstly, the quote:

In order to process text strings rapidly and search those strings for a keyword, it is useful to define, for keyword b₁b₂...b_n and position s in that keyword, a failure function, f(s) ... The objective is that b₁b₂...b_f(s) is the longest proper prefix of b₁...b_s that is also a suffix of b₁...b_s. The reason f(s) is important is that if we are trying to match a text string for b₁b₂...b_n, and we have matched the first s positions, but we then fail (i.e., the next position of the text string does not hold b_{s+1}), then f(s) is the longest prefix of b₁...b_n that could possibly match the text string up to the point we are at. Of course, the next character of the text string must be b_{f(s)+1}, or else we still have problems and must consider a yet shorter prefix, which will be b_f(f(s)).
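To make the definition concrete, here is a minimal Python sketch of how f(s) could be computed. It is not taken from the book: the function name `failure_function` and the choice to store the values in a list indexed from 1 (with `f[0]` unused) are my own assumptions, made so that `f[s]` lines up with the f(s) of the quote.

```python
def failure_function(keyword: str) -> list[int]:
    """Return a list f where f[s] is the length of the longest proper
    prefix of keyword[:s] that is also a suffix of keyword[:s].

    Positions are 1-based to mirror the quoted text; f[0] is unused.
    (Hypothetical helper name, not from the book.)
    """
    n = len(keyword)
    f = [0] * (n + 1)
    t = 0  # length of the prefix matched so far
    for s in range(2, n + 1):
        # keyword[s - 1] is b_s and keyword[t] is b_{t+1}.
        # While b_s does not extend the current prefix, fall back
        # to the next shorter candidate prefix.
        while t > 0 and keyword[s - 1] != keyword[t]:
            t = f[t]
        if keyword[s - 1] == keyword[t]:
            t += 1
        f[s] = t
    return f


print(failure_function("ababaa")[1:])  # [0, 0, 1, 2, 3, 1]
```

For the keyword `ababaa`, f(5) = 3 because `aba` is both a proper prefix and a suffix of `ababa`.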

So, the questions:
1. If we have matched s positions of the text, why is f(s) the longest prefix of b₁...b_n that could match the string? I would think the longest prefix has length s.
2. Why must the next character of the text string be b_{f(s)+1}? We have a mismatch at this position, so does it matter at all what the character is?
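One possible reading of the passage, sketched in Python under my own assumptions (the names `failure_function` and `find_keyword` are hypothetical, not from the book): the prefix of length s does match the text so far, but it cannot be extended by the current text character. The matcher therefore retries with the next shorter prefix, of length f(s), and that retry only succeeds if the same text character equals b_{f(s)+1}. That is the comparison the inner `while` loop performs.

```python
def failure_function(keyword: str) -> list[int]:
    """f[s] = length of the longest proper prefix of keyword[:s]
    that is also a suffix of keyword[:s]; f[0] is unused."""
    f = [0] * (len(keyword) + 1)
    t = 0
    for s in range(2, len(keyword) + 1):
        while t > 0 and keyword[s - 1] != keyword[t]:
            t = f[t]
        if keyword[s - 1] == keyword[t]:
            t += 1
        f[s] = t
    return f


def find_keyword(text: str, keyword: str) -> int:
    """Return the 0-based index of the first occurrence of keyword
    in text, or -1 if there is none."""
    if not keyword:
        return 0
    f = failure_function(keyword)
    s = 0  # number of keyword characters matched so far
    for i, ch in enumerate(text):
        # Mismatch: ch is not b_{s+1}. Fall back to the prefix of
        # length f(s) and test the same ch against b_{f(s)+1}.
        while s > 0 and ch != keyword[s]:
            s = f[s]
        if ch == keyword[s]:
            s += 1
        if s == len(keyword):
            return i - s + 1
    return -1


print(find_keyword("abababaab", "ababaa"))  # 2
```

In this run the mismatch happens after `ababa` has been matched (s = 5) and the text holds `b`. The matcher falls back to f(5) = 3 and compares that same `b` with b₄, which is also `b`, so matching continues without rereading earlier text.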
