Question: Sometimes the inner while loop of the Knuth-Morris-Pratt may execute Omega(log n) times. Show that this can occur. (A proof would take an arbitrary n0 and find some pattern of length n>n0 so that in processing some text string the inner loop will iterate at least c lg n0 times, for some constant c.)
Answer: Let our alphabet have k+1 characters a_1, a_2,..., a_(k+1). Let T(k) be the string S(k-1) followed by character a_k, where S(k) is the string S(k-1) followed by character a_k followed by the string S(k-1) and S(0) is the empty string. That is,
S(0) = empty string S(1) = a_1 S(2) = a_1 a_2 a_1 S(3) = a_1 a_2 a_1 a_3 a_1 a_2 a_1 S(4) = a_1 a_2 a_1 a_3 a_1 a_2 a_1 a_4 a_1 a_2 a_1 a_3 a_1 a_2 a_1 T(5) = a_1 a_2 a_1 a_3 a_1 a_2 a_1 a_4 a_1 a_2 a_1 a_3 a_1 a_2 a_1 a_5This is similar to the half- and quarter-inch marking on a ruler.
Using induction we can see that S(k) has length 2^k - 1, so T(k) has length 2^(k-1) - 1. Let us consider the next array for T(k). It will have -1 on every occurrence of a_1, a 0 on every occurrence of a_2, a 1 on every occurrence of a_3, a 3 on every occurrence of a_4. Generally, it will have 2^(i-2)-1 on every occurrence of a_i.
Consider how this will work with the text string S(k-1) followed by character a_(k+1). We will build up our pointer as we go to the right, until we finally reach the last character of the text string and the last character of the pattern. At this point the characters will not match (we have a_k in the pattern and a_(k+1) in the text string) so we will step our pattern pointer back to 2^(k-2)-1. Then the pattern pointer points to a_(k-1). We step back again to a_(k-2), then to a_(k-3), and so on until we finally reach a_1, when we step the pointer to -1. We go through the inner loop k times.
So, given an n0 I choose k to be two more than the value of lg n0 rounded up. I take the pattern to be T(k), with length 2^(k-1) - 1 >= 2^(1 + lg n0) - 1 = 2n0-1 >= n0. We can get into a situation where we go through the while loop k times, which is at least lg n0+2.