The first example is a function that finds all occurrences of a word in a string (a sequence of characters). The function string_search(w,s) (see Figure 6) takes a search word w and a string s, and returns the starting position of all substrings in s that match w. For example,
Figure: Finding all occurrences of the word w in
the string s.
The algorithm starts by considering all positions between 0 and #s-#w as candidates for a match (no candidate could be greater than this since it would have to match past the end of the string). The candidates are stored as pointers (indices) into s of the beginning of each match. The algorithm then progresses through the search string, using recursive calls to next_char, narrowing the set of candidate matches on each step.
Based on the current candidates, next_char narrows the set of candidates by only keeping the candidates that match on the next character of w. To do this, each candidate checks whether the i character in w matches the i position past the candidate index. All candidates that do match are packed and passed into the recursive call of next_char. The recursion completes when the algorithm reaches the end of w. The progression of candidates in the "foo" example would be:
20pt
Lets consider the complexity of the algorithm. We assume #w = m and #s = n. The depth complexity of the algorithm is some constant times the number of recursive calls, which is simply . The work complexity of the algorithm is the sum over the recursive calls, of the number of candidates in each recursive call. In practice, this is usually , but in the worst case this can be the product of the two lengths (the worst case can only happen if most of the characters in w are repeated). There are parallel string-searching algorithms that give better bounds on the parallel time (depth complexity), and that bound the worst case work complexity to be linear in the length of the search string [15, 44], but these algorithms are somewhat more complicated.