Since a subbutterfly does not begin to execute its algorithm until the larger butterfly at the previous level of recursion is finished, delay in excess of the time allotted to each butterfly accumulates over the course of the algorithm. An M-input butterfly is allotted time to perform its steps. However, Steps 6, 7, and 8 are not guaranteed to terminate in time . It is tempting to try to prove that these steps terminate quickly with high probability. This approach fails because at the lower levels of the recursion the problem size is so small that nothing can be ascertained with high probability. Instead we must argue that although delay may occur at any particular step, it is unlikely that a lot of delay will accumulate over a sequence of steps.
The delay from Step 8 is relatively easy to analyze. This step requires time; the delay depends only on the congestion. Lemma 8.5 bounds the probability that the congestion is large.
There are two possible causes of delay in Steps 6 and 7. A poor set of random rows for the packets can cause congestion at some node, which guarantees that some packet will arrive at its destination late. On the other hand, even if the congestion is small, a poor choice for the random ranks used by the scheduling algorithm may delay a packet. The following pair of lemmas bounds the probability that the delay from these steps is large. The first is a restatement of the Theorem 2.9 in a slightly different form. It bounds the probability that a packet will be delayed when the congestion is small. The second puts this bound together with the bound that the congestion is large from Lemma 8.5.
Proof: The proof of Theorem 2.9 shows that the probability that any packet arrives at its destination after step is at most
where and are constants. Suppose that . Then this probability is at most which is less than . Now let . Then the probability that any packet arrives at its destination after step (for some ) is at most , where .
Proof: For the sake of brevity, we examine Step 7 only. A similar analysis holds for Step 6.
We break the analysis into two cases according to whether the congestion is small or large. Let T be the time at which the last packet arrives. Then
We use Lemma 8.6 to bound the first term on the right. Plugging in for c yields . We use Lemma 8.5 to bound the second term on the right. Plugging in for c yields . Since , and , , and are constants, , for sufficiently large N.
The following lemma bounds the combined delay of Steps 6, 7, 8.
Proof: Step 8 can be performed deterministically in time . From Lemma 8.5 we have , for s > L. For our purposes, a weaker bound on this probability suffices. Since is a constant, there is a constant such that for sufficiently large L. Combining this bound with that of Lemma 8.7 yields the desired result.
To complete our analysis of the algorithm, we need to bound the probability that more than delay accrues during the sort.
Proof: The cumulative delay at the bottom level of the recursion is the sum of the delay at each of the butterflies on the branch of the recursion tree from the top level to the leaf. Let be the delay beyond at the ith level of the recursion. Then by Lemma 8.8. Notice that there is no dependence on i in this expression. Let D be the cumulative delay on a branch of the recursion from the top level to a leaf. Then . Generating functions help us here. The generating function for is
where can be thought of as a place holder. Since the delay at each level of the recusion is independent of the delays at other levels, we can sum the delay by multiplying the generating functions. Thus, the generating function for the cumulative delay is . The coefficient of in is at most . For , this coefficient is at most . For any , there is a such that is at most .
To bound the probability that the cumulative delay exceeds on any branch of the recursion, we sum the individual probabilities for all of the branches. There are at most N branches. Thus, the sum is at most . For any , there is a such that this sum is at most .