The basic outline of the algorithm is the same as that of Flashsort. The first step is to randomly select a small set of splitters from among the packets that are to be sorted. Next the splitters are sorted deterministically. The splitters partition the packets into intervals. The ith interval consists of those packets whose keys are larger than the key of the st largest splitter, and smaller than the key of the ith largest splitter. (We assume without loss of generality that all of the keys are distinct.) Using the splitters as guides, each interval of packets is routed to a different subbutterfly, where it is sorted recursively.
We begin by describing a recursive algorithm for sorting packets in time on an -node butterfly, where is some fixed constant greater than one. The butterfly is ``lightly loaded'' by this factor of to ensure that, with high probability, at the lower levels of the recursion the number of packets to be sorted by each subbutterfly does not exceed the number of inputs to that subbutterfly. When the algorithm is invoked, each packet must reside at a distinct input. As we shall see, this algorithm can be combined with Leighton's Columnsort algorithm [16] to sort all packets in time.
The steps taken by a subbutterfly with M inputs are presented in some detail in Figure 6.
Figure 6: The steps
performed by an M-input subbutterfly in the recursive algorithm for sorting
packets in time on an -node
butterfly using constant-size queues.
The first step in the algorithm is to count the number of packets entering the subbutterfly. Since the packets reside in distinct inputs, the total number of packets can be computed via a parallel prefix computation. The prefix computation can be performed in time deterministically.
Next each packet independently chooses to be a splitter candidate with probability . As we shall see, with high probability the number of candidates is between and . This step requires only constant time.
The candidates are then sorted in time using a simple deterministic algorithm based on counting due to Nassimi and Sahni [25].
After the candidates are sorted, every th one in the sorted order is chosen to be a splitter. This oversampling technique, due to Reif, ensures that each of the intervals contains approximately the same number of splitters, with high probability. Note that we oversample by a factor of , where N is the number of inputs in the entire network, independent of the number of inputs, M, of the subbutterfly on which the algorithm is invoked. Since with high probability there are at least splitters, the subbutterflies at the next level of recursion should have at most inputs.
Next the splitters are distributed throughout the M-input subbutterfly so that they can direct each interval of packets to the appropriate smaller subbutterfly. We distribute a copy of the median splitter to each node in level 0 of the M-input subbutterfly. Then we divide the splitters into upper and lower halves. We distribute a copy of the median splitter from the upper half to each node in the upper half of level 1. Similarly, we distribute a copy of the median splitter from the lower half to each node in the lower half of level 1. The process continues in this fashion until all of the splitters are used up. At this point, every node in the first levels of the butterfly has a copy of a splitter. This step can be performed deterministically in time.
After the splitters are positioned, each packet is routed to a random row of the M-input subbutterfly. The packets are scheduled using the algorithm for routing on leveled networks.
Each interval of packets is then routed to a different smaller subbutterfly. This step is called splitter-directed routing [32]. The paths of the packets are determined as follows. At level 0, each packet compares itself to the median splitter. If it is larger, it moves to the upper half of the second level, otherwise it moves to the lower half. The process is repeated at the level 1, with each packet being directed to the appropriate quarter of level 2, and so on. The packets are scheduled using the algorithm for routing on leveled networks. When all of the packets have been routed along in the butterfly as deeply as the splitters are assigned, each subbutterfly at that level picks new splitters and proceeds recursively.
The last step before the recursive call is to position the packets in each smaller subbutterfly in distinct inputs. On an M-input butterfly where at most c packets enter each input, M packets can be distributed to distinct inputs deterministically in time.
The recursion continues until either the number of inputs, M, is smaller than , or the number of packets, n, is smaller than . In the first case, the sort is completed using Batcher's odd-even merge sort. An M-input butterfly can sort M packets in time using odd-even merge sort. For , the time is . In the second case, the packets can be sorted deterministically in time using Nassimi and Sahni's sorting algorithm, as in step four.
We can now make a rough estimate of the running time of this algorithm. Steps 1 and 2 are performed deterministically in time. Assuming that there are candidates, Steps 3, 4, and 5 also require time. As we shall see, the expected time for Steps 6, 7 and 8 is . Although these steps sometimes take longer than expected, let us assume for now that they do not. In this case, the running time is given by the recurrence
which has solution .