The butterfly sorting result from Section 8 does not directly extend to the shuffle-exchange graph because the shuffle-exchange graph does not have the nice recursive structure possessed by the butterfly. However we can use the following theorem to sort on the shuffle-exchange graph. A similar theorem was proven independently by Raghunathan and Saran [28].
Proof: We assume that . Thus each row of each butterfly can be represented by a k-bit string, and each node of the shuffle-exchange can be represented by a 2k-bit string.
To map butterflies to the shuffle-exchange graph, we use the following lemma [15, Lemma 1-1,].
For each of these subsets we pick the lexicographically minimum string to represent the subset. We associate the butterflies in a two to one fashion with the representative strings. Suppose butterfly i is associated with string . We map a node in butterfly i to a shuffle-exchange node by shuffling the bits of with the bits of r's representation, and choosing the current bit to be . That is, node in butterfly i is mapped to shuffle-exchange node .
From a shuffle-exchange node we can recover the representative string by picking out every other bit and shifting to the lexicographically minimum string. We find the row string by picking out the other bits and shifting by the same amount. The position in the row is clearly the number of shifts we used to get to and the row number.
To finish, we observe that each edge in any of the butterflies is mapped to a path of length at most three in the shuffle-exchange graph since we either shift twice to reach 's image, or we exchange the current bit and shift twice to reach 's image.
Thus we can embed ( )-node butterflies in an N-node shuffle-exchange with load 2, congestion , and dilation 3.
This technique can be extended to prove that for any constant , distinct -node butterfly graphs can be embedded in an N-node shuffle-exchange with constant load, congestion, and dilation.
The following theorem shows that we can use Theorem 9.1 to sort on the shuffle-exhange graph in steps.
Proof: The N packets can be sorted using the Columnsort algorithm of [16]. Using Theorem 9.1, -node butterflies are embedded in the shuffle-exchange graph with constant load, congestion, and dilation, where . Each of the butterflies can sort packets in time using the butterfly sorting algorithm of Section 8. Columnsort sorts all N packets using a constant number of butterfly sorting steps and global permutation routing steps on the shuffle-exchange graph, each of which takes time, with high probability.