In 1965 Benes [6] showed that the inputs and outputs of
an N-node Benes network (two back-to-back butterfly networks)
can be connected in any permutation by a set of disjoint paths.
Shortly thereafter Waksman [40] devised a simple
sequential algorithm for finding the paths in time. Given the
paths, it is straightforward to route a set of packets from the inputs
to the outputs an N-node Benes network in any one-to-one fashion
in
steps using queues of size 1. A one-to-one routing
problem like this is also called a permutation routing problem.
Although the inputs comprise only
nodes in an N-node
Benes network, it is possible to route any permutation of N
packets in
steps by pipelining
such
permutations. Unfortunately, no efficient parallel algorithm for
finding the paths is known.
In 1968 Batcher [4] devised an elegant and practical
parallel algorithm for sorting N packets on an N-node
shuffle-exchange network in steps
using queues of size 1. The algorithm can be used to route any
permutation of packets by sorting based on destination address. The
result extends to routing many-one problems provided that (as is
typically assumed) two packets with the same destination can be
combined to form a single packet should they meet en route to their
destination.
No better deterministic algorithm was found until 1983, when Ajtai,
Komlós, and Szemerédi [1] solved a classic open
problem by constructing an -depth sorting network.
Leighton [16] then used this
-node network
to construct a degree 3 N-node network capable of solving any
N-packet routing problem in
steps using queues of size
1. Although this result is optimal up to constant factors, the
constant factors are quite large and the algorithm is of no practical
use. Hence, the effort to find fast deterministic algorithms has
continued. Recently Upfal discovered an
-step algorithm
for routing on an expander-based network called the multibutterfly
[37]. The algorithm solves the routing problem directly
without reducing it to sorting, and the constant factors are much
smaller than those of the AKS-based algorithms. In
[18], Leighton and Maggs show that the multibutterfly
is fault tolerant and improve the constant factors in Upfal's
algorithm.
There has also been great success in the development of efficient
randomized packet routing algorithms. The study of randomized
algorithms was pioneered by Valiant [38] who showed how to
route any permutation of N packets in steps on an
N-node hypercube with queues of size
at each node.
Valiant's idea was to route each packet to a randomly-chosen
intermediate destination before routing it to its true destination.
Although the algorithm is not guaranteed to deliver all of the packets
within
steps, for any permutation it does so with high
probability. In particular, the probability that the algorithm fails
to deliver the packets within
steps is at most
,
for any fixed constant k. (The value of k can be made arbitrarily
large by increasing the constant in the
bound.)
Throughout this paper, we shall use the phrase with high
probability to mean with probability at least
for any
fixed constant k, where N is the number of packets.
Valiant's result was improved in a succession of papers by Aleliunas
[2], Upfal [36], Pippenger
[26], and Ranade [29]. Aleliunas and Upfal
developed the notion of a delay path and showed how to route
on the shuffle-exchange and butterfly networks (respectively) in
steps with queues of size
. Pippenger was the
first to eliminate the need for large queues, and showed how to route
on a variant of the butterfly in
steps with queues of size
. Ranade showed how combining could be used to extend the
Pippenger result to include many-one routing problems, and
tremendously simplified the analysis required to prove such a result.
As a consequence, it has finally become possible to simulate a step of
an N-processor CRCW PRAM on an N-node butterfly or hypercube in
steps using constant-size queues on each edge.
Concurrent with the development of these hypercube-related packet
routing algorithms has been the development of algorithms for routing
in meshes. The randomized algorithm of Valiant and Brebner can be
used to route any permutation of N packets on a mesh in
steps using queues of size
. Kunde [14] showed how to route deterministically in
steps using queues of size
. Also, Krizanc, Rajasekaran, and Tsantilis
[13] showed how to randomly route any
permutation in
steps using constant-size queues.
Most recently, Leighton, Makedon, and Tollis discovered a
deterministic algorithm for routing any permutation in
steps using constant-size queues [19], thus
achieving the optimal time bound in the worst case.