Upfal's greedy algorithm starts by partitioning the packets into waves so that at most one packet in each wave is destined for any set of L contiguous outputs, where L is a power of 2. One way to do this is to group packets into the same wave if they are in the same permutation and their destinations are congruent modulo L. If there are P permutations to be routed, this results in the formation of at most PL waves. In general, we will set since then we will be guaranteed that at most packets in any wave will ever pass through the up (or down) edges of any M-input splitter of the multibutterfly. Since M is a power of 2 and is a power of , is an integer unless , in which case . This will allow us to apply the -expansion property to the set of inputs of any splitter occupied by the packets of a single wave at any time. (E.g., if k inputs of a splitter contain packets of a single wave that want to traverse up edges, then these inputs are connected to at least up outputs.) This is because packets going through the up (or down) splitter outputs can only be destined for the descendant set of contiguous multibutterfly outputs. For , an M-input splitter can receive at most one upward or downward destined packet, and we obtain expansion . (Note that for , we can replace an M-input splitter with an complete bipartite graph.)
The routing of the packets proceeds in stages, each stage consisting of an even and odd phase, and each phase consisting of 2d steps. In even phases, packets are sent from even levels to the next (odd) level, and in odd phases, packets are sent from the odd levels to the next (even) level. The edges within each splitter are colored with 2d colors so that exactly one edge of each color is incident on each input and exactly one edge of each color is incident on each output. In each phase, we process the colors in sequence, one step per color. For each color, we move a packet forward along an edge with that color if there is a packet in the switch at the tail of the edge that wants to go in that direction (up or down) and if there is no packet in the switch at the head of the edge. Alternatively, if there is a packet in the switch at the head of the edge and if it is in a later wave than the packet at the tail of the edge, then the two packets are swapped, so that the packet in the earlier wave moves forward. Note that every switch processes and/or contains at most one packet at any step.
If there is only one permutation to route, then each input of the multibutterfly starts with one packet. If there are P permutations to be routed, however, then it will be useful to augment the front-end of the multibutterfly with P-1 copies of the first level of the multibutterfly. Each switch on each of these levels starts with one packet. This will preserve the bound of queue size 1 at the input level, and it will ensure that these levels have the same -expansion property as the first level. For notational purposes, we will refer to these additional levels as levels .