This section presents a conservative tree contraction algorithm, Algorithm TC, based on the tree contraction ideas of Miller and Reif [22]. The algorithm uses a recursive pairing strategy to build a contraction tree for an input binary tree in much the same manner as Algorithm LC does for a list. With high probability, the height of the contraction tree and the number of steps on a DRAM are both $O(\lg n)$, where n is the number of nodes in the input tree. A deterministic variant runs in $O(\lg n \lg^* n)$ steps and produces a contraction tree of height $O(\lg n)$.
The recursive pairing strategy for trees is illustrated in Figure 5 for a tree with nodes A, B, C, D, E, and F. In the first step, nodes A and B pair and merge, as do nodes C and D; the merges are shown as contours in the figure. A new contracted tree is formed from the unpaired nodes E and F, and the compound nodes AB and CD. In the next step of the algorithm, node E pairs and merges with CD to form a node CDE. After two more steps the 6-node input tree has been contracted to a single node. Notice that each node shown as a contour in the figure is a connected subgraph of the input tree, and that each such node has at most two children in the contracted tree.
Algorithm TC represents the contours of Figure 5 in a contraction-tree data structure in the same manner as Algorithm LC represents the contours of Figure 3. Space for the internal nodes of the contraction tree is again provided by spares. Initially, the spare of each node in the input tree is its mate, an unused node stored in the same processor.
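To make this bookkeeping concrete, the following minimal sketch shows one way a node and its spare might be laid out. It is only an illustration under assumed names (the Node class and its spare, ct_children, and merge_step fields); the paper's actual pseudocode appears in [19].

```python
# Illustrative layout only; all class and field names are assumptions of this sketch.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)  # at most two in a binary tree
    spare: Optional["Node"] = None       # initially the node's mate in the same processor
    # Fields used when this node serves as an internal node of the contraction
    # tree, i.e., when it records a merge:
    ct_children: List["Node"] = field(default_factory=list)
    merge_step: Optional[int] = None     # contraction step at which the merge occurred
```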
We now outline Algorithm TC in more detail. (A description in pseudocode can be found in [19].) In the first step, nodes in the input tree are paired. The pairing strategy has each node pick from among its neighbors according to how many children it has. A leaf picks its parent with probability 1. A node with exactly one child picks either its child or its parent, each with probability $1/2$. A node with two children picks either child, each with probability $1/2$. The root, which has no parent, picks each of its children with equal probability. If two nodes pick each other, then they merge. The merge is recorded by making the spare of the parent in the pair be the root of a new contraction tree. The spare of the child in the pair becomes the spare for the root, and the parent and child themselves become the children of the root. The new nodes and the unpaired nodes form themselves into a new tree that represents the contracted tree, upon which the algorithm operates recursively. The contracted tree is binary because the pairing strategy ensures that no node with two children pairs with its parent.
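The pairing rules above can be stated in a few lines of code. The sketch below uses flat Python dictionaries for the parent and children pointers; the representation and function names are conveniences of this sketch, not the pseudocode of [19].

```python
import random

def pick_neighbor(node, parent, children):
    """One node's random pick in a pairing step: a leaf picks its parent; a node
    with one child picks its child or its parent with probability 1/2 each; a node
    with two children picks either child with probability 1/2; the root picks one
    of its children uniformly."""
    kids = children[node]
    if parent[node] is None:                       # the root
        return random.choice(kids) if kids else None
    if not kids:                                   # a leaf
        return parent[node]
    if len(kids) == 1:                             # exactly one child
        return kids[0] if random.random() < 0.5 else parent[node]
    return random.choice(kids)                     # two children

def pairing_step(parent, children):
    """Return the (child, parent) pairs that picked each other in this step.
    In the full algorithm each such pair would then merge, with the spare of
    the parent recording the merge as described above."""
    pick = {v: pick_neighbor(v, parent, children) for v in parent}
    pairs = []
    for v, u in pick.items():
        if u is not None and parent[v] == u and pick[u] == v:
            pairs.append((v, u))                   # v merges into its parent u
    return pairs
```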
In the next section, we shall need to expand a contracted tree in order to describe treefix computations recursively. Expansion consists of undoing the merges in the reverse of the order in which they occurred. From the time that a parent and child merge to the time that the node representing their merge in the contraction tree expands, the pointers of the pair are undisturbed. Consequently, these pointers can be used to restore the pointers of the neighbors of the pair to the state they had immediately before the pair merged. To ensure that the merges are undone in the exact reverse order, as is assumed in the next section, it is helpful to store in each internal node of the contraction tree the step number in which the merge took place. In fact, the tree can be expanded by a greedy strategy without consulting the number of the contraction step at which each merge occurred.
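As a small illustration of this bookkeeping, the merges can be replayed in reverse by sorting the internal contraction-tree nodes on their recorded step numbers. The merge_step field and the restore() routine, which resets the neighbors' pointers from the pair's own undisturbed pointers, are assumptions of this sketch.

```python
def expand(internal_nodes):
    """Undo the merges in the reverse of the order in which they occurred.
    Merges recorded at the same contraction step are independent, so in a
    parallel setting they can all be undone together."""
    for node in sorted(internal_nodes, key=lambda m: m.merge_step, reverse=True):
        node.restore()   # assumed routine: restores the neighbors' pointers
```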
Figure 5: The recursive pairing strategy operating on a tree with nodes A, B, C, D, E, and F. Merged nodes are shown as contours, and the nesting of contours gives the structure of the contraction tree.
The proof that with high probability, Algorithm TC takes $O(\lg n)$ steps to contract an input binary tree to a single node requires three technical lemmas. The first lemma shows that in a binary tree, the number of nodes with two children and the number of leaves are nearly equal. The second lemma provides an elementary bound on the expectation of a discrete random variable with a finite upper bound. The last lemma presents a Chernoff-type bound [4] on the tail of a binomial distribution.
The final lemma bounds the tail of a binomial distribution. Consider a set of t independent Bernoulli trials, each with probability p of success. The probability that fewer than s successful trials occur is
$$B(s,t,p) = \sum_{i=0}^{s-1} \binom{t}{i} p^{i} (1-p)^{t-i}.$$
The lemma bounds $B(s,t,p)$ when $s \le tp$, that is, when s does not exceed the expected number of successes.
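For intuition, the exact tail $B(s,t,p)$ can be compared numerically against a Chernoff-type bound. The bound $e^{-(tp-s)^{2}/(2tp)}$ used below is one standard form and is an assumption of this sketch; Lemma 0.8 may state the bound differently.

```python
from math import comb, exp

def binom_tail_below(s, t, p):
    """Exact probability B(s, t, p) of fewer than s successes in t Bernoulli(p) trials."""
    return sum(comb(t, i) * p**i * (1 - p)**(t - i) for i in range(s))

def chernoff_bound(s, t, p):
    """A standard lower-tail Chernoff bound, exp(-(tp - s)^2 / (2tp)), valid for s <= tp."""
    return exp(-((t * p - s) ** 2) / (2 * t * p))

# For example, t = 200 trials with p = 1/4 and s = 30 (so s <= tp = 50):
print(binom_tail_below(30, 200, 0.25))   # exact tail probability, well below the bound
print(chernoff_bound(30, 200, 0.25))     # e**-4, about 0.018
```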
With these lemmas we can now prove that with high probability, Algorithm TC takes $O(\lg n)$ steps to contract a rooted binary tree to a single node.
Proof: The proof has three parts. First, we use Lemma 0.6 to show that if a rooted binary tree has $n \ge 2$ nodes, the expected number of nodes pairing with a parent in a single contraction step is at least $n/4$. Next, we use Lemma 0.7 to show that the probability that at least $n/8$ nodes pair with a parent in any step is at least $1/4$. Finally, we use Lemma 0.8 to show for any constant k, that after $c \lg n$ steps, for some constant $c$ depending on $k$, the probability that the tree has not contracted into a single node is at most $1/n^{k}$.
We first show that the expected number of nodes pairing with a parent is at least $n/4$, provided that $n \ge 2$. A child is picked by its parent with probability 1 when its parent is a degree-1 root, and with probability $1/2$ otherwise. Thus, a leaf pairs with its parent with probability at least $1/2$, and a node (other than the root) with one child pairs with its parent with probability at least $1/4$. Let P be the number of nodes pairing with a parent, let l be the number of leaves, and let d be the number of nodes with two children. Then we have
$$E[P] \ge \frac{l}{2} + \frac{n - l - d - 1}{4},$$
and applying Lemma 0.6, which gives $d = l - 1$, yields the desired result:
$$E[P] \ge \frac{l}{2} + \frac{n - 2l}{4} = \frac{n}{4}.$$
Now we show that the probability that at least $n/8$ nodes pair with a parent in a single contraction step is at least $1/4$. We call such a step successful. At most half of the nodes pair with their parents, and hence $P \le n/2$. Using Lemma 0.7 with the random variable P, the upper bound $n/2$, and the threshold $n/8$, we have
$$\Pr\{P \ge n/8\} \ge \frac{E[P] - n/8}{n/2} \ge \frac{n/4 - n/8}{n/2} = \frac{1}{4}.$$
Finally, we show that with high probability, Algorithm TC takes $O(\lg n)$ contraction steps to contract the input tree to a single node. The size of the tree after a contraction following a successful pairing step is at most $7/8$ times the size before the contraction. After $\log_{8/7} n$ successful steps, the tree must consist of a single node. By Lemma 0.8, the probability that fewer than $\log_{8/7} n$ successful steps occur in $c \lg n$ steps is at most
$$B\!\left(\log_{8/7} n,\; c \lg n,\; 1/4\right).$$
For any value of k, we can choose the constant c large enough that this probability is at most $1/n^{k}$; in particular, a constant choice of c suffices for $k = 1$.
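The $O(\lg n)$ behavior is easy to observe empirically. The following self-contained sketch generates random binary trees, runs the pairing and merging process until a single node remains, and counts contraction steps; the tree generator and the dictionary representation are conveniences of this sketch, not part of Algorithm TC.

```python
import random
from math import log2

def random_binary_tree(n, rng):
    """Build a random rooted binary tree on nodes 0..n-1 (node 0 is the root) by
    attaching each new node to a random node that has fewer than two children."""
    parent, children = {0: None}, {0: []}
    slots = [0]                                   # nodes with fewer than two children
    for v in range(1, n):
        i = rng.randrange(len(slots))
        u = slots[i]
        parent[v], children[v] = u, []
        children[u].append(v)
        if len(children[u]) == 2:                 # u is now full; drop it from slots
            slots[i] = slots[-1]
            slots.pop()
        slots.append(v)
    return parent, children

def pick_neighbor(v, parent, children, rng):
    """One node's random pick, following the pairing rules described earlier."""
    kids = children[v]
    if parent[v] is None:
        return rng.choice(kids) if kids else None
    if not kids:
        return parent[v]
    if len(kids) == 1:
        return kids[0] if rng.random() < 0.5 else parent[v]
    return rng.choice(kids)

def contraction_steps(parent, children, rng):
    """Repeat pairing and merging until one node remains; return the step count."""
    steps = 0
    while len(parent) > 1:
        steps += 1
        pick = {v: pick_neighbor(v, parent, children, rng) for v in parent}
        merges = [(v, u) for v, u in pick.items()
                  if u is not None and parent[v] == u and pick[u] == v]
        for v, u in merges:                       # merge child v into its parent u
            children[u].remove(v)
            for w in children[v]:
                parent[w] = u
                children[u].append(w)
            del parent[v], children[v]
    return steps

rng = random.Random(0)
for n in (2**8, 2**12, 2**16):
    p, c = random_binary_tree(n, rng)
    print(n, contraction_steps(p, c, rng), round(log2(n)))
```

On such inputs the observed step count grows in proportion to lg n, with a modest constant factor.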
We now prove that Algorithm TC is communication efficient in the DRAM model.
Proof: Each node of a contracted tree is a connected subgraph of the input tree. The root of the contraction tree that represents the subgraph is called the representative of the subgraph. The representative and its spare are each either a node of the subgraph or a mate of a node of the subgraph.
Every set of memory accesses performed by the algorithm is of one of two types. In the first type, each representative of a subgraph communicates with its spare, if at all. In the second type, each representative of a subgraph communicates with the representative of one of its children in the contracted tree. In either of these two cases, the set of memory accesses corresponds to a set of disjoint paths in the input graph, and hence, by the Shortcut Lemma, is conservative with respect to the input graph.
Tree contraction can be performed conservatively and deterministically on a DRAM with m processors in $O(\lg n \lg^* n)$ steps using the deterministic coin-tossing algorithm of Cole and Vishkin [5]. The key idea is that in Algorithm TC, the nodes that can pair form chains, and by Lemma 0.6 these chains contain at least half the tree edges. The chains can be oriented from child to parent in the tree, and deterministic coin tossing can be used to perform the pairing step in $O(\lg^* n)$ steps.
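As an illustration of the deterministic coin-tossing idea, the sketch below reduces an initial coloring of a chain, given by distinct node labels, to at most six colors in $O(\lg^* N)$ rounds, where N bounds the initial labels. The chain representation and function names are assumptions of this sketch, and extracting an actual pairing of chain nodes from the small coloring is not shown.

```python
def cv_reduce(color, succ):
    """One round of Cole-Vishkin deterministic coin tossing on a chain.
    color: node -> current color (adjacent chain nodes have distinct colors)
    succ:  node -> next node along the chain, or None at the chain's end
    Each node re-colors itself with 2*i + b, where i is the lowest bit position
    at which its color differs from its successor's and b is its own bit there;
    adjacent nodes again receive distinct colors."""
    new = {}
    for v, c in color.items():
        s = succ[v]
        if s is None:                      # chain end: keep only bit 0 of the color
            new[v] = c & 1
            continue
        diff = c ^ color[s]
        i = (diff & -diff).bit_length() - 1
        new[v] = 2 * i + ((c >> i) & 1)
    return new

def six_color(label, succ):
    """Iterate the reduction until at most six colors remain; starting from
    distinct labels in {0, ..., N-1}, this takes O(lg* N) rounds."""
    color = dict(label)
    while len(set(color.values())) > 6:
        color = cv_reduce(color, succ)
    return color
```

From such a constant-size coloring, a pairing along each chain can then be obtained in a constant number of additional steps, for example by letting nodes whose color is a local maximum on their chain pair with their chain successors.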