Day 20 3/30/98 ``It's just better if I start over.''
Standard algorithm design has several unit time operations:
|
(all with log(n) bits)
Also memory read and writes occur in unit time.
Similar operations except it is necessary to worry about whether reads and writes are concurrent or exclusive.
|
| ||
|
|
Consider a circuit arranged as a DAG from inputs to outputs. Define:
|
|
For a matrix multiply, the obvious method uses n3 processors and has O(log(n)) depth indicating total work around W = n3log(n)
By using only n3/log(n) processors it is possible to make W = n3 while increasing the time by a factor of 3.
adding matrices: time = O(1) , work = n2 , #processors = n2 .
All recursive calls are done in parallel so:
T(n) = T(n/2)+O(1) = O(log(n))
W(n) = 7W(n/2)+cn2 = O(n2.81...)
Å = binary associative operator
input: [a0,...an-1]
output: [a0,a0Åa1,...,a0Å...Åan-1]
Change the desired output slightly to be:
prescan = [I,a0,a0Åa1,...,a0Å...Åan-2]
The prescan can be computed by adding pairs in a tree, then doing a 'shift and subtract' from the root towards the leaves.
if #P = n time = O(log(n)) then W(n) = O(n*log(n))
if #P = n/log(n) then the time is still O(log(n)) but W(n) = O(n) .
The goal is to associate a number with each node in a list = the nodes depth in the list. We want to do this in parallel.
Wiley's algorithm: At each stage, the node points to a neighbor and n = number of steps to reach neighbor. General step: grab the neighbor of your neighbor and add n+neignbor(n) . In log(n) steps you are done. The algorithm is EREW .
Assign a random bit to each node ('female' or 'male').
Rule: if 'female' points to 'male', have the 'female' move to the neighbor of the 'male'
Reassign random bits and continue.
Probability that a node is removed in a step = 1/4 .
With high probability the algorithm runs in O(log(n)) .