where the 1 is the cost of the add. A similar rule is used for depth. The interesting rules concerning parallelism are the rules for an apply-to-each expression:
The first rule specifies that the work is the sum of the work of each of the applications of to , plus the work of , plus 1 to account for overheads. The rule for depth is similar, but takes the maximum of the depth of each application of . This supports our intuition that the applications are executed in parallel and the evaluation of the apply-to-each completes when the last call completes. The other interesting rules are the rules for an if expression, which for work is
with a similar rule for depth. The work and depth for a function call and for scalar primitives are 1 each. The costs of the NESL functions on sequences are summarized in Figure 6. We note that the performance rules can be more precisely defined using an operational semantics [7].
Figure 6: List of some of the sequence functions supplied by
NESL. The work required by each function is
given in the Work column: L(v) refers to the length of the
sequence v.
The work of the write(d,a) function actually depends on whether
the argument d needs to be copied or not, but in
the examples in this paper the difference has no effect.
Figure 7: Calculating the work and depth of
{factorial(n) : n in [3,1,5,2]}.
As an example of composing work and depth, consider evaluating the expression
{factorial(n) : n in a}
where a = [3,1,5,2]. Using the rules for work and the code for factorial given earlier, we can write the following equation for work:
where , , and are the work for ==, *, and -, and are all 1. The two unit constants come from the cost of the function call and the if-then-else rule. Adding up the terms and solving the recurrence gives . Since there is no parallelism in the factorial function, the depth is the same as the work. To calculate work and depth for the full expression {factorial(n) : n in a} we can use equations 1 and 2. This calculation is shown in Figure 7.